Non-overlapping ranges in sequential data
From Hashmysql
Consider the following structure:
CREATE TABLE `t1` ( `id` int(11) DEFAULT NULL, `v` int(11) DEFAULT NULL ) ENGINE=InnoDB;
And the following data:
INSERT INTO `t1` VALUES (1,1),(1,2),(2,3),(2,4),(2,5),(1,6),(1,7);
+----+---+ | id | v | +----+---+ | 1 | 1 | | 1 | 2 | | 2 | 3 | | 2 | 4 | | 2 | 5 | | 1 | 6 | | 1 | 7 | +----+---+
hash-mysql regulars inquired if it would be possible to extract sequential non-overlapping ranges, in relation for each `id`. Here is a sample output, based on the test data:
+----+----------+----------+ | id | min(a.v) | max(a.v) | +----+----------+----------+ | 1 | 1 | 2 | | 2 | 3 | 5 | | 1 | 6 | 7 | +----+----------+----------+
A proposed solution (thanks Jon!) would be to create two derived tables, each extracting the previous and next element in the chain:
-- Pair the min with the closest max. SELECT v1.v AS r1 , MIN(v2.v) AS r2 , v1.id AS id , (MIN(v2.v) - v1.v) AS xrange FROM ( SELECT t1.* FROM t1 LEFT JOIN t1 AS t2 ON t1.v-1 = t2.v AND t1.id = t2.id WHERE t2.id IS NULL ) v1 JOIN ( SELECT t1.* FROM t1 LEFT JOIN t1 AS t2 ON t1.v+1 = t2.v AND t1.id = t2.id WHERE t2.id IS NULL ) v2 ON v1.v <= v2.v AND v1.id = v2.id GROUP BY id , r1 ORDER BY r1 , xrange DESC
The output from this solution would look like:
+----+----+----+--------+ | R1 | R2 | ID | XRANGE | +----+----+----+--------+ | 1 | 2 | 1 | 1 | | 3 | 5 | 2 | 2 | | 6 | 7 | 1 | 1 | +----+----+----+--------+
Running each derived table separately shows the individual logic blocks:
This solution would also cover overlapping ranges.
Window functions, such as lead and lag can be used to handle this a little more cleanly. Unfortunately, MySQL does not implement them, so the derived table approach is required.