MySQL高階-統計每周連續出現天數
- 2019 年 12 月 26 日
- 筆記
還是昨天的問題,統計每周內問題小區連續出現天數,MySQL可以實現嗎?答案是肯定的,就是過程很曲折。 數據庫查詢語言基本都是針對整列的,不像excel
是單元格粒度的,要判斷行與行之間的差異比較麻煩。 建表語句如下:
CREATE TABLE `badCells` ( `id` int(11) NOT NULL AUTO_INCREMENT, `cellid` int(11) DEFAULT NULL, `cgi` varchar(10) DEFAULT NULL, `apper_time` date DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
插入數據
INSERT INTO `badCells` VALUES ('1', '1', 'abcd_1', '2019-05-18'); INSERT INTO `badCells` VALUES ('2', '1', 'abcd_1', '2019-05-19'); INSERT INTO `badCells` VALUES ('3', '1', 'abcd_1', '2019-05-20'); INSERT INTO `badCells` VALUES ('4', '1', 'abcd_1', '2019-05-21'); INSERT INTO `badCells` VALUES ('5', '1', 'abcd_1', '2019-05-22'); INSERT INTO `badCells` VALUES ('6', '1', 'abcd_1', '2019-05-23'); INSERT INTO `badCells` VALUES ('7', '1', 'abcd_1', '2019-05-25'); INSERT INTO `badCells` VALUES ('8', '1', 'abcd_1', '2019-05-26'); INSERT INTO `badCells` VALUES ('9', '2', 'xyzz_3', '2019-05-21'); INSERT INTO `badCells` VALUES ('10', '2', 'xyzz_3', '2019-05-26'); INSERT INTO `badCells` VALUES ('11', '2', 'xyzz_3', '2019-05-27'); INSERT INTO `badCells` VALUES ('12', '2', 'xyzz_3', '2019-05-28'); INSERT INTO `badCells` VALUES ('13', '2', 'xyzz_3', '2019-05-30');
表是這樣的
mysql> select * from badCells; +----+--------+--------+------------+ | id | cellid | cgi | apper_time | +----+--------+--------+------------+ | 1 | 1 | abcd_1 | 2019-05-18 | | 2 | 1 | abcd_1 | 2019-05-19 | | 3 | 1 | abcd_1 | 2019-05-20 | | 4 | 1 | abcd_1 | 2019-05-21 | | 5 | 1 | abcd_1 | 2019-05-22 | | 6 | 1 | abcd_1 | 2019-05-23 | | 7 | 1 | abcd_1 | 2019-05-25 | | 8 | 1 | abcd_1 | 2019-05-26 | | 9 | 2 | xyzz_3 | 2019-05-21 | | 10 | 2 | xyzz_3 | 2019-05-26 | | 11 | 2 | xyzz_3 | 2019-05-27 | | 12 | 2 | xyzz_3 | 2019-05-28 | | 13 | 2 | xyzz_3 | 2019-05-30 | +----+--------+--------+------------+ 13 rows in set (0.00 sec)
現在使用一種算法來判決每周持續出現,用week()
獲取日期所在周數,使用下圖的邏輯判斷連續出現

- 首先獲取當前日期與一個較早的日期的差值
- 計算表中大於等於當前日期的行數
- 兩個值求和,求和相等則為連續出現,否則為間隔出現
SQL 呼之欲出
mysql> select -> bc.cellid, -> bc.cgi, -> DATEDIFF(bc.apper_time, '2019-05-01' ) as diff, -> ( -> select count(1) -> from badCells bc2 -> where bc2.cgi = bc.cgi and DATEDIFF(bc2.apper_time, bc.apper_time)>=0 -> ) as rows -> from badCells bc; +--------+--------+------+------+ | cellid | cgi | diff | rows | +--------+--------+------+------+ | 1 | abcd_1 | 17 | 8 | | 1 | abcd_1 | 18 | 7 | | 1 | abcd_1 | 19 | 6 | | 1 | abcd_1 | 20 | 5 | | 1 | abcd_1 | 21 | 4 | | 1 | abcd_1 | 22 | 3 | | 1 | abcd_1 | 24 | 2 | | 1 | abcd_1 | 25 | 1 | | 2 | xyzz_3 | 20 | 5 | | 2 | xyzz_3 | 25 | 4 | | 2 | xyzz_3 | 26 | 3 | | 2 | xyzz_3 | 27 | 2 | | 2 | xyzz_3 | 29 | 1 | +--------+--------+------+------+ 13 rows in set (0.00 sec)
得到了 diff
和 rows
,將兩者相加,然後按照該列分組計數就是持續天數,再加上周數,就是每周持續天數
完整代碼如下:
mysql> select -> bc.cellid, -> bc.cgi, -> count(*) as `con_days`, -> week(bc.apper_time) as `weeknum`, -> min(bc.apper_time) `start_time`, -> max(bc.apper_time) `end_time`, -> DATEDIFF(bc.apper_time, '2019-05-01')+ -> ( -> select count(1) -> from badCells bc2 -> where bc2.cgi = bc.cgi and DATEDIFF(bc2.apper_time, bc.apper_time)>=0 -> ) as `flag` -> from badCells bc -> group by bc.cellid, bc.cgi, flag , weeknum -> order by bc.cellid, min(bc.apper_time); +--------+--------+----------+---------+------------+------------+------+ | cellid | cgi | con_days | weeknum | start_time | end_time | flag | +--------+--------+----------+---------+------------+------------+------+ | 1 | abcd_1 | 1 | 19 | 2019-05-18 | 2019-05-18 | 25 | | 1 | abcd_1 | 5 | 20 | 2019-05-19 | 2019-05-23 | 25 | | 1 | abcd_1 | 1 | 20 | 2019-05-25 | 2019-05-25 | 26 | | 1 | abcd_1 | 1 | 21 | 2019-05-26 | 2019-05-26 | 26 | | 2 | xyzz_3 | 1 | 20 | 2019-05-21 | 2019-05-21 | 25 | | 2 | xyzz_3 | 3 | 21 | 2019-05-26 | 2019-05-28 | 29 | | 2 | xyzz_3 | 1 | 21 | 2019-05-30 | 2019-05-30 | 30 | +--------+--------+----------+---------+------------+------------+------+ 7 rows in set (0.00 sec)
如上,利用MySQL
獲取了每周連續出現天數,感覺怎麼樣?