MySQL高階-統計每周連續出現天數

  • 2019 年 12 月 26 日
  • 筆記

還是昨天的問題,統計每周內問題小區連續出現天數,MySQL可以實現嗎?答案是肯定的,就是過程很曲折。 數據庫查詢語言基本都是針對整列的,不像excel是單元格粒度的,要判斷行與行之間的差異比較麻煩。 建表語句如下:

CREATE TABLE `badCells` (  `id` int(11) NOT NULL AUTO_INCREMENT,  `cellid` int(11) DEFAULT NULL,  `cgi` varchar(10) DEFAULT NULL,  `apper_time` date DEFAULT NULL,  PRIMARY KEY (`id`)  ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;  

插入數據

INSERT INTO `badCells` VALUES ('1', '1', 'abcd_1', '2019-05-18');  INSERT INTO `badCells` VALUES ('2', '1', 'abcd_1', '2019-05-19');  INSERT INTO `badCells` VALUES ('3', '1', 'abcd_1', '2019-05-20');  INSERT INTO `badCells` VALUES ('4', '1', 'abcd_1', '2019-05-21');  INSERT INTO `badCells` VALUES ('5', '1', 'abcd_1', '2019-05-22');  INSERT INTO `badCells` VALUES ('6', '1', 'abcd_1', '2019-05-23');  INSERT INTO `badCells` VALUES ('7', '1', 'abcd_1', '2019-05-25');  INSERT INTO `badCells` VALUES ('8', '1', 'abcd_1', '2019-05-26');  INSERT INTO `badCells` VALUES ('9', '2', 'xyzz_3', '2019-05-21');  INSERT INTO `badCells` VALUES ('10', '2', 'xyzz_3', '2019-05-26');  INSERT INTO `badCells` VALUES ('11', '2', 'xyzz_3', '2019-05-27');  INSERT INTO `badCells` VALUES ('12', '2', 'xyzz_3', '2019-05-28');  INSERT INTO `badCells` VALUES ('13', '2', 'xyzz_3', '2019-05-30');  

表是這樣的

mysql> select * from badCells;  +----+--------+--------+------------+  | id | cellid | cgi    | apper_time |  +----+--------+--------+------------+  |  1 |      1 | abcd_1 | 2019-05-18 |  |  2 |      1 | abcd_1 | 2019-05-19 |  |  3 |      1 | abcd_1 | 2019-05-20 |  |  4 |      1 | abcd_1 | 2019-05-21 |  |  5 |      1 | abcd_1 | 2019-05-22 |  |  6 |      1 | abcd_1 | 2019-05-23 |  |  7 |      1 | abcd_1 | 2019-05-25 |  |  8 |      1 | abcd_1 | 2019-05-26 |  |  9 |      2 | xyzz_3 | 2019-05-21 |  | 10 |      2 | xyzz_3 | 2019-05-26 |  | 11 |      2 | xyzz_3 | 2019-05-27 |  | 12 |      2 | xyzz_3 | 2019-05-28 |  | 13 |      2 | xyzz_3 | 2019-05-30 |  +----+--------+--------+------------+  13 rows in set (0.00 sec)  

現在使用一種算法來判決每周持續出現,用week()獲取日期所在周數,使用下圖的邏輯判斷連續出現

  • 首先獲取當前日期與一個較早的日期的差值
  • 計算表中大於等於當前日期的行數
  • 兩個值求和,求和相等則為連續出現,否則為間隔出現

SQL 呼之欲出

mysql> select      -> bc.cellid,      -> bc.cgi,      -> DATEDIFF(bc.apper_time, '2019-05-01' ) as diff,      -> (      -> select count(1)      -> from badCells bc2      -> where bc2.cgi = bc.cgi and DATEDIFF(bc2.apper_time, bc.apper_time)>=0      -> ) as rows      -> from badCells bc;  +--------+--------+------+------+  | cellid | cgi    | diff | rows |  +--------+--------+------+------+  |      1 | abcd_1 |   17 |    8 |  |      1 | abcd_1 |   18 |    7 |  |      1 | abcd_1 |   19 |    6 |  |      1 | abcd_1 |   20 |    5 |  |      1 | abcd_1 |   21 |    4 |  |      1 | abcd_1 |   22 |    3 |  |      1 | abcd_1 |   24 |    2 |  |      1 | abcd_1 |   25 |    1 |  |      2 | xyzz_3 |   20 |    5 |  |      2 | xyzz_3 |   25 |    4 |  |      2 | xyzz_3 |   26 |    3 |  |      2 | xyzz_3 |   27 |    2 |  |      2 | xyzz_3 |   29 |    1 |  +--------+--------+------+------+  13 rows in set (0.00 sec)  

得到了 diffrows ,將兩者相加,然後按照該列分組計數就是持續天數,再加上周數,就是每周持續天數

完整代碼如下:

mysql> select      -> bc.cellid,      -> bc.cgi,      -> count(*) as `con_days`,      -> week(bc.apper_time) as `weeknum`,      -> min(bc.apper_time) `start_time`,      -> max(bc.apper_time) `end_time`,      -> DATEDIFF(bc.apper_time, '2019-05-01')+      -> (      -> select count(1)      -> from badCells bc2      -> where bc2.cgi = bc.cgi and DATEDIFF(bc2.apper_time, bc.apper_time)>=0      -> ) as `flag`      -> from badCells bc      -> group by bc.cellid, bc.cgi, flag , weeknum      -> order by bc.cellid, min(bc.apper_time);  +--------+--------+----------+---------+------------+------------+------+  | cellid | cgi    | con_days | weeknum | start_time | end_time   | flag |  +--------+--------+----------+---------+------------+------------+------+  |      1 | abcd_1 |        1 |      19 | 2019-05-18 | 2019-05-18 |   25 |  |      1 | abcd_1 |        5 |      20 | 2019-05-19 | 2019-05-23 |   25 |  |      1 | abcd_1 |        1 |      20 | 2019-05-25 | 2019-05-25 |   26 |  |      1 | abcd_1 |        1 |      21 | 2019-05-26 | 2019-05-26 |   26 |  |      2 | xyzz_3 |        1 |      20 | 2019-05-21 | 2019-05-21 |   25 |  |      2 | xyzz_3 |        3 |      21 | 2019-05-26 | 2019-05-28 |   29 |  |      2 | xyzz_3 |        1 |      21 | 2019-05-30 | 2019-05-30 |   30 |  +--------+--------+----------+---------+------------+------------+------+  7 rows in set (0.00 sec)  

如上,利用MySQL獲取了每周連續出現天數,感覺怎麼樣?