MySQL高阶-统计每周连续出现天数

  • 2019 年 12 月 26 日
  • 筆記

还是昨天的问题,统计每周内问题小区连续出现天数,MySQL可以实现吗?答案是肯定的,就是过程很曲折。 数据库查询语言基本都是针对整列的,不像excel是单元格粒度的,要判断行与行之间的差异比较麻烦。 建表语句如下:

CREATE TABLE `badCells` (  `id` int(11) NOT NULL AUTO_INCREMENT,  `cellid` int(11) DEFAULT NULL,  `cgi` varchar(10) DEFAULT NULL,  `apper_time` date DEFAULT NULL,  PRIMARY KEY (`id`)  ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;  

插入数据

INSERT INTO `badCells` VALUES ('1', '1', 'abcd_1', '2019-05-18');  INSERT INTO `badCells` VALUES ('2', '1', 'abcd_1', '2019-05-19');  INSERT INTO `badCells` VALUES ('3', '1', 'abcd_1', '2019-05-20');  INSERT INTO `badCells` VALUES ('4', '1', 'abcd_1', '2019-05-21');  INSERT INTO `badCells` VALUES ('5', '1', 'abcd_1', '2019-05-22');  INSERT INTO `badCells` VALUES ('6', '1', 'abcd_1', '2019-05-23');  INSERT INTO `badCells` VALUES ('7', '1', 'abcd_1', '2019-05-25');  INSERT INTO `badCells` VALUES ('8', '1', 'abcd_1', '2019-05-26');  INSERT INTO `badCells` VALUES ('9', '2', 'xyzz_3', '2019-05-21');  INSERT INTO `badCells` VALUES ('10', '2', 'xyzz_3', '2019-05-26');  INSERT INTO `badCells` VALUES ('11', '2', 'xyzz_3', '2019-05-27');  INSERT INTO `badCells` VALUES ('12', '2', 'xyzz_3', '2019-05-28');  INSERT INTO `badCells` VALUES ('13', '2', 'xyzz_3', '2019-05-30');  

表是这样的

mysql> select * from badCells;  +----+--------+--------+------------+  | id | cellid | cgi    | apper_time |  +----+--------+--------+------------+  |  1 |      1 | abcd_1 | 2019-05-18 |  |  2 |      1 | abcd_1 | 2019-05-19 |  |  3 |      1 | abcd_1 | 2019-05-20 |  |  4 |      1 | abcd_1 | 2019-05-21 |  |  5 |      1 | abcd_1 | 2019-05-22 |  |  6 |      1 | abcd_1 | 2019-05-23 |  |  7 |      1 | abcd_1 | 2019-05-25 |  |  8 |      1 | abcd_1 | 2019-05-26 |  |  9 |      2 | xyzz_3 | 2019-05-21 |  | 10 |      2 | xyzz_3 | 2019-05-26 |  | 11 |      2 | xyzz_3 | 2019-05-27 |  | 12 |      2 | xyzz_3 | 2019-05-28 |  | 13 |      2 | xyzz_3 | 2019-05-30 |  +----+--------+--------+------------+  13 rows in set (0.00 sec)  

现在使用一种算法来判决每周持续出现,用week()获取日期所在周数,使用下图的逻辑判断连续出现

  • 首先获取当前日期与一个较早的日期的差值
  • 计算表中大于等于当前日期的行数
  • 两个值求和,求和相等则为连续出现,否则为间隔出现

SQL 呼之欲出

mysql> select      -> bc.cellid,      -> bc.cgi,      -> DATEDIFF(bc.apper_time, '2019-05-01' ) as diff,      -> (      -> select count(1)      -> from badCells bc2      -> where bc2.cgi = bc.cgi and DATEDIFF(bc2.apper_time, bc.apper_time)>=0      -> ) as rows      -> from badCells bc;  +--------+--------+------+------+  | cellid | cgi    | diff | rows |  +--------+--------+------+------+  |      1 | abcd_1 |   17 |    8 |  |      1 | abcd_1 |   18 |    7 |  |      1 | abcd_1 |   19 |    6 |  |      1 | abcd_1 |   20 |    5 |  |      1 | abcd_1 |   21 |    4 |  |      1 | abcd_1 |   22 |    3 |  |      1 | abcd_1 |   24 |    2 |  |      1 | abcd_1 |   25 |    1 |  |      2 | xyzz_3 |   20 |    5 |  |      2 | xyzz_3 |   25 |    4 |  |      2 | xyzz_3 |   26 |    3 |  |      2 | xyzz_3 |   27 |    2 |  |      2 | xyzz_3 |   29 |    1 |  +--------+--------+------+------+  13 rows in set (0.00 sec)  

得到了 diffrows ,将两者相加,然后按照该列分组计数就是持续天数,再加上周数,就是每周持续天数

完整代码如下:

mysql> select      -> bc.cellid,      -> bc.cgi,      -> count(*) as `con_days`,      -> week(bc.apper_time) as `weeknum`,      -> min(bc.apper_time) `start_time`,      -> max(bc.apper_time) `end_time`,      -> DATEDIFF(bc.apper_time, '2019-05-01')+      -> (      -> select count(1)      -> from badCells bc2      -> where bc2.cgi = bc.cgi and DATEDIFF(bc2.apper_time, bc.apper_time)>=0      -> ) as `flag`      -> from badCells bc      -> group by bc.cellid, bc.cgi, flag , weeknum      -> order by bc.cellid, min(bc.apper_time);  +--------+--------+----------+---------+------------+------------+------+  | cellid | cgi    | con_days | weeknum | start_time | end_time   | flag |  +--------+--------+----------+---------+------------+------------+------+  |      1 | abcd_1 |        1 |      19 | 2019-05-18 | 2019-05-18 |   25 |  |      1 | abcd_1 |        5 |      20 | 2019-05-19 | 2019-05-23 |   25 |  |      1 | abcd_1 |        1 |      20 | 2019-05-25 | 2019-05-25 |   26 |  |      1 | abcd_1 |        1 |      21 | 2019-05-26 | 2019-05-26 |   26 |  |      2 | xyzz_3 |        1 |      20 | 2019-05-21 | 2019-05-21 |   25 |  |      2 | xyzz_3 |        3 |      21 | 2019-05-26 | 2019-05-28 |   29 |  |      2 | xyzz_3 |        1 |      21 | 2019-05-30 | 2019-05-30 |   30 |  +--------+--------+----------+---------+------------+------------+------+  7 rows in set (0.00 sec)  

如上,利用MySQL获取了每周连续出现天数,感觉怎么样?