MySQL高阶-统计每周连续出现天数
- 2019 年 12 月 26 日
- 筆記
还是昨天的问题,统计每周内问题小区连续出现天数,MySQL可以实现吗?答案是肯定的,就是过程很曲折。 数据库查询语言基本都是针对整列的,不像excel是单元格粒度的,要判断行与行之间的差异比较麻烦。 建表语句如下:
CREATE TABLE `badCells` ( `id` int(11) NOT NULL AUTO_INCREMENT, `cellid` int(11) DEFAULT NULL, `cgi` varchar(10) DEFAULT NULL, `apper_time` date DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
插入数据
INSERT INTO `badCells` VALUES ('1', '1', 'abcd_1', '2019-05-18'); INSERT INTO `badCells` VALUES ('2', '1', 'abcd_1', '2019-05-19'); INSERT INTO `badCells` VALUES ('3', '1', 'abcd_1', '2019-05-20'); INSERT INTO `badCells` VALUES ('4', '1', 'abcd_1', '2019-05-21'); INSERT INTO `badCells` VALUES ('5', '1', 'abcd_1', '2019-05-22'); INSERT INTO `badCells` VALUES ('6', '1', 'abcd_1', '2019-05-23'); INSERT INTO `badCells` VALUES ('7', '1', 'abcd_1', '2019-05-25'); INSERT INTO `badCells` VALUES ('8', '1', 'abcd_1', '2019-05-26'); INSERT INTO `badCells` VALUES ('9', '2', 'xyzz_3', '2019-05-21'); INSERT INTO `badCells` VALUES ('10', '2', 'xyzz_3', '2019-05-26'); INSERT INTO `badCells` VALUES ('11', '2', 'xyzz_3', '2019-05-27'); INSERT INTO `badCells` VALUES ('12', '2', 'xyzz_3', '2019-05-28'); INSERT INTO `badCells` VALUES ('13', '2', 'xyzz_3', '2019-05-30');
表是这样的
mysql> select * from badCells; +----+--------+--------+------------+ | id | cellid | cgi | apper_time | +----+--------+--------+------------+ | 1 | 1 | abcd_1 | 2019-05-18 | | 2 | 1 | abcd_1 | 2019-05-19 | | 3 | 1 | abcd_1 | 2019-05-20 | | 4 | 1 | abcd_1 | 2019-05-21 | | 5 | 1 | abcd_1 | 2019-05-22 | | 6 | 1 | abcd_1 | 2019-05-23 | | 7 | 1 | abcd_1 | 2019-05-25 | | 8 | 1 | abcd_1 | 2019-05-26 | | 9 | 2 | xyzz_3 | 2019-05-21 | | 10 | 2 | xyzz_3 | 2019-05-26 | | 11 | 2 | xyzz_3 | 2019-05-27 | | 12 | 2 | xyzz_3 | 2019-05-28 | | 13 | 2 | xyzz_3 | 2019-05-30 | +----+--------+--------+------------+ 13 rows in set (0.00 sec)
现在使用一种算法来判决每周持续出现,用week()获取日期所在周数,使用下图的逻辑判断连续出现

- 首先获取当前日期与一个较早的日期的差值
- 计算表中大于等于当前日期的行数
- 两个值求和,求和相等则为连续出现,否则为间隔出现
SQL 呼之欲出
mysql> select -> bc.cellid, -> bc.cgi, -> DATEDIFF(bc.apper_time, '2019-05-01' ) as diff, -> ( -> select count(1) -> from badCells bc2 -> where bc2.cgi = bc.cgi and DATEDIFF(bc2.apper_time, bc.apper_time)>=0 -> ) as rows -> from badCells bc; +--------+--------+------+------+ | cellid | cgi | diff | rows | +--------+--------+------+------+ | 1 | abcd_1 | 17 | 8 | | 1 | abcd_1 | 18 | 7 | | 1 | abcd_1 | 19 | 6 | | 1 | abcd_1 | 20 | 5 | | 1 | abcd_1 | 21 | 4 | | 1 | abcd_1 | 22 | 3 | | 1 | abcd_1 | 24 | 2 | | 1 | abcd_1 | 25 | 1 | | 2 | xyzz_3 | 20 | 5 | | 2 | xyzz_3 | 25 | 4 | | 2 | xyzz_3 | 26 | 3 | | 2 | xyzz_3 | 27 | 2 | | 2 | xyzz_3 | 29 | 1 | +--------+--------+------+------+ 13 rows in set (0.00 sec)
得到了 diff 和 rows ,将两者相加,然后按照该列分组计数就是持续天数,再加上周数,就是每周持续天数
完整代码如下:
mysql> select -> bc.cellid, -> bc.cgi, -> count(*) as `con_days`, -> week(bc.apper_time) as `weeknum`, -> min(bc.apper_time) `start_time`, -> max(bc.apper_time) `end_time`, -> DATEDIFF(bc.apper_time, '2019-05-01')+ -> ( -> select count(1) -> from badCells bc2 -> where bc2.cgi = bc.cgi and DATEDIFF(bc2.apper_time, bc.apper_time)>=0 -> ) as `flag` -> from badCells bc -> group by bc.cellid, bc.cgi, flag , weeknum -> order by bc.cellid, min(bc.apper_time); +--------+--------+----------+---------+------------+------------+------+ | cellid | cgi | con_days | weeknum | start_time | end_time | flag | +--------+--------+----------+---------+------------+------------+------+ | 1 | abcd_1 | 1 | 19 | 2019-05-18 | 2019-05-18 | 25 | | 1 | abcd_1 | 5 | 20 | 2019-05-19 | 2019-05-23 | 25 | | 1 | abcd_1 | 1 | 20 | 2019-05-25 | 2019-05-25 | 26 | | 1 | abcd_1 | 1 | 21 | 2019-05-26 | 2019-05-26 | 26 | | 2 | xyzz_3 | 1 | 20 | 2019-05-21 | 2019-05-21 | 25 | | 2 | xyzz_3 | 3 | 21 | 2019-05-26 | 2019-05-28 | 29 | | 2 | xyzz_3 | 1 | 21 | 2019-05-30 | 2019-05-30 | 30 | +--------+--------+----------+---------+------------+------------+------+ 7 rows in set (0.00 sec)
如上,利用MySQL获取了每周连续出现天数,感觉怎么样?


