Hive踩过的坑-hive having count 不能去重 发布于 1 年前 (2018年11月26日) – 4

hive在group by xx having count()是不支持去重计数的。

不信的话,你们可以试下类似的代码?

SELECT IMEI,IP,FT,LT,CASE WHEN FT = LT THEN 1 ELSE 0 END IS_O_DAY  FROM temp_gz23g_IP_1878910_2 b  WHERE b.IP IN (  SELECT a.IP FROM temp_gz23g_IP_1878910_2 a WHERE a.IP !=''  GROUP BY a.IP HAVING COUNT(DISTINCT a.IMEI) = 1)

然后在hive执行的话,会报错:Error: Error while compiling statement: FAILED: SemanticException Line 6:36 Invalid column reference 'IMEI' in definition of SubQuery sq_1

如果想执行的话,那就在里面套一层:

SELECT IMEI,ip,FT,LT,CASE WHEN FT = LT THEN 1 ELSE 0 END IS_O_DAY  FROM temp_gz23g_IP_1878910_2  WHERE ip IN(  SELECT ip FROM(  SELECT ip,COUNT(DISTINCT IMEI) d_imei_num FROM temp_gz23g_IP_1878910_2  WHERE ip !='' GROUP BY ip)a where d_imei_num=1)

好了.记录完毕..