Hive Joins in Detail
- April 2, 2020
- Notes
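Lately I have been using Hive joins a lot, so I am summarizing the common join types here to see whether Hive's joins behave any differently from the ones in ordinary SQL. Create two files, ta.txt and tb.txt, and load their data into the tables ta and tb: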
hive (cfpd_ods_safe)> load data local inpath '/data/bdp/bdp_etl_deploy/hduser06/jaysonding/ta.txt' into table ta;
hive (cfpd_ods_safe)> load data local inpath '/data/bdp/bdp_etl_deploy/hduser06/jaysonding/tb.txt' into table tb;
Query the data:
hive (cfpd_ods_safe)> select * from ta;
OK
ta.uid
1111
2222
3333
4444
Time taken: 0.087 seconds, Fetched: 4 row(s)
hive (cfpd_ods_safe)> select * from tb;
OK
tb.uid
1111
2222
5555
Time taken: 0.183 seconds, Fetched: 3 row(s)
Now let's try the joins.
(1) Plain comma join:
hive (cfpd_ods_safe)> select * from ta,tb;
ta.uid  tb.uid
1111    1111
1111    2222
1111    5555
2222    1111
2222    2222
2222    5555
3333    1111
3333    2222
3333    5555
4444    1111
4444    2222
4444    5555
Time taken: 21.328 seconds, Fetched: 12 row(s)
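As you can see, a plain comma join with no condition returns the Cartesian product: 4 rows times 3 rows gives 12 rows. Now the same join with a condition:
hive (cfpd_ods_safe)> select * from ta,tb where ta.uid=tb.uid;
ta.uid  tb.uid
1111    1111
2222    2222
Time taken: 23.147 seconds, Fetched: 2 row(s)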
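The two comma-join results above can be reproduced without a Hive cluster. The following is a minimal sketch that uses SQLite through Python's sqlite3 module as a stand-in for Hive. The integer uid column is an assumption, since the post does not show the table definitions.

```python
import sqlite3

# In-memory SQLite database standing in for the Hive tables.
# Assumption: each table has a single integer uid column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ta (uid INTEGER)")
conn.execute("CREATE TABLE tb (uid INTEGER)")
conn.executemany("INSERT INTO ta VALUES (?)", [(1111,), (2222,), (3333,), (4444,)])
conn.executemany("INSERT INTO tb VALUES (?)", [(1111,), (2222,), (5555,)])

# Comma join with no condition: every row of ta is paired with every row of tb.
cartesian = conn.execute(
    "SELECT ta.uid, tb.uid FROM ta, tb ORDER BY ta.uid, tb.uid"
).fetchall()

# Comma join with a WHERE condition: only the matching pairs remain.
matched = conn.execute(
    "SELECT ta.uid, tb.uid FROM ta, tb WHERE ta.uid = tb.uid ORDER BY ta.uid"
).fetchall()

print(len(cartesian))  # 12
print(matched)         # [(1111, 1111), (2222, 2222)]
```

The ORDER BY clauses are only there to make the row order deterministic for comparison with the Hive output.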
(2) Inner join (inner join):
hive (cfpd_ods_safe)> select * from ta inner join tb on ta.uid=tb.uid;
ta.uid  tb.uid
1111    1111
2222    2222
Time taken: 21.597 seconds, Fetched: 2 row(s)
As you can see, inner join returns the same result as the comma join with a where condition.
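To check that equivalence mechanically, here is a small sketch that runs both forms of the query and compares the results. It again uses SQLite as a stand-in rather than Hive, with an assumed integer uid column.

```python
import sqlite3

# In-memory SQLite tables mirroring ta and tb (integer uid is an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (uid INTEGER);
    CREATE TABLE tb (uid INTEGER);
    INSERT INTO ta VALUES (1111), (2222), (3333), (4444);
    INSERT INTO tb VALUES (1111), (2222), (5555);
""")

# Explicit inner join with the condition in ON.
inner = conn.execute(
    "SELECT ta.uid, tb.uid FROM ta INNER JOIN tb ON ta.uid = tb.uid ORDER BY ta.uid"
).fetchall()

# Comma join with the same condition in WHERE.
comma = conn.execute(
    "SELECT ta.uid, tb.uid FROM ta, tb WHERE ta.uid = tb.uid ORDER BY ta.uid"
).fetchall()

print(inner == comma)  # True
```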
(3) Left join (left join):
hive (cfpd_ods_safe)> select * from ta left join tb on ta.uid=tb.uid;
ta.uid  tb.uid
1111    1111
2222    2222
3333    NULL
4444    NULL
Time taken: 22.921 seconds, Fetched: 4 row(s)
(5) Left outer join (left outer join):
hive (cfpd_ods_safe)> select * from ta left outer join tb on ta.uid=tb.uid;
ta.uid  tb.uid
1111    1111
2222    2222
3333    NULL
4444    NULL
Time taken: 22.637 seconds, Fetched: 4 row(s)
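The left join and left outer join outputs above are identical, which suggests that the outer keyword is optional. A sketch of the same comparison follows, using SQLite as a stand-in (not Hive) and an assumed integer uid column. SQL NULL shows up as Python's None.

```python
import sqlite3

# In-memory SQLite tables mirroring ta and tb (integer uid is an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (uid INTEGER);
    CREATE TABLE tb (uid INTEGER);
    INSERT INTO ta VALUES (1111), (2222), (3333), (4444);
    INSERT INTO tb VALUES (1111), (2222), (5555);
""")

# Every row of ta is kept; unmatched rows get NULL for tb.uid.
left = conn.execute(
    "SELECT ta.uid, tb.uid FROM ta LEFT JOIN tb ON ta.uid = tb.uid ORDER BY ta.uid"
).fetchall()

# Same query spelled with the optional OUTER keyword.
left_outer = conn.execute(
    "SELECT ta.uid, tb.uid FROM ta LEFT OUTER JOIN tb ON ta.uid = tb.uid ORDER BY ta.uid"
).fetchall()

print(left)                # [(1111, 1111), (2222, 2222), (3333, None), (4444, None)]
print(left == left_outer)  # True
```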
(6) Full join (full join):
hive (cfpd_ods_safe)> select * from ta full join tb on ta.uid=tb.uid;
ta.uid  tb.uid
1111    1111
2222    2222
3333    NULL
4444    NULL
NULL    5555
Time taken: 19.39 seconds, Fetched: 5 row(s)
(7) Full outer join (full outer join):
hive (cfpd_ods_safe)> select * from ta full outer join tb on ta.uid=tb.uid;
ta.uid  tb.uid
1111    1111
2222    2222
3333    NULL
4444    NULL
NULL    5555
Time taken: 20.414 seconds, Fetched: 5 row(s)
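One way to read the full join result is as the left join plus the rows of tb that found no match in ta. The sketch below builds it that way in SQLite (a stand-in, not Hive). The rewrite as LEFT JOIN plus UNION ALL is a swapped-in technique chosen here because older SQLite versions lack FULL JOIN; it is not how Hive executes the query. The integer uid column is an assumption.

```python
import sqlite3

# In-memory SQLite tables mirroring ta and tb (integer uid is an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (uid INTEGER);
    CREATE TABLE tb (uid INTEGER);
    INSERT INTO ta VALUES (1111), (2222), (3333), (4444);
    INSERT INTO tb VALUES (1111), (2222), (5555);
""")

# Full join emulated as: all rows of the left join,
# plus the rows of tb that have no partner in ta (padded with NULL).
rows = conn.execute("""
    SELECT ta.uid, tb.uid FROM ta LEFT JOIN tb ON ta.uid = tb.uid
    UNION ALL
    SELECT NULL, tb.uid FROM tb
    WHERE NOT EXISTS (SELECT 1 FROM ta WHERE ta.uid = tb.uid)
""").fetchall()

# Sort in Python so that rows with a NULL ta.uid come last, as in the Hive output.
full = sorted(rows, key=lambda r: (r[0] is None, r[0] or 0, r[1] or 0))

print(full)
# [(1111, 1111), (2222, 2222), (3333, None), (4444, None), (None, 5555)]
```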
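Conclusions:
(1) inner join gives the same result as a comma join with the equivalent where condition; the comma is effectively shorthand for inner join.
(2) Any join without a condition is a Cartesian product, as the plain comma join showed.
(3) left join and left outer join are the same, and full join and full outer join are the same. The same goes for right join and right outer join.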
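The post does not show a right join run, so the following is only an illustration of what standard SQL semantics would give for these two tables, not Hive output. It uses SQLite as a stand-in with an assumed integer uid column, and it writes the right join as a mirrored left join (tb left join ta with the columns listed in ta, tb order), because older SQLite versions do not support RIGHT JOIN.

```python
import sqlite3

# In-memory SQLite tables mirroring ta and tb (integer uid is an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ta (uid INTEGER);
    CREATE TABLE tb (uid INTEGER);
    INSERT INTO ta VALUES (1111), (2222), (3333), (4444);
    INSERT INTO tb VALUES (1111), (2222), (5555);
""")

# "ta RIGHT JOIN tb" keeps every row of tb; here it is expressed as
# "tb LEFT JOIN ta" with the select list kept in (ta.uid, tb.uid) order.
right = conn.execute(
    "SELECT ta.uid, tb.uid FROM tb LEFT JOIN ta ON ta.uid = tb.uid ORDER BY tb.uid"
).fetchall()

print(right)  # [(1111, 1111), (2222, 2222), (None, 5555)]
```

Every row of tb survives, and 5555, which has no partner in ta, is padded with NULL on the ta side.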