【DB笔试面试590】在Oracle中,什么是反连接(Anti Join)?

  • 2019 年 10 月 10 日
  • 笔记

题目部分

在Oracle中,什么是反连接(Anti Join)?

答案部分

反连接(Anti Join)也是一种特殊的连接类型,通常用于从一个表中返回不在另一个数据源中的数据行。当做子查询展开时,Oracle经常会把那些外部WHERE条件为NOT EXISTS、NOT IN或<> ALL的子查询转换成对应的反连接。反连接分为嵌套循环反连接(NESTED LOOPS ANTI,Hint为:NL_AJ)、排序合并反连接(MERGE JOIN ANTI,Hint为:MERGE_AJ)和哈希反连接(HASH JOIN ANTI,Hint为:HASH_AJ)。示例如下所示:

CREATE TABLE EMP AS SELECT * FROM SCOTT.EMP;  CREATE TABLE DEPT AS SELECT * FROM SCOTT.DEPT;  SELECT * FROM EMP A WHERE NOT EXISTS(SELECT 1 FROM DEPT WHERE DEPTNO=A.DEPTNO);  ---------------------------------------------------------------------------  | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |  ---------------------------------------------------------------------------  |   0 | SELECT STATEMENT   |      |     2 |    84 |     5  (20)| 00:00:01 |  |*  1 |  HASH JOIN ANTI    |      |     2 |    84 |     5  (20)| 00:00:01 |  |   2 |   TABLE ACCESS FULL| EMP  |    12 |   468 |     2   (0)| 00:00:01 |  |   3 |   TABLE ACCESS FULL| DEPT |     4 |    12 |     2   (0)| 00:00:01 |  ---------------------------------------------------------------------------  SELECT * FROM EMP A WHERE NOT EXISTS(SELECT /*+NL_AJ*/ 1 FROM DEPT WHERE DEPTNO=A.DEPTNO);  ---------------------------------------------------------------------------  | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |  ---------------------------------------------------------------------------  |   0 | SELECT STATEMENT   |      |     2 |    84 |     7   (0)| 00:00:01 |  |   1 |  NESTED LOOPS ANTI |      |     2 |    84 |     7   (0)| 00:00:01 |  |   2 |   TABLE ACCESS FULL| EMP  |    12 |   468 |     2   (0)| 00:00:01 |  |*  3 |   TABLE ACCESS FULL| DEPT |     4 |    12 |     0   (0)| 00:00:01 |  ---------------------------------------------------------------------------  SELECT * FROM EMP A WHERE NOT EXISTS(SELECT /*+MERGE_AJ*/ 1 FROM DEPT WHERE DEPTNO=A.DEPTNO);  ----------------------------------------------------------------------------  | Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |  ----------------------------------------------------------------------------  |   0 | SELECT STATEMENT    |      |     2 |    84 |     6  (34)| 00:00:01 |  |   1 |  MERGE JOIN ANTI    |      |     2 |    84 |     6  (34)| 00:00:01 |  |   2 |   SORT JOIN         |      |    12 |   468 |     3  (34)| 00:00:01 |  |   3 |    TABLE ACCESS FULL| EMP  |    12 |   468 |     2   (0)| 00:00:01 |  |*  4 |   SORT UNIQUE       |      |     4 |    12 |     3  (34)| 00:00:01 |  |   5 |    TABLE ACCESS FULL| DEPT |     4 |    12 |     2   (0)| 00:00:01 |  ----------------------------------------------------------------------------  

需要注意的是,NOT IN和<> ALL对NULL值敏感,这意味着NOT IN后面的子查询或者常量集合一旦有NULL值出现,则整个SQL的执行结果就会为NULL,即此时的执行结果将不包含任何记录。但是,NOT EXISTS对NULL值不敏感,这意味着NULL值对NOT EXISTS的执行结果不会有什么影响。正是因为NOT IN和<> ALL对NULL值敏感,所以一旦相关的连接列上出现了NULL值,此时Oracle如果还按照通常的反连接的处理逻辑来处理,得到的结果就不对了。为了解决NOT IN和<> ALL对NULL值敏感的问题,Oracle推出了改良的反连接,这种反连接能够处理NULL值,Oracle称其为Null-Aware Anti Join,如下例:

SELECT * FROM DEPT A WHERE DEPTNO NOT IN (SELECT DEPTNO FROM EMP);  ---------------------------------------------------------------------------  | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |  ---------------------------------------------------------------------------  |   0 | SELECT STATEMENT   |      |     1 |    23 |     5  (20)| 00:00:01 |  |*  1 |  HASH JOIN ANTI NA |      |     1 |    23 |     5  (20)| 00:00:01 |  |   2 |   TABLE ACCESS FULL| DEPT |     4 |    80 |     2   (0)| 00:00:01 |  |   3 |   TABLE ACCESS FULL| EMP  |    12 |    36 |     2   (0)| 00:00:01 |  ---------------------------------------------------------------------------  

执行步骤的列Operation的值为“HASH JOIN ANTI NA”,关键字“NA”就是Null-Aware的缩写,表示这里采用的不是普通的哈希反连接,而是改良后的、能够处理NULL值的哈希反连接。

在Oracle 11gR2中,Oracle是否启用Null-Aware Anti Join受隐含参数“_OPTIMIZER_NULL_AWARE_ANTIJOIN”控制,其默认值为TRUE,表示启用Null-Aware Anti Join。如果把该参数的值修改为FALSE,那么表示Oracle就不能再用Null-Aware Anti Join了,而又因为NOT IN对NULL值敏感,所以Oracle此时也不能用普通的反连接。关于该隐含参数的查询如下所示:

SYS@orclasm > set pagesize 9999  SYS@orclasm > set line 9999  SYS@orclasm > col NAME format a40  SYS@orclasm > col KSPPDESC format a50  SYS@orclasm > col KSPPSTVL format a20  SYS@orclasm > SELECT a.INDX,    2         a.KSPPINM NAME,    3         a.KSPPDESC,    4         b.KSPPSTVL    5  FROM   x$ksppi  a,    6         x$ksppcv b    7  WHERE  a.INDX = b.INDX    8  and lower(a.KSPPINM) like  lower('%&parameter%');  Enter value for parameter: _OPTIMIZER_NULL_AWARE_ANTIJOIN  old   8: and lower(a.KSPPINM) like  lower('%&parameter%')  new   8: and lower(a.KSPPINM) like  lower('%_OPTIMIZER_NULL_AWARE_ANTIJOIN%')          INDX NAME                                     KSPPDESC                                           KSPPSTVL  ---------- ---------------------------------------- -------------------------------------------------- --------------------        1907 _optimizer_null_aware_antijoin           null-aware antijoin parameter                      TRUE  

本文选自《Oracle程序员面试笔试宝典》,作者:李华荣。