【DB筆試面試677】在Oracle中,對於一個NUMBER(1)的列,若WHERE條件是大於3和大於等於4,這二者是否等價?
- 2019 年 11 月 5 日
- 筆記
題目部分
在Oracle中,對於一個NUMBER(1)的列,如果查詢中的WHERE條件分別是大於3和大於等於4,那麼這二者是否等價?
♣
答案部分
首先對於查詢結果而言,二者沒有任何區別。從這一點上講無論是指定大於3還是指定大於等於4,二者結果都是一樣的。但是,結果集一樣並不代表二者等價,主要表現為以下幾點:
① 在CHECK約束下,如果表屬於非SYS用戶,那麼大於3會執行全表掃描;而大於等於4在經過CHECK約束的檢查後,通過FILTER結束查詢,能夠更高效地返回結果,不用掃描全表。如果表屬於SYS用戶,那麼這二者的執行計劃是相同的。因為,若表屬於非SYS用戶,則最終經過查詢轉換後的執行SQL為:「SELECT "T_NUM1_LHR"."ID" "ID" FROM "SYS"."T_NUM1_LHR" "T_NUM1_LHR" WHERE "T_NUM1_LHR"."ID">=4 AND 4>4」,而若表屬於SYS用戶,則最終經過查詢轉換後的執行SQL為:「SELECT "T_NUM1_LHR"."ID" "ID" FROM "LHR"."T_NUM1_LHR" "T_NUM1_LHR" WHERE "T_NUM1_LHR"."ID">=4」,所以,在非SYS用戶下,最終的執行計劃中會有「filter(NULL IS NOT NULL)」的謂詞條件。
② 在使用索引的時候,由於Oracle索引結構的特點,兩者掃描的節點都是從4開始,在執行計劃、邏輯讀和執行時間等各方面都不存在性能差異。
③ 在使用物化視圖的過程中,大於3會同時掃描物化視圖和原表,效率較低;而大於等於4會直接掃描物化視圖,效率較高。
由此可見,在返回結果集相同的情況下,使用大於等於代替大於在某些特殊情況下可以帶來SQL語句性能上的提升。總結一下,如下圖所示:

對於這幾種情況分別實驗如下:
SYS@orclasm > select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production PL/SQL Release 11.2.0.3.0 - Production CORE 11.2.0.3.0 Production TNS for Linux: Version 11.2.0.3.0 - Production NLSRTL Version 11.2.0.3.0 - Production
(一)在CHECK約束下,二者的執行計劃是不一樣的。
DROP TABLE T_NUM1_LHR; CREATE TABLE T_NUM1_LHR(ID NUMBER(1)); ALTER TABLE T_NUM1_LHR ADD CHECK(ID <4); SET AUTOT ON SELECT * FROM T_NUM1_LHR WHERE ID>3; SELECT * FROM T_NUM1_LHR WHERE ID>=4; LHR@orclasm > SELECT * FROM T_NUM1_LHR WHERE ID>3; no rows selected Elapsed: 00:00:00.00 Execution Plan ---------------------------------------------------------- Plan hash value: 2700622406 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| T_NUM1_LHR | 1 | 13 | 2 (0)| 00:00:01 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("ID">3) Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 0 consistent gets 0 physical reads 0 redo size 330 bytes sent via SQL*Net to client 509 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed LHR@orclasm > SELECT * FROM T_NUM1_LHR WHERE ID>=4; no rows selected Elapsed: 00:00:00.00 Execution Plan ---------------------------------------------------------- Plan hash value: 3764107410 --------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 0 (0)| | |* 1 | FILTER | | | | | | |* 2 | TABLE ACCESS FULL| T_NUM1_LHR | 1 | 13 | 2 (0)| 00:00:01 | --------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(NULL IS NOT NULL) 2 - filter("ID">=4) Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 0 consistent gets 0 physical reads 0 redo size 330 bytes sent via SQL*Net to client 509 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed
如果表中恰好有上面的CHECK約束,那麼可以發現,對於大於3和大於等於4這兩個SQL的執行計劃是不一致的。對於後者,由於查詢的條件違反了CHECK約束,因此Oracle在執行計劃前面增加了一個FILTER,使得整個查詢不需要在執行,因此這個查詢不管表中數據有多少,都會在瞬間結束。
而對於大於3這種情況,雖然根據CHECK的約束和列定義,可以推斷出這條查詢不會返回任何記錄,但是Oracle的優化器並沒有聰明到根據列的精度來進行分析,因此這個查詢會執行全表掃描。也就是說,雖然這兩個查詢的最終結果一樣,但是執行計劃並不相同,而且對於大表而言,這種情況下性能也有較大的差別。
當然這種CHECK約束是特例的情況,一般情況下不會出現。原則上到底是選擇大於3還是大於等於4,應該根據具體的業務來決定,而不要嘗試利用Oracle的數據精度來設置查詢條件。如果以後一旦欄位的結構發生了修改,比如這個例子中欄位的允許出現小數,那麼這兩個SQL的WHERE條件就不再等價了。
若表屬於SYS用戶,則這二者的執行計劃是相同的。
下面通過10053事件查看具體原因:
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1'; SELECT * FROM T_NUM1_LHR WHERE ID >= 4; ALTER SESSION SET EVENTS '10053 trace name context off'; SELECT VALUE FROM V$DIAG_INFO WHERE NAME='Default Trace File';
SYS用戶:
try to generate transitive predicate from check constraints for query block SEL$1 (#0) finally: "T_NUM1_LHR"."ID">=4 apadrv-start sqlid=4141557682765762850 : call(in-use=1400, alloc=16344), compile(in-use=54632, alloc=55568), execution(in-use=2480, alloc=4032) ******************************************* Peeked values of the binds in SQL statement ******************************************* Final query after transformations:******* UNPARSED QUERY IS ******* SELECT "T_NUM1_LHR"."ID" "ID" FROM "SYS"."T_NUM1_LHR" "T_NUM1_LHR" WHERE "T_NUM1_LHR"."ID">=4 kkoqbc: optimizing query block SEL$1 (#0)
普通用戶:
try to generate transitive predicate from check constraints for query block SEL$1 (#0) constraint: "T_NUM1_LHR"."ID"<4 finally: "T_NUM1_LHR"."ID">=4 AND 4>4 FPD: transitive predicates are generated in query block SEL$1 (#0) "T_NUM1_LHR"."ID">=4 AND 4>4 apadrv-start sqlid=11964066854041036881 : call(in-use=1696, alloc=16344), compile(in-use=55176, alloc=58488), execution(in-use=2744, alloc=4032) ******************************************* Peeked values of the binds in SQL statement ******************************************* Final query after transformations:******* UNPARSED QUERY IS ******* SELECT "T_NUM1_LHR"."ID" "ID" FROM "LHR"."T_NUM1_LHR" "T_NUM1_LHR" WHERE "T_NUM1_LHR"."ID">=4 AND 4>4 kkoqbc: optimizing query block SEL$1 (#0)
(二)在有索引的情況下,二者的性能是否有差異
DROP TABLE T_NUM2_LHR; CREATE TABLE T_NUM2_LHR(ID NUMBER,NAME VARCHAR2(30)); CREATE INDEX IND_TNUM2_ID ON T_NUM2_LHR(ID); INSERT INTO T_NUM2_LHR SELECT 3,OBJECT_NAME FROM DBA_OBJECTS; INSERT INTO T_NUM2_LHR SELECT * FROM T_NUM2_LHR; INSERT INTO T_NUM2_LHR SELECT * FROM T_NUM2_LHR; INSERT INTO T_NUM2_LHR SELECT * FROM T_NUM2_LHR; INSERT INTO T_NUM2_LHR SELECT * FROM T_NUM2_LHR; COMMIT; INSERT INTO T_NUM2_LHR VALUES(4,'test'); COMMIT; SET TIMING ON SET AUTOT ON SELECT * FROM T_NUM2_LHR WHERE ID>3; SELECT * FROM T_NUM2_LHR WHERE ID>=4; LHR@orclasm > SELECT * FROM T_NUM2_LHR WHERE ID>3; ID NAME ---------- ------------------------------ 4 test Elapsed: 00:00:00.00 Execution Plan ---------------------------------------------------------- Plan hash value: 4021107501 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 30 | 1 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T_NUM2_LHR | 1 | 30 | 1 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IND_TNUM2_ID | 1 | | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ID">3) Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 595 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed LHR@orclasm > SELECT * FROM T_NUM2_LHR WHERE ID>=4; ID NAME ---------- ------------------------------ 4 test Elapsed: 00:00:00.00 Execution Plan ---------------------------------------------------------- Plan hash value: 4021107501 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 30 | 1 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T_NUM2_LHR | 1 | 30 | 1 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IND_TNUM2_ID | 1 | | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ID">=4) Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 595 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
可以看到,無論是執行時間,還是邏輯讀,兩個SQL沒有任何的差別。根據Oracle索引結構的特點,無論是大於3還是大於等於4,這二者的查詢所掃描的葉節點都是同一個,因此,在這一點上不會存在性能的差別。
(三)在使用物化視圖上的差別
如果表上建立了可查詢重寫的物化視圖,那麼這兩個查詢在是否使用物化視圖上有所差別。
CREATE TABLE T_NUM3_LHR(ID NUMBER,NUM NUMBER(1)); ALTER TABLE T_NUM3_LHR ADD PRIMARY KEY(ID); INSERT INTO T_NUM3_LHR SELECT ROWNUM,MOD(ROWNUM,4) FROM DBA_OBJECTS; INSERT INTO T_NUM3_LHR SELECT ROWNUM+54916,MOD(ROWNUM,4) FROM T_NUM3_LHR; INSERT INTO T_NUM3_LHR SELECT ROWNUM+109832,MOD(ROWNUM,4) FROM T_NUM3_LHR; INSERT INTO T_NUM3_LHR SELECT ROWNUM+219664,MOD(ROWNUM,4) FROM T_NUM3_LHR; INSERT INTO T_NUM3_LHR SELECT ROWNUM+439328,MOD(ROWNUM,4) FROM T_NUM3_LHR; COMMIT; INSERT INTO T_NUM3_LHR VALUES(1000000,4); COMMIT; SET AUTOT ON SELECT * FROM T_NUM3_LHR WHERE NUM>3; SELECT * FROM T_NUM3_LHR WHERE NUM>=4; LHR@orclasm > SET AUTOT ON LHR@orclasm > SELECT * FROM T_NUM3_LHR WHERE NUM>3; ID NUM ---------- ---------- 1000000 4 Elapsed: 00:00:00.01 Execution Plan ---------------------------------------------------------- Plan hash value: 621453705 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 12 | 312 | 314 (3)| 00:00:04 | |* 1 | TABLE ACCESS FULL| T_NUM3_LHR | 12 | 312 | 314 (3)| 00:00:04 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("NUM">3) Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 1 db block gets 1150 consistent gets 0 physical reads 0 redo size 588 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed LHR@orclasm > SELECT * FROM T_NUM3_LHR WHERE NUM>=4; ID NUM ---------- ---------- 1000000 4 Elapsed: 00:00:00.01 Execution Plan ---------------------------------------------------------- Plan hash value: 621453705 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 12 | 312 | 314 (3)| 00:00:04 | |* 1 | TABLE ACCESS FULL| T_NUM3_LHR | 12 | 312 | 314 (3)| 00:00:04 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("NUM">=4) Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 1 db block gets 1150 consistent gets 0 physical reads 0 redo size 588 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
由於採用的都是全表掃描,二者執行的時間和邏輯讀完全一樣。
下面建立一個物化視圖:
SET AUTOT OFF CREATE MATERIALIZED VIEW LOG ON T_NUM3_LHR WITH(NUM); CREATE MATERIALIZED VIEW MV_T_NUM3_LHR REFRESH FAST ENABLE QUERY REWRITE AS SELECT ID,NUM FROM T_NUM3_LHR WHERE NUM>=4; LHR@orclasm > SET AUTOT OFF LHR@orclasm > CREATE MATERIALIZED VIEW LOG ON T_NUM3_LHR WITH(NUM); Materialized view log created. LHR@orclasm > CREATE MATERIALIZED VIEW MV_T_NUM3_LHR REFRESH FAST ENABLE QUERY REWRITE AS SELECT ID,NUM FROM T_NUM3_LHR WHERE NUM>=4; Materialized view created. LHR@orclasm > show parameter query NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ query_rewrite_enabled string TRUE query_rewrite_integrity string enforced LHR@orclasm > SET AUTOT ON LHR@orclasm > SELECT * FROM T_NUM3_LHR WHERE NUM>3; ID NUM ---------- ---------- 1000000 4 Elapsed: 00:00:00.01 Execution Plan ---------------------------------------------------------- SELECT * FROM T_NUM3_LHR WHERE NUM>=4; Plan hash value: 4012093353 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 13 | 338 | 317 (3)| 00:00:04 | | 1 | VIEW | | 13 | 338 | 317 (3)| 00:00:04 | | 2 | UNION-ALL | | | | | | | 3 | MAT_VIEW REWRITE ACCESS FULL| MV_T_NUM3_LHR | 1 | 26 | 3 (0)| 00:00:01 | |* 4 | TABLE ACCESS FULL | T_NUM3_LHR | 12 | 312 | 314 (3)| 00:00:04 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 4 - filter("NUM">3 AND "NUM"<4) Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 1 db block gets 1153 consistent gets 0 physical reads 0 redo size 588 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed LHR@orclasm > ID NUM ---------- ---------- 1000000 4 Elapsed: 00:00:00.00 Execution Plan ---------------------------------------------------------- Plan hash value: 4274348025 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 26 | 3 (0)| 00:00:01 | | 1 | MAT_VIEW REWRITE ACCESS FULL| MV_T_NUM3_LHR | 1 | 26 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 592 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
從執行計劃可以看到,對於大於等於4的情況,Oracle直接掃描了物化視圖了。而對於大於3的情況,Oracle同時掃描了物化視圖和原表,顯然效率比較低。
這個例子其實和第一個例子很類似。雖然根據欄位類型可以判斷出大於3和大於等於4是等價的,但是對於CBO來說,並不會將數據類型的因素考慮進去。因此導致兩個查詢在使用物化視圖時執行計劃的區別。
本文選自《Oracle程式設計師面試筆試寶典》,作者:小麥苗