【DB筆試面試591】在Oracle中,什麼是星型連接(Star Join)?

  • 2019 年 10 月 10 日
  • 筆記

題目部分

在Oracle中,什麼是星型連接(Star Join)?

答案部分

星型連接(Star Join)通常用於數據倉庫類型的應用,它是一種單個事實表(Fact Table)和多個維度表(Dimension Table)之間的連接。星型連接的各維度表之間沒有直接的關聯條件,其事實表和各維度表之間是基於事實表的外鍵列和對應維度表的主鍵列之間的連接,並且通常在事實表的外鍵列上還會存在對應的點陣圖索引。星型轉換可以有效改善大的事實表與多個具有良好選擇率的維度表間連接的查詢,從而有效避免了全表掃描的性能窘境。

星型轉換由初始化參數STAR_TRANSFORMATION_ENABLED控制,該參數可以有三種選項:

① TRUE:CBO優化器自動識別語句中的事實表和約束維度表並進行星型轉換。CBO優化器需要確定轉換後的執行計劃成本要低於不轉換的執行計劃.如果利用物化的臨時表性能更高,那麼CBO優化器還會嘗試利用物化的臨時表。

② FALSE:優化器不會考慮星型轉換,為參數STAR_TRANSFORMATION_ENABLED的默認值。

③ TEMP_DISABLE:當一個維度表超過100個塊時,如果簡單地設置STAR_TRANSFORMATION_ENABLED為TRUE來啟用星型變換,那麼會話會創建一個記憶體中的全局臨時表(Global Temporary Table)來保存已過濾的維度數據,這在過去會造成很多問題。這裡所說的100個塊其實是由隱含參數「_temp_tran_block_threshold」(number of blocks for a dimension before we temp transform)來控制的。此外,隱含參數「_temp_tran_cache」(determines if temp table is created with cache option,默認為TRUE)決定了這類臨時表是否被快取住。為了避免創建全局臨時表可能帶來的問題,就可以用到TEMP_DISABLE這個禁用臨時表的選項,讓優化器不再考慮使用物化的臨時表。

參數STAR_TRANSFORMATION_ENABLED的默認值為FALSE,因為星型轉換適用的場景是數據倉庫環境中具有星型模型的模式,而且需要事實表的各個連接列上均有良好的索引時才能發揮其優勢。如果能確定以上因素,那麼就可以使用星型轉換。

下面給出一個示例:

SYS@orclasm > show parameter star_transformation_enabled    NAME                                 TYPE        VALUE  ------------------------------------ ----------- ------------------------------  star_transformation_enabled          string      FALSE    SYS@orclasm > conn sh/sh  Connected.  SH@orclasm > set autotr trace exp;  SH@orclasm > SELECT CH.CHANNEL_CLASS,    2         C.CUST_CITY,    3         T.CALENDAR_QUARTER_DESC,    4         SUM(S.AMOUNT_SOLD) SALES_AMOUNT    5    FROM SALES S, TIMES T, CUSTOMERS C, CHANNELS CH    6   WHERE S.TIME_ID = T.TIME_ID    7     AND S.CUST_ID = C.CUST_ID    8     AND S.CHANNEL_ID = CH.CHANNEL_ID    9     AND C.CUST_STATE_PROVINCE = 'CA'   10     AND CH.CHANNEL_DESC IN ('Internet', 'Catalog')   11     AND T.CALENDAR_QUARTER_DESC IN ('1999-Q1', '1999-Q2')   12   GROUP BY CH.CHANNEL_CLASS, C.CUST_CITY, T.CALENDAR_QUARTER_DESC;    Execution Plan  ----------------------------------------------------------    Plan hash value: 593420798    -------------------------------------------------------------------------------------------------------------  | Id  | Operation                       | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |  -------------------------------------------------------------------------------------------------------------  |   0 | SELECT STATEMENT                |           |   572 | 48048 |   964   (3)| 00:00:12 |       |       |  |   1 |  HASH GROUP BY                  |           |   572 | 48048 |   964   (3)| 00:00:12 |       |       |  |*  2 |   HASH JOIN                     |           |  3116 |   255K|   963   (3)| 00:00:12 |       |       |  |*  3 |    TABLE ACCESS FULL            | CHANNELS  |     2 |    42 |     3   (0)| 00:00:01 |       |       |  |*  4 |    HASH JOIN                    |           |  6231 |   383K|   960   (3)| 00:00:12 |       |       |  |   5 |     PART JOIN FILTER CREATE     | :BF0000   |   183 |  2928 |    18   (0)| 00:00:01 |       |       |  |*  6 |      TABLE ACCESS FULL          | TIMES     |   183 |  2928 |    18   (0)| 00:00:01 |       |       |  |*  7 |     HASH JOIN                   |           | 49822 |  2286K|   941   (3)| 00:00:12 |       |       |  |*  8 |      TABLE ACCESS FULL          | CUSTOMERS |   383 |  9958 |   406   (1)| 00:00:05 |       |       |  |   9 |      PARTITION RANGE JOIN-FILTER|           |   918K|    18M|   530   (3)| 00:00:07 |:BF0000|:BF0000|  |  10 |       TABLE ACCESS FULL         | SALES     |   918K|    18M|   530   (3)| 00:00:07 |:BF0000|:BF0000|  -------------------------------------------------------------------------------------------------------------    Predicate Information (identified by operation id):  ---------------------------------------------------       2 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID")     3 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet')     4 - access("S"."TIME_ID"="T"."TIME_ID")     6 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DESC"='1999-Q2')     7 - access("S"."CUST_ID"="C"."CUST_ID")     8 - filter("C"."CUST_STATE_PROVINCE"='CA')  

發現上面的執行計劃與三個表連接時都是全表掃描,下面打開星形轉換後再執行同樣的SQL語句:

SYS@orclasm > alter system set star_transformation_enabled=true;    System altered.    SH@orclasm > SH@orclasm > set autotr trace exp;  SH@orclasm > SELECT CH.CHANNEL_CLASS,    2         C.CUST_CITY,    3         T.CALENDAR_QUARTER_DESC,    4         SUM(S.AMOUNT_SOLD) SALES_AMOUNT    FROM SALES S, TIMES T, CUSTOMERS C, CHANNELS CH   WHERE S.TIME_ID = T.TIME_ID     AND S.CUST_ID = C.CUST_ID     AND S.CHANNEL_ID = CH.CHANNEL_ID     AND C.CUST_STATE_PROVINCE = 'CA'     AND CH.CHANNEL_DESC IN ('Internet', 'Catalog')     AND T.CALENDAR_QUARTER_DESC IN ('1999-Q1', '1999-Q2')   GROUP BY CH.CHANNEL_CLASS, C.CUST_CITY, T.CALENDAR_QUARTER_DESC;    Execution Plan  ----------------------------------------------------------    Plan hash value: 478886510    ----------------------------------------------------------------------------------------------------------------------------------  | Id  | Operation                           | Name                       | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |  ----------------------------------------------------------------------------------------------------------------------------------  |   0 | SELECT STATEMENT                    |                            |   253 | 20493 |   556   (1)| 00:00:07 |       |       |  |   1 |  TEMP TABLE TRANSFORMATION          |                            |       |       |            |          |       |       |  |   2 |   LOAD AS SELECT                    | SYS_TEMP_0FD9D660D_466CBF7 |       |       |            |          |       |       |  |*  3 |    TABLE ACCESS FULL                | CUSTOMERS                  |   383 |  9958 |   406   (1)| 00:00:05 |       |       |  |   4 |   HASH GROUP BY                     |                            |   253 | 20493 |   150   (2)| 00:00:02 |       |       |  |*  5 |    HASH JOIN                        |                            |   253 | 20493 |   149   (2)| 00:00:02 |       |       |  |*  6 |     TABLE ACCESS FULL               | CHANNELS                   |     2 |    42 |     3   (0)| 00:00:01 |       |       |  |*  7 |     HASH JOIN                       |                            |   253 | 15180 |   146   (2)| 00:00:02 |       |       |  |*  8 |      HASH JOIN                      |                            |   253 | 11385 |   143   (1)| 00:00:02 |       |       |  |*  9 |       TABLE ACCESS FULL             | TIMES                      |   183 |  2928 |    18   (0)| 00:00:01 |       |       |  |  10 |       VIEW                          | VW_ST_62EEF96F             |   254 |  7366 |   125   (1)| 00:00:02 |       |       |  |  11 |        NESTED LOOPS                 |                            |   254 | 14478 |   101   (0)| 00:00:02 |       |       |  |  12 |         PARTITION RANGE SUBQUERY    |                            |   254 |  7117 |    55   (2)| 00:00:01 |KEY(SQ)|KEY(SQ)|  |  13 |          BITMAP CONVERSION TO ROWIDS|                            |   254 |  7117 |    55   (2)| 00:00:01 |       |       |  |  14 |           BITMAP AND                |                            |       |       |            |          |       |       |  |  15 |            BITMAP MERGE             |                            |       |       |            |          |       |       |  |  16 |             BITMAP KEY ITERATION    |                            |       |       |            |          |       |       |  |  17 |              BUFFER SORT            |                            |       |       |            |          |       |       |  |* 18 |               TABLE ACCESS FULL     | CHANNELS                   |     2 |    26 |     3   (0)| 00:00:01 |       |       |  |* 19 |              BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX          |       |       |            |          |KEY(SQ)|KEY(SQ)|  |  20 |            BITMAP MERGE             |                            |       |       |            |          |       |       |  |  21 |             BITMAP KEY ITERATION    |                            |       |       |            |          |       |       |  |  22 |              BUFFER SORT            |                            |       |       |            |          |       |       |  |* 23 |               TABLE ACCESS FULL     | TIMES                      |   183 |  2928 |    18   (0)| 00:00:01 |       |       |  |* 24 |              BITMAP INDEX RANGE SCAN| SALES_TIME_BIX             |       |       |            |          |KEY(SQ)|KEY(SQ)|  |  25 |            BITMAP MERGE             |                            |       |       |            |          |       |       |  |  26 |             BITMAP KEY ITERATION    |                            |       |       |            |          |       |       |  |  27 |              BUFFER SORT            |                            |       |       |            |          |       |       |  |  28 |               TABLE ACCESS FULL     | SYS_TEMP_0FD9D660D_466CBF7 |   383 |  1915 |     2   (0)| 00:00:01 |       |       |  |* 29 |              BITMAP INDEX RANGE SCAN| SALES_CUST_BIX             |       |       |            |          |KEY(SQ)|KEY(SQ)|  |  30 |         TABLE ACCESS BY USER ROWID  | SALES                      |     1 |    29 |    70   (0)| 00:00:01 | ROWID | ROWID |  |  31 |      TABLE ACCESS FULL              | SYS_TEMP_0FD9D660D_466CBF7 |   383 |  5745 |     2   (0)| 00:00:01 |       |       |  ----------------------------------------------------------------------------------------------------------------------------------    Predicate Information (identified by operation id):  ---------------------------------------------------       3 - filter("C"."CUST_STATE_PROVINCE"='CA')     5 - access("ITEM_1"="CH"."CHANNEL_ID")     6 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet')     7 - access("ITEM_2"="C0")     8 - access("ITEM_3"="T"."TIME_ID")     9 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DESC"='1999-Q2')    18 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet')    19 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID")    23 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DESC"='1999-Q2')    24 - access("S"."TIME_ID"="T"."TIME_ID")    29 - access("S"."CUST_ID"="C0")    Note  -----     - star transformation used for this statement  

其實經過星形轉換後,Oracle對上面的SQL語句進行了查詢重寫。以上SQL需要在相關的事實表上有點陣圖索引,否則依然不能使用星型轉換:

create bitmap index cust_id_ind on sh.sales(cust_id) local;  create bitmap index time_id_ind on sh.sales(time_id) local;  create bitmap index channel_id_ind on sh.sales(channel_id) local;  

本文選自《Oracle程式設計師面試筆試寶典》,作者:李華榮。