【DB笔试面试591】在Oracle中,什么是星型连接(Star Join)?

  • 2019 年 10 月 10 日
  • 笔记

题目部分

在Oracle中,什么是星型连接(Star Join)?

答案部分

星型连接(Star Join)通常用于数据仓库类型的应用,它是一种单个事实表(Fact Table)和多个维度表(Dimension Table)之间的连接。星型连接的各维度表之间没有直接的关联条件,其事实表和各维度表之间是基于事实表的外键列和对应维度表的主键列之间的连接,并且通常在事实表的外键列上还会存在对应的位图索引。星型转换可以有效改善大的事实表与多个具有良好选择率的维度表间连接的查询,从而有效避免了全表扫描的性能窘境。

星型转换由初始化参数STAR_TRANSFORMATION_ENABLED控制,该参数可以有三种选项:

① TRUE:CBO优化器自动识别语句中的事实表和约束维度表并进行星型转换。CBO优化器需要确定转换后的执行计划成本要低于不转换的执行计划.如果利用物化的临时表性能更高,那么CBO优化器还会尝试利用物化的临时表。

② FALSE:优化器不会考虑星型转换,为参数STAR_TRANSFORMATION_ENABLED的默认值。

③ TEMP_DISABLE:当一个维度表超过100个块时,如果简单地设置STAR_TRANSFORMATION_ENABLED为TRUE来启用星型变换,那么会话会创建一个内存中的全局临时表(Global Temporary Table)来保存已过滤的维度数据,这在过去会造成很多问题。这里所说的100个块其实是由隐含参数“_temp_tran_block_threshold”(number of blocks for a dimension before we temp transform)来控制的。此外,隐含参数“_temp_tran_cache”(determines if temp table is created with cache option,默认为TRUE)决定了这类临时表是否被缓存住。为了避免创建全局临时表可能带来的问题,就可以用到TEMP_DISABLE这个禁用临时表的选项,让优化器不再考虑使用物化的临时表。

参数STAR_TRANSFORMATION_ENABLED的默认值为FALSE,因为星型转换适用的场景是数据仓库环境中具有星型模型的模式,而且需要事实表的各个连接列上均有良好的索引时才能发挥其优势。如果能确定以上因素,那么就可以使用星型转换。

下面给出一个示例:

SYS@orclasm > show parameter star_transformation_enabled    NAME                                 TYPE        VALUE  ------------------------------------ ----------- ------------------------------  star_transformation_enabled          string      FALSE    SYS@orclasm > conn sh/sh  Connected.  SH@orclasm > set autotr trace exp;  SH@orclasm > SELECT CH.CHANNEL_CLASS,    2         C.CUST_CITY,    3         T.CALENDAR_QUARTER_DESC,    4         SUM(S.AMOUNT_SOLD) SALES_AMOUNT    5    FROM SALES S, TIMES T, CUSTOMERS C, CHANNELS CH    6   WHERE S.TIME_ID = T.TIME_ID    7     AND S.CUST_ID = C.CUST_ID    8     AND S.CHANNEL_ID = CH.CHANNEL_ID    9     AND C.CUST_STATE_PROVINCE = 'CA'   10     AND CH.CHANNEL_DESC IN ('Internet', 'Catalog')   11     AND T.CALENDAR_QUARTER_DESC IN ('1999-Q1', '1999-Q2')   12   GROUP BY CH.CHANNEL_CLASS, C.CUST_CITY, T.CALENDAR_QUARTER_DESC;    Execution Plan  ----------------------------------------------------------    Plan hash value: 593420798    -------------------------------------------------------------------------------------------------------------  | Id  | Operation                       | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |  -------------------------------------------------------------------------------------------------------------  |   0 | SELECT STATEMENT                |           |   572 | 48048 |   964   (3)| 00:00:12 |       |       |  |   1 |  HASH GROUP BY                  |           |   572 | 48048 |   964   (3)| 00:00:12 |       |       |  |*  2 |   HASH JOIN                     |           |  3116 |   255K|   963   (3)| 00:00:12 |       |       |  |*  3 |    TABLE ACCESS FULL            | CHANNELS  |     2 |    42 |     3   (0)| 00:00:01 |       |       |  |*  4 |    HASH JOIN                    |           |  6231 |   383K|   960   (3)| 00:00:12 |       |       |  |   5 |     PART JOIN FILTER CREATE     | :BF0000   |   183 |  2928 |    18   (0)| 00:00:01 |       |       |  |*  6 |      TABLE ACCESS FULL          | TIMES     |   183 |  2928 |    18   (0)| 00:00:01 |       |       |  |*  7 |     HASH JOIN                   |           | 49822 |  2286K|   941   (3)| 00:00:12 |       |       |  |*  8 |      TABLE ACCESS FULL          | CUSTOMERS |   383 |  9958 |   406   (1)| 00:00:05 |       |       |  |   9 |      PARTITION RANGE JOIN-FILTER|           |   918K|    18M|   530   (3)| 00:00:07 |:BF0000|:BF0000|  |  10 |       TABLE ACCESS FULL         | SALES     |   918K|    18M|   530   (3)| 00:00:07 |:BF0000|:BF0000|  -------------------------------------------------------------------------------------------------------------    Predicate Information (identified by operation id):  ---------------------------------------------------       2 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID")     3 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet')     4 - access("S"."TIME_ID"="T"."TIME_ID")     6 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DESC"='1999-Q2')     7 - access("S"."CUST_ID"="C"."CUST_ID")     8 - filter("C"."CUST_STATE_PROVINCE"='CA')  

发现上面的执行计划与三个表连接时都是全表扫描,下面打开星形转换后再执行同样的SQL语句:

SYS@orclasm > alter system set star_transformation_enabled=true;    System altered.    SH@orclasm > SH@orclasm > set autotr trace exp;  SH@orclasm > SELECT CH.CHANNEL_CLASS,    2         C.CUST_CITY,    3         T.CALENDAR_QUARTER_DESC,    4         SUM(S.AMOUNT_SOLD) SALES_AMOUNT    FROM SALES S, TIMES T, CUSTOMERS C, CHANNELS CH   WHERE S.TIME_ID = T.TIME_ID     AND S.CUST_ID = C.CUST_ID     AND S.CHANNEL_ID = CH.CHANNEL_ID     AND C.CUST_STATE_PROVINCE = 'CA'     AND CH.CHANNEL_DESC IN ('Internet', 'Catalog')     AND T.CALENDAR_QUARTER_DESC IN ('1999-Q1', '1999-Q2')   GROUP BY CH.CHANNEL_CLASS, C.CUST_CITY, T.CALENDAR_QUARTER_DESC;    Execution Plan  ----------------------------------------------------------    Plan hash value: 478886510    ----------------------------------------------------------------------------------------------------------------------------------  | Id  | Operation                           | Name                       | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |  ----------------------------------------------------------------------------------------------------------------------------------  |   0 | SELECT STATEMENT                    |                            |   253 | 20493 |   556   (1)| 00:00:07 |       |       |  |   1 |  TEMP TABLE TRANSFORMATION          |                            |       |       |            |          |       |       |  |   2 |   LOAD AS SELECT                    | SYS_TEMP_0FD9D660D_466CBF7 |       |       |            |          |       |       |  |*  3 |    TABLE ACCESS FULL                | CUSTOMERS                  |   383 |  9958 |   406   (1)| 00:00:05 |       |       |  |   4 |   HASH GROUP BY                     |                            |   253 | 20493 |   150   (2)| 00:00:02 |       |       |  |*  5 |    HASH JOIN                        |                            |   253 | 20493 |   149   (2)| 00:00:02 |       |       |  |*  6 |     TABLE ACCESS FULL               | CHANNELS                   |     2 |    42 |     3   (0)| 00:00:01 |       |       |  |*  7 |     HASH JOIN                       |                            |   253 | 15180 |   146   (2)| 00:00:02 |       |       |  |*  8 |      HASH JOIN                      |                            |   253 | 11385 |   143   (1)| 00:00:02 |       |       |  |*  9 |       TABLE ACCESS FULL             | TIMES                      |   183 |  2928 |    18   (0)| 00:00:01 |       |       |  |  10 |       VIEW                          | VW_ST_62EEF96F             |   254 |  7366 |   125   (1)| 00:00:02 |       |       |  |  11 |        NESTED LOOPS                 |                            |   254 | 14478 |   101   (0)| 00:00:02 |       |       |  |  12 |         PARTITION RANGE SUBQUERY    |                            |   254 |  7117 |    55   (2)| 00:00:01 |KEY(SQ)|KEY(SQ)|  |  13 |          BITMAP CONVERSION TO ROWIDS|                            |   254 |  7117 |    55   (2)| 00:00:01 |       |       |  |  14 |           BITMAP AND                |                            |       |       |            |          |       |       |  |  15 |            BITMAP MERGE             |                            |       |       |            |          |       |       |  |  16 |             BITMAP KEY ITERATION    |                            |       |       |            |          |       |       |  |  17 |              BUFFER SORT            |                            |       |       |            |          |       |       |  |* 18 |               TABLE ACCESS FULL     | CHANNELS                   |     2 |    26 |     3   (0)| 00:00:01 |       |       |  |* 19 |              BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX          |       |       |            |          |KEY(SQ)|KEY(SQ)|  |  20 |            BITMAP MERGE             |                            |       |       |            |          |       |       |  |  21 |             BITMAP KEY ITERATION    |                            |       |       |            |          |       |       |  |  22 |              BUFFER SORT            |                            |       |       |            |          |       |       |  |* 23 |               TABLE ACCESS FULL     | TIMES                      |   183 |  2928 |    18   (0)| 00:00:01 |       |       |  |* 24 |              BITMAP INDEX RANGE SCAN| SALES_TIME_BIX             |       |       |            |          |KEY(SQ)|KEY(SQ)|  |  25 |            BITMAP MERGE             |                            |       |       |            |          |       |       |  |  26 |             BITMAP KEY ITERATION    |                            |       |       |            |          |       |       |  |  27 |              BUFFER SORT            |                            |       |       |            |          |       |       |  |  28 |               TABLE ACCESS FULL     | SYS_TEMP_0FD9D660D_466CBF7 |   383 |  1915 |     2   (0)| 00:00:01 |       |       |  |* 29 |              BITMAP INDEX RANGE SCAN| SALES_CUST_BIX             |       |       |            |          |KEY(SQ)|KEY(SQ)|  |  30 |         TABLE ACCESS BY USER ROWID  | SALES                      |     1 |    29 |    70   (0)| 00:00:01 | ROWID | ROWID |  |  31 |      TABLE ACCESS FULL              | SYS_TEMP_0FD9D660D_466CBF7 |   383 |  5745 |     2   (0)| 00:00:01 |       |       |  ----------------------------------------------------------------------------------------------------------------------------------    Predicate Information (identified by operation id):  ---------------------------------------------------       3 - filter("C"."CUST_STATE_PROVINCE"='CA')     5 - access("ITEM_1"="CH"."CHANNEL_ID")     6 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet')     7 - access("ITEM_2"="C0")     8 - access("ITEM_3"="T"."TIME_ID")     9 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DESC"='1999-Q2')    18 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet')    19 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID")    23 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DESC"='1999-Q2')    24 - access("S"."TIME_ID"="T"."TIME_ID")    29 - access("S"."CUST_ID"="C0")    Note  -----     - star transformation used for this statement  

其实经过星形转换后,Oracle对上面的SQL语句进行了查询重写。以上SQL需要在相关的事实表上有位图索引,否则依然不能使用星型转换:

create bitmap index cust_id_ind on sh.sales(cust_id) local;  create bitmap index time_id_ind on sh.sales(time_id) local;  create bitmap index channel_id_ind on sh.sales(channel_id) local;  

本文选自《Oracle程序员面试笔试宝典》,作者:李华荣。