一个标量子查询联想到with as改写方案

原创
2016-06-07 16:36:53 1005浏览

群里有朋友提到下列sql语句如何避免反复扫描t_dict表 select o.order_id, (SELECT DICT_NAME FROM T_DICT WHERE DICT_TYPEID = 'ORDERSTAT' AND DICT_ID = o.STAT) AS STAT, (SELECT DICT_NAME FROM T_DICT WHERE DICT_TYPEID = 'PAYSTAT' AND DICT_ID = o.P

群里有朋友提到下列sql语句如何避免反复扫描t_dict表
select o.order_id,
(SELECT DICT_NAME
FROM T_DICT
WHERE DICT_TYPEID = 'ORDERSTAT'
AND DICT_ID = o.STAT) AS STAT,
(SELECT DICT_NAME
FROM T_DICT
WHERE DICT_TYPEID = 'PAYSTAT'
AND DICT_ID = o.PAY_STAT) AS PAY_STAT,
(SELECT DICT_NAME
FROM T_DICT
WHERE DICT_TYPEID = 'ACTSTAT'
AND DICT_ID = o.ACT_STAT) AS ACT_STAT
FROM T_ORDER o

另一个朋友提到用下列sql语句来避免反复扫描子表t_dict
/* Formatted on 2014/6/12 17:57:18 (QP5 v5.149.1003.31008) */
WITH vts1
AS (SELECT o.order_id, d.DICT_NAME, d.DICT_TYPEID
FROM T_DICT d, T_ORDER o
WHERE ( d.DICT_ID = o.STAT
OR d.DICT_ID = o.PAY_STAT
OR d.DICT_ID = o.ACT_STAT)
AND d.DICT_TYPEID IN ('ORDERSTAT', 'PAYSTAT', 'ACTSTAT'))
SELECT order_id,
(SELECT DICT_NAME
FROM vts1
WHERE DICT_TYPEID = 'ORDERSTAT')
AS stat,
(SELECT DICT_NAME
FROM vts1
WHERE DICT_TYPEID = 'PAYSTAT')
AS stat,
(SELECT DICT_NAME
FROM vts1
WHERE DICT_TYPEID = 'ACTSTAT')
AS stat
FROM vts1;

上面的sql是一个标量子查询,而小鱼手动构造了下列表t01和t02
SQL> create table t01 as select * from dba_objects;

Table created.
SQL> create table t02 as select * from dba_objects;

Table created.
SQL> alter table t01 add id number;

Table altered.
SQL> update t01 set id=object_id;

50328 rows updated.

SQL> commit;

SQL> SELECT t01.object_id,
2 (SELECT object_name
3 FROM t02
4 WHERE t02.object_type = 'TABLE' AND t02.object_id = t01.object_id)
5 col1,
6 (SELECT object_name
7 FROM t02
8 WHERE t02.object_type = 'INDEX'
9 AND t02.object_id = t01.data_object_id)
10 col2,
11 (SELECT object_name
12 FROM t02
13 WHERE t02.object_type = 'VIEW' AND t02.object_id = t01.id)
14 col3
15 FROM t01;

50328 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3013675264

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 46210 | 1759K| 155 (2)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T02 | 20 | 1800 | 17 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| T02 | 18 | 1620 | 18 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T02 | 33 | 2970 | 7 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| T01 | 46210 | 1759K| 155 (2)| 00:00:02 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T02"."OBJECT_TYPE"='TABLE' AND "T02"."OBJECT_ID"=:B1)
2 - filter("T02"."OBJECT_TYPE"='INDEX' AND "T02"."OBJECT_ID"=:B1)
3 - filter("T02"."OBJECT_TYPE"='VIEW' AND "T02"."OBJECT_ID"=:B1)

Note
-----
- dynamic sampling used for this statement

Statistics
----------------------------------------------------------
92 recursive calls
0 db block gets
72905089 consistent gets
0 physical reads
0 redo size
1241710 bytes sent via SQL*Net to client
37397 bytes received via SQL*Net from client
3357 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
50328 rows processed
由于重复值较少,标量子查询的执行成本很高,不要只看上面的cost是155,这个只是一个循环的成本,我们看下逻辑读高达了72905089,一般这种sql我们优先想到的是将标量子查询写成外连接

SQL> SELECT t01.object_id,
2 a.object_name,
3 b.object_name,
4 c.object_name
5 FROM t01,
6 (SELECT object_name, object_id
7 FROM t02
8 WHERE t02.object_type = 'TABLE') a,
9 (SELECT object_name, object_id
10 FROM t02
11 WHERE t02.object_type = 'INDEX') b,
12 (SELECT object_name, object_id
13 FROM t02
14 WHERE t02.object_type = 'VIEW') c
15 WHERE t01.object_id = a.object_id(+)
16 AND t01.data_object_id = b.object_id(+)
17 AND t01.id = c.object_id(+);

50328 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3250688725

--------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

--------------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 46210 | 13M| 620 (2)| 00:00:08 |

|* 1 | HASH JOIN RIGHT OUTER | | 46210 | 13M| 620 (2)| 00:00:08 |

|* 2 | TABLE ACCESS FULL | T02 | 3262 | 286K| 154 (1)| 00:00:02 |

|* 3 | HASH JOIN RIGHT OUTER | | 46210 | 9882K| 465 (2)| 00:00:06 |

|* 4 | TABLE ACCESS FULL | T02 | 2032 | 178K| 154 (1)| 00:00:02 |

|* 5 | HASH JOIN RIGHT OUTER| | 46210 | 5821K| 310 (2)| 00:00:04 |

|* 6 | TABLE ACCESS FULL | T02 | 1845 | 162K| 154 (1)| 00:00:02 |

| 7 | TABLE ACCESS FULL | T01 | 46210 | 1759K| 155 (2)| 00:00:02 |

--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T01"."ID"="OBJECT_ID"(+))
2 - filter("T02"."OBJECT_TYPE"(+)='VIEW')
3 - access("T01"."OBJECT_ID"="OBJECT_ID"(+))
4 - filter("T02"."OBJECT_TYPE"(+)='TABLE')
5 - access("T01"."DATA_OBJECT_ID"="OBJECT_ID"(+))
6 - filter("T02"."OBJECT_TYPE"(+)='INDEX')

Note
-----
- dynamic sampling used for this statement

Statistics
----------------------------------------------------------
17 recursive calls
0 db block gets
6327 consistent gets
0 physical reads
0 redo size
1241731 bytes sent via SQL*Net to client
37397 bytes received via SQL*Net from client
3357 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
50328 rows processed

改成外连接后,逻辑读降到了6327,sql的相应时间大幅度降低,那么是否还可以写成只扫描一次子查询的表t02了,当然也可以尝试改写成with as的写法,尝试了半天,写出来感觉业务逻辑始终存在问题,而上面最先提出的例子给出的with as的写法,小鱼个人觉得也是有问题的

WITH vts1
AS (SELECT o.order_id, d.DICT_NAME, d.DICT_TYPEID
FROM T_DICT d, T_ORDER o
WHERE ( d.DICT_ID = o.STAT
OR d.DICT_ID = o.PAY_STAT
OR d.DICT_ID = o.ACT_STAT)
AND d.DICT_TYPEID IN ('ORDERSTAT', 'PAYSTAT', 'ACTSTAT'))
我们来看看上面这段构造的表vts1,很明显如果返回超过三行数据,那么下面的标量子查询无疑会报错
SELECT order_id,
(SELECT DICT_NAME
FROM vts1
WHERE DICT_TYPEID = 'ORDERSTAT')
AS stat
FROM vts1;

这里只截取一个查询,换句话说如果vts1临时表的dict_typeid=’ORDERSTAT’返回了多行,那么这个标量子查询肯定会报错。

特别明显的就是如果我们构造一个with as的表,其实数据并不需要绝对的满足with as的条件,因为标量子查询是类似于外连接的,子表不存在满足的数据则补全null,当然如果t02这个表比较大,我们也是可以用with as去优化,类似于物化部分结果集为临时表。
SQL> WITH m AS (SELECT object_name, object_id, object_type
2 FROM t02
3 WHERE object_type IN ('TABLE', 'INDEX', 'VIEW'))
4 SELECT t01.object_id,
5 a.object_name,
6 b.object_name,
7 c.object_name
8 FROM t01,
9 (SELECT object_name, object_id
10 FROM m
11 WHERE m.object_type = 'TABLE') a,
12 (SELECT object_name, object_id
13 FROM m
14 WHERE m.object_type = 'INDEX') b,
15 (SELECT object_name, object_id
16 FROM m
17 WHERE m.object_type = 'VIEW') c
18 WHERE t01.object_id = a.object_id(+)
19 AND t01.data_object_id = b.object_id(+)
20 AND t01.id = c.object_id(+);

50293 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3949065611

--------------------------------------------------------------------------------
------------------------
| Id | Operation | Name | Rows | Bytes |
Cost (%CPU)| Time |
--------------------------------------------------------------------------------
------------------------
| 0 | SELECT STATEMENT | | 50293 | 13M|
343 (3)| 00:00:05 |
| 1 | TEMP TABLE TRANSFORMATION | | | |
| |
| 2 | LOAD AS SELECT | | | |
| |
|* 3 | TABLE ACCESS FULL | T02 | 6856 | 261K|
155 (2)| 00:00:02 |
|* 4 | HASH JOIN RIGHT OUTER | | 50293 | 13M|
188 (3)| 00:00:03 |
|* 5 | VIEW | | 6856 | 602K|
10 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | SYS_TEMP_0FD9D661A_2EC7C2 | 6856 | 261K|
10 (0)| 00:00:01 |
|* 7 | HASH JOIN RIGHT OUTER | | 50293 | 9429K|
177 (3)| 00:00:03 |
|* 8 | VIEW | | 6856 | 602K|
10 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D661A_2EC7C2 | 6856 | 261K|
10 (0)| 00:00:01 |
|* 10 | HASH JOIN RIGHT OUTER | | 50293 | 5009K|
166 (2)| 00:00:02 |
|* 11 | VIEW | | 6856 | 602K|
10 (0)| 00:00:01 |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D661A_2EC7C2 | 6856 | 261K|
10 (0)| 00:00:01 |
| 13 | TABLE ACCESS FULL | T01 | 50293 | 589K|
155 (2)| 00:00:02 |
--------------------------------------------------------------------------------
------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("OBJECT_TYPE"='INDEX' OR "OBJECT_TYPE"='TABLE' OR "OBJECT_TYPE"='V
IEW')
4 - access("T01"."ID"="OBJECT_ID"(+))
5 - filter("M"."OBJECT_TYPE"(+)='VIEW')
7 - access("T01"."DATA_OBJECT_ID"="OBJECT_ID"(+))
8 - filter("M"."OBJECT_TYPE"(+)='INDEX')
10 - access("T01"."OBJECT_ID"="OBJECT_ID"(+))
11 - filter("M"."OBJECT_TYPE"(+)='TABLE')

Statistics
----------------------------------------------------------
2 recursive calls
40 db block gets
4810 consistent gets
33 physical reads
648 redo size
1240850 bytes sent via SQL*Net to client
37364 bytes received via SQL*Net from client
3354 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
50293 rows processed
群里同学给出的那个with as的sql语句,小鱼个人并不赞同。

既然提到了with as,下面我们就来对with as做一个简单的介绍

With as主要体现在增加sql易读性和减少IO成本,增加易读性比较好理解,比如查询业务部分所有人的平均薪水
SQL> select mgr,avg(sal) avg_sal from emp group by mgr;

MGR AVG_SAL
---------- ----------
7839 2758.33333
5000
7782 1300
7698 1310
7902 800
7566 3000
7788 1100

7 rows selected.
SQL> with avgsal as (select mgr,avg(sal) avg_sal from emp group by mgr)
2 select * from avgsal;

7 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4067220884

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6 | 48 | 4 (25)| 00:00:01 |
| 1 | HASH GROUP BY | | 6 | 48 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMP | 14 | 112 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
715 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
7 rows processed

还有一个比较突出的作用就是减少重复部分IO成本,此时oracle会对重复部分构造一个临时表,而后oracle只会扫描这个临时表来减少重复IO带来的资源消耗。

下面我们来构造一个with as的改写的案例
SQL> SELECT COUNT (*) cnt, object_type
2 FROM t01
3 WHERE object_type = 'TABLE'
4 GROUP BY object_type
5 UNION ALL
6 SELECT COUNT (*), object_type
7 FROM t01
8 WHERE object_type = 'INDEX'
9 GROUP BY object_type
10 UNION ALL
11 SELECT COUNT (*), object_type
12 FROM t01
13 WHERE object_type = 'VIEW'
14 GROUP BY object_type;

CNT OBJECT_TYPE
---------- -------------------
1610 TABLE
1721 INDEX
3672 VIEW

Execution Plan
----------------------------------------------------------
Plan hash value: 2988477672

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 27 | 463 (67)| 00:00:06 |
| 1 | UNION-ALL | | | | | |
| 2 | SORT GROUP BY NOSORT| | 1 | 9 | 154 (1)| 00:00:02 |
|* 3 | TABLE ACCESS FULL | T01 | 1734 | 15606 | 154 (1)| 00:00:02 |
| 4 | SORT GROUP BY NOSORT| | 1 | 9 | 154 (1)| 00:00:02 |
|* 5 | TABLE ACCESS FULL | T01 | 1734 | 15606 | 154 (1)| 00:00:02 |
| 6 | SORT GROUP BY NOSORT| | 1 | 9 | 154 (1)| 00:00:02 |
|* 7 | TABLE ACCESS FULL | T01 | 1734 | 15606 | 154 (1)| 00:00:02 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("OBJECT_TYPE"='TABLE')
5 - filter("OBJECT_TYPE"='INDEX')
7 - filter("OBJECT_TYPE"='VIEW')

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2088 consistent gets
0 physical reads
0 redo size
662 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed

SQL> WITH a AS ( SELECT COUNT (*) cnt, object_type
2 FROM t01
3 GROUP BY object_type)
4 SELECT cnt, object_type
5 FROM a
6 WHERE a.object_type = 'TABLE'
7 UNION ALL
8 SELECT cnt, object_type
9 FROM a
10 WHERE a.object_type = 'INDEX'
11 UNION ALL
12 SELECT cnt, object_type
13 FROM a
14 WHERE a.object_type = 'VIEW';

CNT OBJECT_TYPE
---------- -------------------
1610 TABLE
1721 INDEX
3672 VIEW

Execution Plan
----------------------------------------------------------
Plan hash value: 48651815

--------------------------------------------------------------------------------
------------------------
| Id | Operation | Name | Rows | Bytes |
Cost (%CPU)| Time |
--------------------------------------------------------------------------------
------------------------
| 0 | SELECT STATEMENT | | 87 | 2088 |
6 (67)| 00:00:01 |
| 1 | TEMP TABLE TRANSFORMATION | | | |
| |
| 2 | LOAD AS SELECT | | | |
| |
| 3 | HASH GROUP BY | | 29 | 261 |
158 (4)| 00:00:02 |
| 4 | TABLE ACCESS FULL | T01 | 50293 | 442K|
154 (1)| 00:00:02 |
| 5 | UNION-ALL | | | |
| |
|* 6 | VIEW | | 29 | 696 |
2 (0)| 00:00:01 |
| 7 | TABLE ACCESS FULL | SYS_TEMP_0FD9D660F_2EC7C2 | 29 | 261 |
2 (0)| 00:00:01 |
|* 8 | VIEW | | 29 | 696 |
2 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D660F_2EC7C2 | 29 | 261 |
2 (0)| 00:00:01 |
|* 10 | VIEW | | 29 | 696 |
2 (0)| 00:00:01 |
| 11 | TABLE ACCESS FULL | SYS_TEMP_0FD9D660F_2EC7C2 | 29 | 261 |
2 (0)| 00:00:01 |
--------------------------------------------------------------------------------
------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

6 - filter("A"."OBJECT_TYPE"='TABLE')
8 - filter("A"."OBJECT_TYPE"='INDEX')
10 - filter("A"."OBJECT_TYPE"='VIEW')

Statistics
----------------------------------------------------------
102 recursive calls
11 db block gets
719 consistent gets
1 physical reads
1584 redo size
662 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed

这里我们看下with as的执行计划中存在了TEMP TABLE TRANSFORMATION和LOAD AS SELECT,这里是将with as组成的结果集转换为了一个临时表,而后面的查询则可以使用这个临时表,由于这个临时表一般比原来的表或者表连接的数据量小,所以无论IO还是cpu成本都相对原sql语句都减小了。

oracle一般会在重复使用两次的地方自动将with as转化为临时表,这个思想有点类似物化视图的含义,比如有些sql语句我们无法将其优化,将sql语句查询返回的结果集(with as也是如此)物化为一个实体的表,而这个表的访问成本大大小于之前返回该结果集的方式(比如访问该结果集要采取全表扫描、多表关联hash join、nested loop等)

同样即使with as转换的结果集我们只访问一次,也可以加上hint materialize将其构造为临时表。
SQL> WITH a AS ( SELECT /*+materialize*/COUNT (*) cnt, object_type
2 FROM t01
3 GROUP BY object_type)
4 SELECT cnt, object_type
5 FROM a where a.object_type='TABLE';

CNT OBJECT_TYPE
---------- -------------------
1610 TABLE

Execution Plan
----------------------------------------------------------
Plan hash value: 2073943260

--------------------------------------------------------------------------------
------------------------
| Id | Operation | Name | Rows | Bytes |
Cost (%CPU)| Time |
--------------------------------------------------------------------------------
------------------------
| 0 | SELECT STATEMENT | | 29 | 696 |
160 (4)| 00:00:02 |
| 1 | TEMP TABLE TRANSFORMATION | | | |
| |
| 2 | LOAD AS SELECT | | | |
| |
| 3 | HASH GROUP BY | | 29 | 261 |
158 (4)| 00:00:02 |
| 4 | TABLE ACCESS FULL | T01 | 50293 | 442K|
154 (1)| 00:00:02 |
|* 5 | VIEW | | 29 | 696 |
2 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6616_2EC7C2 | 29 | 261 |
2 (0)| 00:00:01 |
--------------------------------------------------------------------------------
------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

5 - filter("A"."OBJECT_TYPE"='TABLE')

Statistics
----------------------------------------------------------
102 recursive calls
10 db block gets
713 consistent gets
1 physical reads
1468 redo size
586 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

生产环境的一个sql语句:
SELECT t1.contactid,
t1.skillid,
t2.turntime starttime,
t2.intime
FROM call.base_contactstate t1,
( SELECT t.contactid,
MIN (t.starttime) turntime,
MAX (t.starttime) intime
FROM call.base_contactstate t
WHERE t.eventid = 806355209 AND t.contactstatus = 2
GROUP BY t.contactid) t2
WHERE t1.contactid = t2.contactid
AND t1.starttime = t2.intime
AND t1.eventid = 806355209
AND t1.contactstatus = 2;

Execution Plan
----------------------------------------------------------
Plan hash value: 2829800415

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 145 | | 76953 (2)| 00:15:24 |
|* 1 | FILTER | | | | | | |
| 2 | HASH GROUP BY | | 1 | 145 | | 76953 (2)| 00:15:24 |
|* 3 | HASH JOIN | | 114K| 15M| 8272K| 76943 (2)| 00:15:24 |
|* 4 | TABLE ACCESS FULL| BASE_CONTACTSTATE | 109K| 6977K| | 38029 (2)| 00:07:37 |
|* 5 | TABLE ACCESS FULL| BASE_CONTACTSTATE | 109K| 8587K| | 38029 (2)| 00:07:37 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T1"."STARTTIME"=MAX("T"."STARTTIME"))
3 - access("T1"."CONTACTID"="T"."CONTACTID")
4 - filter("T"."EVENTID"=806355209 AND "T"."CONTACTSTATUS"=2)
5 - filter("T1"."EVENTID"=806355209 AND "T1"."CONTACTSTATUS"=2)

Statistics
----------------------------------------------------------
666 recursive calls
0 db block gets
324498 consistent gets
336648 physical reads
0 redo size
52016865 bytes sent via SQL*Net to client
397768 bytes received via SQL*Net from client
36118 SQL*Net roundtrips to/from client
21 sorts (memory)
0 sorts (disk)
541747 rows processed

这个sql语句如果想避免base_contactstate的全表扫描很难,但是我们注意到全表扫描了两次base_contactstate表,我们尝试利用下with as物化这个表的部分结果集为临时表来减小IO成本。
WITH t AS (SELECT contactid, skillid, starttime
FROM call.base_contactstate
WHERE eventid = 806355209 AND contactstatus = 2)
SELECT t1.contactid,
t1.skillid,
t2.turntime starttime,
t2.intime
FROM t t1,
( SELECT t.contactid,
MIN (t.starttime) turntime,
MAX (t.starttime) intime
FROM t
GROUP BY t.contactid) t2
WHERE t1.contactid = t2.contactid AND t1.starttime = t2.intime

Execution Plan
----------------------------------------------------------
Plan hash value: 1354327316

------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 183 | | 41280 (2)| 00:08:16 |
| 1 | TEMP TABLE TRANSFORMATION | | | | | | |
| 2 | LOAD AS SELECT | | | | | | |
|* 3 | TABLE ACCESS FULL | BASE_CONTACTSTATE | 109K| 7299K| | 38029 (2)| 00:07:37 |
|* 4 | HASH JOIN | | 1 | 183 | 9984K| 3251 (1)| 00:00:40 |
| 5 | VIEW | | 109K| 8694K| | 227 (2)| 00:00:03 |
| 6 | TABLE ACCESS FULL | SYS_TEMP_0FD9D66ED_3363B5A2 | 109K| 6225K| | 227 (2)| 00:00:03 |
| 7 | VIEW | | 109K| 10M| | 1942 (1)| 00:00:24 |
| 8 | HASH GROUP BY | | 109K| 6977K| 8232K| 1942 (1)| 00:00:24 |
| 9 | VIEW | | 109K| 6977K| | 227 (2)| 00:00:03 |
| 10 | TABLE ACCESS FULL | SYS_TEMP_0FD9D66ED_3363B5A2 | 109K| 6225K| | 227 (2)| 00:00:03 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("EVENTID"=806355209 AND "CONTACTSTATUS"=2)
4 - access("T1"."CONTACTID"="T2"."CONTACTID" AND "T1"."STARTTIME"="T2"."INTIME")

Statistics
----------------------------------------------------------
1937 recursive calls
5572 db block gets
173790 consistent gets
173540 physical reads
1768 redo size
52132593 bytes sent via SQL*Net to client
397768 bytes received via SQL*Net from client
36118 SQL*Net roundtrips to/from client
44 sorts (memory)
0 sorts (disk)
541747 rows processed

这里我们用with as改写后,逻辑读和cost成本都降低了接近一半的样子,上次在群里讨论到sql优化,结果群里的有位朋友居然说到没有技术含量可言,在小鱼看来cbo是oracle最复杂的部分,没有之一。

其实在with as的改写后,oracle的优化器会计算是否用临时表的方式来完成查询,如果临时表的执行计划计算而来的cost较大,oracle还是会选择之前的方式,即使已经改写成了with as的临时表的sql语句

SQL> SELECT COUNT (1)
2 FROM (SELECT s.*
3 FROM call.HF_DM_SAMPLE s
4 INNER JOIN
5 ( SELECT biz_sheet_no bizno, MAX (project_task_times) times
6 FROM call.HF_DM_SAMPLE
7 WHERE project_id = '000000000001'
8 GROUP BY biz_sheet_no) tb_tmp
9 ON tb_tmp.bizno = s.biz_sheet_no
10 AND tb_tmp.times = s.project_task_times
11 AND s.blank0 LIKE '%160173576%') c__;

Plan hash value: 1949914683

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 13469 (1)| 00:02:42 |
| 1 | SORT AGGREGATE | | 1 | | | |
| 2 | VIEW | | 1431 | | 13469 (1)| 00:02:42 |
|* 3 | FILTER | | | | | |
| 4 | HASH GROUP BY | | 1431 | 95877 | 13469 (1)| 00:02:42 |
|* 5 | HASH JOIN | | 10011 | 655K| 13468 (1)| 00:02:42 |
|* 6 | TABLE ACCESS FULL| HF_DM_SAMPLE | 8478 | 314K| 6709 (1)| 00:01:21 |
|* 7 | TABLE ACCESS FULL| HF_DM_SAMPLE | 165K| 4677K| 6757 (2)| 00:01:22 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("S"."PROJECT_TASK_TIMES"=MAX("PROJECT_TASK_TIMES"))
5 - access("BIZ_SHEET_NO"="S"."BIZ_SHEET_NO")
6 - filter("S"."BLANK0" LIKE '%160173576%' AND "S"."BLANK0" IS NOT NULL)
7 - filter("PROJECT_ID"='000000000001')

SQL> WITH a AS (SELECT
biz_sheet_no,
project_task_times,
project_id,
blank0
FROM call.HF_DM_SAMPLE)
SELECT COUNT (1)
FROM (SELECT a.*
FROM a
INNER JOIN
( SELECT biz_sheet_no bizno, MAX (project_task_times) times
FROM a
WHERE a.project_id = '000000000001'
GROUP BY biz_sheet_no) tb_tmp
ON tb_tmp.bizno = a.biz_sheet_no
AND tb_tmp.times = a.project_task_times
AND a.blank0 LIKE '%160173576%') c__;

Plan hash value: 1949914683

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 13469 (1)| 00:02:42 |
| 1 | SORT AGGREGATE | | 1 | | | |
| 2 | VIEW | | 1431 | | 13469 (1)| 00:02:42 |
|* 3 | FILTER | | | | | |
| 4 | HASH GROUP BY | | 1431 | 95877 | 13469 (1)| 00:02:42 |
|* 5 | HASH JOIN | | 10011 | 655K| 13468 (1)| 00:02:42 |
|* 6 | TABLE ACCESS FULL| HF_DM_SAMPLE | 8478 | 314K| 6709 (1)| 00:01:21 |
|* 7 | TABLE ACCESS FULL| HF_DM_SAMPLE | 165K| 4677K| 6757 (2)| 00:01:22 |
---------------------------------------------------------------------------------------

看出这里改写为with as临时表的sql语句后,cbo还是用的之前的执行计划的方式,而我们必须加上hint materialize的优化器才会采用临时表的方式完成查询。

SQL> WITH a AS (SELECT /*+materialize*/
2 biz_sheet_no,
3 project_task_times,
4 project_id,
5 blank0
6 FROM call.HF_DM_SAMPLE)
7 SELECT COUNT (1)
8 FROM (SELECT a.*
9 FROM a
10 INNER JOIN
11 ( SELECT biz_sheet_no bizno, MAX (project_task_times) times
12 FROM a
13 WHERE a.project_id = '000000000001'
14 GROUP BY biz_sheet_no) tb_tmp
15 ON tb_tmp.bizno = a.biz_sheet_no
16 AND tb_tmp.times = a.project_task_times
17 AND a.blank0 LIKE '%160173576%') c__;

Execution Plan
----------------------------------------------------------
Plan hash value: 2209692010

------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 425 | | 14130 (1)| 00:02:50 |
| 1 | TEMP TABLE TRANSFORMATION | | | | | | |
| 2 | LOAD AS SELECT | | | | | | |
| 3 | TABLE ACCESS FULL | HF_DM_SAMPLE | 174K| 6640K| | 6759 (2)| 00:01:22 |
| 4 | SORT AGGREGATE | | 1 | 425 | | | |
|* 5 | HASH JOIN | | 24908 | 10M| 13M| 7371 (1)| 00:01:29 |
| 6 | VIEW | | 143K| 11M| | 3590 (1)| 00:00:44 |
| 7 | HASH GROUP BY | | 143K| 13M| 18M| 3590 (1)| 00:00:44 |
|* 8 | VIEW | | 174K| 15M| | 209 (3)| 00:00:03 |
| 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6603_FFCC105A | 174K| 6640K| | 209 (3)| 00:00:03 |
|* 10 | VIEW | | 174K| 56M| | 209 (3)| 00:00:03 |
| 11 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6603_FFCC105A | 174K| 6640K| | 209 (3)| 00:00:03 |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

5 - access("TB_TMP"."BIZNO"="A"."BIZ_SHEET_NO" AND "TB_TMP"."TIMES"="A"."PROJECT_TASK_TIMES")
8 - filter("A"."PROJECT_ID"='000000000001')
10 - filter("A"."BLANK0" LIKE '%160173576%' AND "A"."BLANK0" IS NOT NULL)

在优化sql的时候,我们一旦看见比如重复的大表扫描、或者表连接等,要联想到是否能够物化这个结果集为一个临时表,关于with as的介绍就到这里了,有好的案例和优化文章会继续贴出来于大家分享。

声明:本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn核实处理。