Learn performance optimization skills from TPCH tests - Q11
I Query Requirement
Q11 queries the value of parts supplied by a certain country in the inventory.
Q11 is characterized by multi-table join query operation with grouping, sorting, aggregation and sub-query operation. Subqueries are located in the HAVING condition of grouping operations.
II Oracle Execution
The query SQL written in Oracle is as follows:
select /*+ parallel(n) */
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'CHINA'
group by
ps_partkey
having
sum(ps_supplycost * ps_availqty) > (
select
sum(ps_supplycost * ps_availqty) * 0.000001
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'CHINA'
)
order by
value desc;
Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.
Script execution time, Unit: seconds
Number of parallel |
1 |
2 |
4 |
8 |
12 |
Oracle |
71 |
52 |
43 |
36 |
33 |
III SPL Optimization
Let's regard the following sub-query as view V:
select
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'CHINA'
Then the original primary query equals to
select
ps_partkey,
value
from V
where value>0.000001*(select sum(value) from V)
This V is already a grouped result with a relative small amount, so the calculation of traversing V is much less than that of traversing partsuppand directly.
And we also know that the partsupp table is ordered by the primary key ps_partkey and ps_suppkey, that is, ordered to ps_partkey. The ordered grouping method can be used here to group by this field to improve the performance of computing V.
The SPL script is as follows:
A |
|
1 |
=now() |
2 |
>name="CHINA" |
3 |
>percent=0.000001 |
4 |
=file("nation.btx").import@b().select(N_NAME== name).derive@o().keys@i(N_NATIONKEY) |
5 |
=file("supplier.ctx").open().cursor@m(S_SUPPKEY;A4.find(S_NATIONKEY)).fetch().keys@i(S_SUPPKEY) |
6 |
=file("partsupp.ctx").open().cursor@m(PS_PARTKEY,PS_AVAILQTY,PS_SUPPLYCOST;A5.find(PS_SUPPKEY)) |
7 |
=A6.groups@o(PS_PARTKEY;sum(PS_SUPPLYCOST*PS_AVAILQTY):value) |
8 |
=A7.sum(value)*percent |
9 |
=A7.select(value>A8).sort@z(value) |
10 |
return interval@ms(A1,now()) |
A7 performs ordered grouping with groups@o, which is equivalent to calculating view V. Then A8 and A9 traverse A7 twice to calculate the result.
Script execution time, Unit: seconds
Number of parallel |
1 |
2 |
4 |
8 |
12 |
Oracle |
71 |
52 |
43 |
36 |
33 |
SPL composite table |
24 |
15 |
9 |
6 |
5 |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese Version