Learn performance optimization skills from TPCH tests - Q19
I. Query Requirement
Q19 queries the total discounted revenue for three different types of parts that are shipped by air or delivered in person. Parts are selected based on specific brands, containers and size range.
Q19 is characterized by three table join operations with aggregation and IN sub-query.
II. Oracle Execution
The query SQL written in Oracle is as follows:
select /*+ parallel(n) */
sum(l_extendedprice * (1 - l_discount)) as revenue
from
lineitem,part
where
(
p_partkey = l_partkey
and p_brand = 'Brand#32'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
and l_quantity >= 7 and l_quantity <= 7 + 10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#23'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
and l_quantity >= 18 and l_quantity <= 18 + 10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#45'
and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
and l_quantity >= 22 and l_quantity <= 22 + 10
and p_size between 1 and 15
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
);
Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.
Script execution time, Unit: seconds
Number of parallel |
1 |
2 |
4 |
8 |
12 |
Oracle |
395 |
277 |
204 |
176 |
154 |
III. SPL Optimization
This query is a filtering query on two joined tables, which is simple in structure, but along with complex filtering conditions. We have noticed that many items in conditional expressions are only related to the smaller table PART, and the computational complexity of these items is relatively high (with IN operation). If these items are calculated based on the PART table in advance, the computational complexity will be much less than that based on the join result of two tables, because the join result is the same size as the lineitem table, which is much larger.
The SPL script is as follows:
A |
|
1 |
=now() |
2 |
>brand1="Brand#32" |
3 |
>brand2="Brand#23" |
4 |
>brand3="Brand#45" |
5 |
>quantity1=7 |
6 |
>quantity2=18 |
7 |
>quantity3=22 |
8 |
=["SM CASE", "SM BOX", "SM PACK", "SM PKG"] |
9 |
=["MED BAG", "MED BOX", "MED PKG", "MED PACK"] |
10 |
=["LG CASE", "LG BOX", "LG PACK", "LG PKG"] |
11 |
=["AIR","AIR REG"] |
12 |
=file("part.ctx").open().cursor@m(P_PARTKEY,P_BRAND,P_SIZE,P_CONTAINER,0:FLAG;P_SIZE>=1) |
13 |
=A12.run(FLAG=if(P_BRAND==brand1 && A8.contain(P_CONTAINER) && P_SIZE<=5:1, P_BRAND==brand2 && A9.contain(P_CONTAINER) && P_SIZE<=10:2, P_BRAND==brand3 && A10.contain(P_CONTAINER) && P_SIZE<=15:3; 0)).select(FLAG>0).fetch().derive@o().keys@i(P_PARTKEY) |
14 |
=file("lineitem.ctx").open().cursor@m(L_PARTKEY,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT;L_PARTKEY:A13,L_SHIPINSTRUCT=="DELIVER IN PERSON" && A11.contain(L_SHIPMODE) && L_QUANTITY>= quantity1 && L_QUANTITY<=quantity3+10) |
15 |
=A14.select(case(L_PARTKEY.FLAG,1:L_QUANTITY<=quantity1+10,2:L_QUANTITY>=quantity2 && L_QUANTITY<=quantity2+10;L_QUANTITY>=quantity3)) |
16 |
=A15.total(sum(L_EXTENDEDPRICE*(1-L_DISCOUNT))) |
17 |
return interval@ms(A1,now()) |
A13 generates a new FLAG field in the PART table, which is used to calculate the complex conditions related to PART table. When the records satisfy the OR condition 1, 2 and 3 in SQL, assign them as 1, 2 and 3 respectively; if none of them satisfies, assign it as 0, and then filter them based on whether the FLAG value is larger than 0. In A15, the value of FLAG is again used to select which quantity value to filter L_QUANTITY. FLAG-related complex formulas only need to calculate the row number of the PART table.
Script execution time, Unit: seconds
Number of parallel |
1 |
2 |
4 |
8 |
12 |
Oracle |
395 |
277 |
204 |
176 |
154 |
SPL composite table |
158 |
80 |
40 |
21 |
15 |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL