Learn performance optimization skills from TPCH tests - Q10
I Query Requirement
Q10 is to query the customers with problematic freight and the losses caused by it within three months from a certain time in each country.
Q10 is characterized by multi-table join query operation with grouping, sorting and aggregation operations. Query statements do not grammatically limit how many tuples are returned, but according to the TPC-H standard, only the first 10 rows of the query results are returned(usually depending on the application).
II Oracle Execution
The query SQL written in Oracle is as follows:
select * from (
select /*+ parallel(n) */
c_custkey,c_name,
sum(l_extendedprice * (1 - l_discount)) as revenue,
c_acctbal,n_name,c_address,c_phone,c_comment
from
customer,orders,lineitem,nation
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate >= date '1993-05-01'
and o_orderdate < date '1993-05-01' + interval '3' month
and l_returnflag = 'R'
and c_nationkey = n_nationkey
group by
c_custkey,
c_name,
c_acctbal,
c_phone,
n_name,
c_address,
c_comment
order by
revenue desc
) where rownum <=10;
Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.
Script execution time, Unit: seconds
Number of parallel |
1 |
2 |
4 |
8 |
12 |
Oracle |
591 |
399 |
313 |
237 |
215 |
III SPL Optimization
The optimization principle of JOIN between orders and lineitem primary-sub tables here is similar to that in Q3.
The SPL script is as follows:
A |
|
1 |
=now() |
2 |
1993-5-1 |
3 |
=elapse@m(A2,3) |
4 |
=file("orders.ctx").open().cursor@m(O_ORDERKEY,O_CUSTKEY,O_ORDERDATE;O_ORDERDATE>=A2 && O_ORDERDATE<A3) |
5 |
=file("lineitem.ctx").open().news(A4,L_ORDERKEY,L_EXTENDEDPRICE,L_DISCOUNT,L_RETURNFLAG,O_CUSTKEY,O_ORDERDATE;L_RETURNFLAG=="R") |
6 |
=A5.groups@u(O_CUSTKEY:c_custkey;sum(L_EXTENDEDPRICE*(1-L_DISCOUNT)):revenue) |
7 |
=A6.top(-10;revenue) |
8 |
=file("nation.btx").import@b().keys@i(N_NATIONKEY) |
9 |
=file("customer.ctx").open() |
10 |
=A7.joinx@q(c_custkey,A9:C_CUSTKEY,C_NAME,C_ADDRESS,C_NATIONKEY,C_PHONE,C_ACCTBAL,C_COMMENT).fetch() |
11 |
=A10.switch(C_NATIONKEY,A8:N_NATIONKEY) |
12 |
=A11.new(c_custkey:C_CUSTKEY,C_NAME,revenue,C_ACCTBAL,C_NATIONKEY.N_NAME,C_ADDRESS,C_PHONE,C_COMMENT) |
13 |
=A12.sort@z(revenue) |
14 |
return interval@ms(A1,now()) |
First, get the grouping and aggregation operation on the join result of orders and lineitem tables done, then perform the related foreign key table join operation based on the intermediate result. The latter join calculation should not be done before grouping, instead, it can be limited to the required 10 records, otherwise the amount of calculation will increase.
After calculating A7, because the customer table is ordered to C_CUSTKEY, the relevant records can be quickly retrieved from the customer table by orderly matching in A7.cursor(). joinx@q, and then next JOIN can be done without traversing the customer table to reduce the amount of data reading.
Script execution time, Unit: seconds
Number of parallel |
1 |
2 |
4 |
8 |
12 |
Oracle |
591 |
399 |
313 |
237 |
215 |
SPL composite table |
108 |
61 |
36 |
23 |
21 |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese Version