Performance Optimization Exercises Using TPC-H – Q14
Ⅰ SQL code and analysis
Below is the SQL query statement:
select
    100.00 * sum(
        case when p_type like 'PROMO%' then l_extendedprice * (1 - l_discount)
        else 0 end)
    / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from
    lineitem,
    part
where
    l_partkey = p_partkey
    and l_shipdate >= date '1995-04-01'
    and l_shipdate < date '1995-04-01' + interval '1' month;
This query aggregates the filtered result of a two-table association (join): lineitem is filtered by ship date, associated with part through the part key, and then summed.
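To make the semantics of the query concrete, here is a minimal Python sketch of the same computation on in-memory rows. It is only an illustration of what Q14 returns, not the way a database or SPL executes it; the function name `promo_revenue` and the sample rows are hypothetical.

```python
from datetime import date

def promo_revenue(lineitem, part, start, end):
    """Percentage of revenue that comes from PROMO parts shipped in [start, end)."""
    # Index the dimension table by primary key (the join l_partkey = p_partkey).
    type_by_key = {p["p_partkey"]: p["p_type"] for p in part}
    promo = 0.0
    total = 0.0
    for row in lineitem:
        if not (start <= row["l_shipdate"] < end):
            continue  # outside the one-month ship date window
        p_type = type_by_key.get(row["l_partkey"])
        if p_type is None:
            continue  # inner join: rows without a matching part are dropped
        revenue = row["l_extendedprice"] * (1 - row["l_discount"])
        total += revenue
        if p_type.startswith("PROMO"):  # same condition as: like 'PROMO%'
            promo += revenue
    # Like the SQL, this fails if no row passes the filter (total is zero).
    return 100.00 * promo / total

# Hypothetical sample rows, for illustration only
part = [
    {"p_partkey": 1, "p_type": "PROMO BURNISHED COPPER"},
    {"p_partkey": 2, "p_type": "STANDARD POLISHED TIN"},
]
lineitem = [
    {"l_partkey": 1, "l_extendedprice": 100.0, "l_discount": 0.0, "l_shipdate": date(1995, 4, 10)},
    {"l_partkey": 2, "l_extendedprice": 300.0, "l_discount": 0.0, "l_shipdate": date(1995, 4, 20)},
    {"l_partkey": 1, "l_extendedprice": 500.0, "l_discount": 0.0, "l_shipdate": date(1995, 5, 1)},
]
result = promo_revenue(lineitem, part, date(1995, 4, 1), date(1995, 5, 1))
```

With these sample rows the third line item falls outside the date window, so `result` is the share of the first row's revenue in the remaining two rows.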
Ⅱ SPL solution
This is an ordinary association followed by a summation, so the main way to speed it up is to make full use of parallel processing.
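The idea behind parallelizing such a sum can be sketched in Python as follows: split the rows into segments, compute a partial pair of sums per segment, then combine the partial results. This is an assumption-laden illustration of the general pattern, not a description of how the multi-cursor is implemented internally. The names `partial_sums` and `parallel_promo_revenue` are hypothetical, and threads are used only to show the structure (CPython threads do not give real CPU parallelism for this kind of loop).

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sums(rows):
    """Return (promo_sum, total_sum) for one segment of already-joined rows."""
    promo = total = 0.0
    for is_promo, price, discount in rows:
        revenue = price * (1 - discount)
        total += revenue
        if is_promo:
            promo += revenue
    return promo, total

def parallel_promo_revenue(rows, workers=4):
    # Split the rows into roughly equal contiguous segments, one per worker.
    size = max(1, -(-len(rows) // workers))  # ceiling division
    segments = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sums, segments))
    # Sums are associative, so the partial results can simply be added up.
    promo = sum(p for p, _ in partials)
    total = sum(t for _, t in partials)
    return 100.00 * promo / total

# Hypothetical rows: (is_promo, l_extendedprice, l_discount)
rows = [(True, 100.0, 0.0), (False, 300.0, 0.0), (True, 200.0, 0.5), (False, 500.0, 0.0)]
```

Because both aggregates are plain sums, each segment needs to return only two numbers, which keeps the merge step trivial.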
|   | A |
|---|---|
| 1 | =now() |
| 2 | 1995-4-1 |
| 3 | =elapse@m(A2,1) |
| 4 | =file("part.ctx").open().cursor@m(P_PARTKEY,P_TYPE).fetch().keys@i(P_PARTKEY) |
| 5 | =file("lineitem.ctx").open().cursor@m(L_PARTKEY,L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A2 && L_SHIPDATE<A3,L_PARTKEY:A4) |
| 6 | =A5.run(L_EXTENDEDPRICE*=(1-L_DISCOUNT),L_DISCOUNT=if(pos@h(L_PARTKEY.P_TYPE,"PROMO"),L_EXTENDEDPRICE,0)) |
| 7 | =A6.total(sum(L_DISCOUNT),sum(L_EXTENDEDPRICE)) |
| 8 | =100.00*A7(1)/A7(2) |
| 9 | =interval@ms(A1,now()) |
Ⅲ Further optimization
1. Optimization method
This example reuses two optimization methods from earlier exercises. The first is the date-to-integer conversion explained in Q1, which has already been applied to the L_SHIPDATE field of the lineitem table. The second is the numberization of dimension table primary keys explained in Q2, which was applied to the L_PARTKEY field of lineitem in the previous example. In the part table converted earlier, P_PARTKEY has been numberized and P_TYPE has also been converted to the integer type. The P_TYPE conversion is not needed in this example, so we re-generate the part composite table here with only P_PARTKEY converted.
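The two conversions can be pictured with a small Python sketch. It rests on stated assumptions: dates are encoded as days since 1970-01-01 purely as a stand-in (the actual integer encoding used in the SPL code is not reproduced here), and `date_to_int`, `numberize` and the sample keys are hypothetical names and values.

```python
from datetime import date

EPOCH = date(1970, 1, 1)  # assumed reference date, for illustration only

def date_to_int(d):
    """Encode a date as a day count so date filters become integer comparisons."""
    return (d - EPOCH).days

def numberize(part_keys):
    """Map each original primary key to its 1-based row position in the dimension table."""
    return {key: pos for pos, key in enumerate(part_keys, start=1)}

# Hypothetical original primary keys of the dimension table, in row order
part_keys = [1001, 1005, 1042]
key_to_pos = numberize(part_keys)

# Foreign keys in the fact table are rewritten once, at conversion time,
# so that queries can later locate a dimension row by position.
l_partkey = [1042, 1001, 1001]
l_partkey_numbered = [key_to_pos[k] for k in l_partkey]

# The date filter bounds are converted the same way before the scan.
start = date_to_int(date(1995, 4, 1))
end = date_to_int(date(1995, 5, 1))
```

The cost of building the mapping is paid once during data conversion; at query time only integer comparisons and positional lookups remain.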
2. Code for data conversion
2.1 Conversion on part table
|   | A |
|---|---|
| 1 | =file("part.ctx").open().cursor().fetch() |
| 2 | =A1.run(P_PARTKEY=#) |
| 3 | =file("part_14.ctx").create(#P_PARTKEY, P_NAME, P_MFGR, P_BRAND, P_TYPE, P_SIZE, P_CONTAINER, P_RETAILPRICE, P_COMMENT) |
| 4 | >A3.append(A2.cursor()) |
2.2 Conversion on lineitem table
Copy lineitem_13.ctx and rename the copy to lineitem_14.ctx.
3. Code after data conversion
First, we need to preload the dimension table. Below is the preloading code:
|   | A |
|---|---|
| 1 | >env(part, file("part_14.ctx").open().import()) |
Before performing the query, we need to first run the preloading code to load the small dimension table into memory.
Computing code:
|   | A |
|---|---|
| 1 | =now() |
| 2 | 1995-4-1 |
| 3 | =days@o(elapse@m(A2,1)) |
| 4 | =days@o(A2) |
| 5 | =part.@m(pos@h(P_TYPE,"PROMO")) |
| 6 | =file("lineitem_14.ctx").open().cursor@m(L_PARTKEY,L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A4 && L_SHIPDATE<A3) |
| 7 | =A6.run(L_EXTENDEDPRICE*=(1-L_DISCOUNT),L_DISCOUNT=if(A5(L_PARTKEY),L_EXTENDEDPRICE,0)) |
| 8 | =A7.total(sum(L_DISCOUNT),sum(L_EXTENDEDPRICE)) |
| 9 | =100.00*A8(1)/A8(2) |
| 10 | =interval@ms(A1,now()) |
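The pattern used in A5 and A7 above (compute one flag per dimension row in advance, then fetch the flag by the numberized part key) can be sketched in Python as below. This is an illustrative sketch under the assumption that the part key is a 1-based row position; `build_promo_flags`, `promo_revenue` and the sample data are hypothetical.

```python
def build_promo_flags(part_types):
    """One boolean per dimension row, in row order (index i holds part number i + 1)."""
    # Mirrors the like 'PROMO%' condition: the type must start with "PROMO".
    return [t.startswith("PROMO") for t in part_types]

def promo_revenue(rows, promo_flags):
    promo = total = 0.0
    for partkey, price, discount in rows:
        revenue = price * (1 - discount)
        total += revenue
        # partkey is a 1-based sequence number, so the flag is found by position:
        # no hash lookup and no string comparison inside the scan loop.
        if promo_flags[partkey - 1]:
            promo += revenue
    return 100.00 * promo / total

# Hypothetical dimension rows and fact rows: (l_partkey, l_extendedprice, l_discount)
part_types = ["PROMO PLATED STEEL", "ECONOMY ANODIZED BRASS", "PROMO BRUSHED TIN"]
flags = build_promo_flags(part_types)
rows = [(1, 100.0, 0.0), (2, 200.0, 0.0), (3, 100.0, 0.0)]
```

The string test runs once per part row rather than once per line item, which matters because the fact table is far larger than the dimension table.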
Ⅳ Using enterprise edition’s column-wise computation
1. Original data
|   | A |
|---|---|
| 1 | =now() |
| 2 | 1995-4-1 |
| 3 | =elapse@m(A2,1) |
| 4 | =file("part.ctx").open().cursor@mv(P_PARTKEY,P_TYPE).fetch().keys@i(P_PARTKEY) |
| 5 | =file("lineitem.ctx").open().cursor@mv(L_PARTKEY,L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A2 && L_SHIPDATE<A3).join(L_PARTKEY,A4,P_TYPE) |
| 6 | =A5.derive@o(L_EXTENDEDPRICE*(1-L_DISCOUNT):dp,if(pos@h(P_TYPE,"PROMO"),dp,0.0):dp1) |
| 7 | =A6.total(sum(dp1),sum(dp)) |
| 8 | =100.00*A7(1)/A7(2) |
| 9 | =interval@ms(A1,now()) |
2. Optimized data
First, we need to preload the dimension table. Below is the preloading code:
|   | A |
|---|---|
| 1 | >env(part, file("part_14.ctx").open().import@v()) |
Before performing the query, we need to first run the preloading code to load the small dimension table into memory.
Computing code:
|   | A |
|---|---|
| 1 | =now() |
| 2 | 1995-4-1 |
| 3 | =days@o(elapse@m(A2,1)) |
| 4 | =days@o(A2) |
| 5 | =part.(pos@h(p_type(P_TYPE),"PROMO")) |
| 6 | =file("lineitem_14.ctx").open().cursor@mv(L_PARTKEY,L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A4 && L_SHIPDATE<A3) |
| 7 | =A6.derive@o(L_EXTENDEDPRICE*(1-L_DISCOUNT):dp,if(A5(L_PARTKEY),dp,0):dp1) |
| 8 | =A7.total(sum(dp1),sum(dp)) |
| 9 | =100.00*A8(1)/A8(2) |
| 10 | =interval@ms(A1,now()) |
Ⅴ Test result
Unit: seconds
|   | Regular | Column-wise |
|---|---|---|
| Before optimization | 14.2 | 6.3 |
| After optimization | 6.6 | 2.8 |
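The relative gains implied by these timings can be worked out with a few lines of Python. The numbers are copied from the test result above; the helper name `speedup` and the dictionary layout are hypothetical, and the ratios say nothing beyond what the four measurements already contain.

```python
# Timings in seconds, copied from the test result
timings = {
    ("before", "regular"): 14.2,
    ("before", "column-wise"): 6.3,
    ("after", "regular"): 6.6,
    ("after", "column-wise"): 2.8,
}

def speedup(slow, fast):
    """How many times faster the `fast` configuration is than the `slow` one."""
    return timings[slow] / timings[fast]

# Data conversion alone, regular (row-wise) computation: about 2.15x
conversion_gain = speedup(("before", "regular"), ("after", "regular"))
# Column-wise computation alone, on the original data: about 2.25x
columnwise_gain = speedup(("before", "regular"), ("before", "column-wise"))
# Both combined: about 5.07x
combined_gain = speedup(("before", "regular"), ("after", "column-wise"))
```

The combined gain is close to the product of the two individual gains, which suggests that the two optimizations are largely independent of each other in this test.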