Performance optimization case course: TPCH-Q6
select
sum(l_extendedprice * l_discount) as revenue
from
lineitem
where
l_shipdate >= date '1995-01-01'
and l_shipdate < date '1995-01-01' + interval '1' year
and l_discount between 0.05 - 0.01 and 0.05 + 0.01
and l_quantity < 24;
This is an aggregation operation after a single table is filtered.
1. Data storage
There is no special sorting requirement, just store lineitem in order by the primary key orderkey. So, we can continue to use lineitem.ctx from Q3. Copy it to the main directory of this query.
2. General method
Calculation code:
A |
B |
|
1 |
=now() |
|
2 |
1995-1-1 |
=elapse@y(A2,1) |
3 |
=0.05-0.01 |
=0.05+0.01 |
4 |
>quantity=24 |
|
5 |
=file("lineitem.ctx").open().cursor@m(L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A2 && L_SHIPDATE<B2 && L_DISCOUNT>=A3 && L_DISCOUNT<=B3 && L_QUANTITY<quantity) |
|
6 |
=A5.total(sum(L_EXTENDEDPRICE*L_DISCOUNT)) |
|
7 |
=interval@ms(A1,now()) |
Note that the cursor function in A5 cannot directly contain an expression like L_EXTENDEDPRICE*L_DISCOUNT, which should be written in A6.
Test result:
Test items |
Execution time (seconds) |
General method |
6 |
3. Data conversion
Utilize the conversion method mentioned in Q1: convert date to integers.
Since the L_SHIPDATE in lineitem is already integerized in Q1, we can directly use lineitem6.ctx. Copy it to the main directory of this query.
Calculation code:
A |
B |
|
1 |
=now() |
|
2 |
1995-1-1 |
|
3 |
=days@o(A2) |
=days@o(elapse@y(A2,1)) |
4 |
=0.05-0.01 |
=0.05+0.01 |
5 |
>quantity=24 |
|
6 |
=file("lineitem6.ctx").open().cursor@m(L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A3 && L_SHIPDATE<B3 && L_DISCOUNT>=A4 && L_DISCOUNT<=B4 && L_QUANTITY<quantity) |
|
7 |
=A6.total(sum(L_EXTENDEDPRICE*L_DISCOUNT)) |
|
8 |
=interval@ms(A1,now()) |
Test result:
Test items |
Execution time (seconds) |
General method |
6 |
Data conversion |
4 |
4. Column-wise computing
Calculation code:
A |
B |
|
1 |
=now() |
|
2 |
1995-1-1 |
|
3 |
=days@o(A2) |
=days@o(elapse@y(A2,1)) |
4 |
=0.05-0.01 |
=0.05+0.01 |
5 |
>quantity=24 |
|
6 |
=file("lineitem6.ctx").open().cursor@m(L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A3 && L_SHIPDATE<B3 && L_DISCOUNT>=A4 && L_DISCOUNT<=B4 && L_QUANTITY<quantity) |
|
7 |
=A6.total(sum(L_EXTENDEDPRICE*L_DISCOUNT)) |
|
8 |
=interval@ms(A1,now()) |
Test result:
Test items |
Execution time (seconds) |
General method |
6 |
Data conversion |
4 |
Column-wise computing |
2 |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese version