Performance optimization skill: Ordered MERGE
I Problem introduction & solving
Relational databases use segmented HASH approach to associate tables. Suppose the sizes (number of records) of the two tables to be joined are N and M respectively, then the computational complexity (i.e. the comparison number of join field) of HASH segmentation method is about SUM(Ni*Mi), in which Ni and Mi are respectively the number of records of the two tables with HASH values as i, satisfying expressions N=SUM(Ni) and M=SUM(Mi). The value of SUM(Ni*Mi) is probably much smaller than the complexity N*M of full traversal (sometimes even K, the range of HASH values, times smaller).
If both tables are ordered by join keys, we can use MERGE algorithm to implement the association, whose complexity is N+M in this case. If both N and M are relatively large (generally much larger than K), N+M will be far smaller than SUM(Ni*Mi). That is, the MERGE algorithm is a lot faster than segmented HASH approach.
In real-world application, homo-dimension tables and primary-sub tables are always joined by the primary key or part of the primary key. So if we sort the tables by their primary keys in advance, we can always associate them using the efficient MERGE algorithm which is exactly the approach adopted in SPL.
In the following part we’ll test how fast the ordered merge association in esProc SPL is by comparing it with the association in Oracle relational databases.
II Test environment
1. All-in-memory small data calculation
The server for testing has two Intel2670 CPUs, 2.6G frequency, 16-core in total, 128G memory and an SSD hard disk.
A total of 50G data is generated according to TPCH standard. The primary table is orders and the sub table is orderdetail (generated by decreasing records of the lineitem table). The records in the two tables are sorted by O_ORDERKEY and L_ORDERKEY respectively in ascending order.
The tests of Oracle and SPL are performed in single thread.
2. External storage big data calculation
Here the virtual machine of the aforementioned test server is used with 16G memory and an SSD hard disk.
A total of 200G data is generated according to TPCH standard. The primary table is orders and the sub table is lineitem. The records in the two tables are sorted respectively by O_ORDERKEY and L_ORDERKEY in ascending order.
Due to the relatively large amount of data, both Oracle and SPL tests are performed in 8-thread parallel.
III All-in-memory small data calculation test
1. Oracle
(1) Non-association query test
SQL statements:
select
l_year,
sum(volume) as revenu,
sum(l_quantity) as quantity
from
(
select
extract(year from l_shipdate) as l_year,
(l_extendedprice * (1 - l_discount) ) as volume,
l_quantity
from
orderdetail
)
group by
l_year
union all
select
2019 as l_year,
count(o_orderkey) as revenu,
count(o_totalprice) as quantity
from
orders;
(2) Association query test
SQL statements:
select
l_year,
sum(volume) as revenu,
sum(l_quantity) as quantity
from
(
select
extract(year from l_shipdate) as l_year,
(l_extendedprice * (1 - l_discount) ) as volume,
l_quantity
from
orders,
orderdetail
where
o_orderkey = l_orderkey
and l_quantity>0
)
group by
l_year;
The query condition l_quantity>0 is always true, that is, no data is filtered to ensure that this column of data will be read.
2. SPL
(1) Non-association query test
SPL script:
A |
|
1 |
>orders=file("/home/ctx/orders.ctx").open().memory() |
2 |
>orderdetail=file("/home/ctx/orderdetail.ctx").open().memory() |
3 |
=now() |
4 |
=orderdetail.cursor(L_ORDERKEY,L_EXTENDEDPRICE,L_DISCOUNT,L_SHIPDATE,L_QUANTITY).groups(year(L_SHIPDATE):l_year; sum(L_EXTENDEDPRICE*(1-L_DISCOUNT)):revenue,sum(L_QUANTITY):quantity) |
5 |
=orders.groups(;count(O_ORDERKEY),count(O_TOTALPRICE)) |
6 |
=interval@s(A3,now()) |
(2) Association query test
SPL script:
A |
|
1 |
>orders=file("/home/ctx/orders.ctx").open().memory() |
2 |
>orderdetail=file("/home/ctx/orderdetail.ctx").open().memory() |
3 |
=now() |
4 |
=orders.cursor(O_ORDERKEY,O_TOTALPRICE;O_TOTALPRICE>0) |
5 |
=orderdetail.cursor(L_ORDERKEY,L_EXTENDEDPRICE,L_DISCOUNT,L_SHIPDATE,L_QUANTITY) |
6 |
=joinx(A5:detail,L_ORDERKEY;A4:orders,O_ORDERKEY) |
7 |
=A6.groups(year(detail.L_SHIPDATE):l_year;sum(detail.L_EXTENDEDPRICE*(1-detail.L_DISCOUNT)):revenue, sum(detail.L_QUANTITY):quantity) |
8 |
=interval@s(A3,now()) |
The joinx in A6 is the ordered merge association function that requires the join fields to be sorted in ascending order.
3. Test results & analysis
Here are test results (Unit: second):
Language |
Non-association |
Association |
Decrease (times) |
Association time |
Oracle |
16 |
67 |
4.2 |
51 |
SPL |
14 |
32 |
2.3 |
18 |
Each result is the best-performing among multiple executions with data fully cached.
To phrase the above two SQL statements: in the non-association query test, we retrieve O_ORDERKEY field and O_TOTALPRICE field from orders table and count the records; we retrieve L_ORDERKEY, L_EXTENDEDPRICE, L_DISCOUNT and L_SHIPDATE fields from orderdetail table and sum the sales prices and L_DISCOUNT. In the association query test, we retrieve equal amount of data from the two tables and sum the sales prices and O_TOTALPRICE over the associated result set. The retrieval and calculation amount in both cases are basically the same, so the difference between them lies in the time spent in performing the association. And the same applies to SPL.
With the same hardware equipment and data size, it takes SPL 18 seconds and Oracle 51 seconds to perform the association. The Java-based SPL is nearly 3 times faster than C++-driven Oracle, which confirms that the ordered merge can greatly speed up association. On the other hand, SPL association query is 2.3 times slower and Oracle association query is 4.2 slower compared with the non-association queries, which means the ordered merge computation is much faster than the HASH association.
IV External memory big data calculation test
When both tables to be joined can’t fit into memory, relational databases still use segmented HASH approach to implement the association. The approach divides records in each table into multiple segments according to the HASH values of the join field, loads each segment in memory and uses the segmented HASH approach again to perform the association. But this approach leads to the problem of external storage swapping, data need to be written out and then read in by segments. Reading on the external storage is not fast, writing is even slower; with one extra writing and reading, the performance of this approach is doomed to be much worse.
Ordered merge association only needs one traversal of each table, which decreases not only the CPU computation but also the external storage IO amount considerably. The algorithm requires very little memory space to cache a small number of records for each table with almost no impact on the memory requirements of other concurrent tasks.
1.Oracle
(1) Non-association query test
The test SQL statements are similar to those of small data calculation. We just need to change orderdetail table to lineitem table and add “/*+ parallel(8) */” after the first “select” to enable an 8-thread parallel processing.
(2) Association query test
The test SQL statements are similar to those of small data calculation. We just need to change orderdetail table to lineitem table and add “/*+ parallel(8) */” after the first “select” to enable an 8-thread parallel processing.
2. SPL
(2) Non-association query test
SPL script:
A |
|
1 |
=now() |
2 |
=file("/home/ctx/lineitem.ctx").open().cursor@m(L_ORDERKEY,L_EXTENDEDPRICE,L_DISCOUNT,L_SHIPDATE,L_QUANTITY;;8) |
3 |
=A2.groups(year(L_SHIPDATE):l_year; sum(L_EXTENDEDPRICE*(1-L_DISCOUNT)):revenue,sum(L_QUANTITY):quantity) |
4 |
=file("/home/ctx/orders.ctx").open().cursor@m(O_ORDERKEY,O_TOTALPRICE;;8) |
5 |
=A4.total(count(O_ORDERKEY),count(O_TOTALPRICE)) |
6 |
=interval@s(A1,now()) |
Both A2 and A4 retrieve data with an 8-thread multi-cursor.
(2) Association query test
SPL script:
A |
|
1 |
=now() |
2 |
=file("/home/ctx/orders.ctx").open().cursor@m(O_ORDERKEY,O_TOTALPRICE;O_TOTALPRICE>0;8) |
3 |
=file("/home/ctx/lineitem.ctx").open().cursor(L_ORDERKEY,L_EXTENDEDPRICE,L_DISCOUNT,L_SHIPDATE,L_QUANTITY;;A2) |
4 |
=joinx(A3:detail,L_ORDERKEY;A2:orders,O_ORDERKEY) |
5 |
=A4.groups(year(detail.L_SHIPDATE):l_year; sum(detail.L_EXTENDEDPRICE*(1-detail.L_DISCOUNT)):revenue,sum(detail.L_QUANTITY):quantity) |
6 |
=interval@s(A1,now()) |
A3 uses A2 as a parameter when creating a cursor to retrieve data from lineitem table by segments according to the segmented primary key of the primary orders table.
The joinx function, i.e. the ordered merge association function, in A4 requires the join fields to be sorted in ascending order.
3. Test results & analysis
Here are test results (Unit: second):
Language |
Non-association |
Association |
Decrease (times) |
Association time |
Oracle |
265 |
863 |
3.3 |
598 |
SPL |
70 |
101 |
1.4 |
31 |
The computation amount analysis and association time calculation are the same as those for the small data calculation.
With the same hardware equipment and data size, it takes SPL 31 seconds and Oracle 598 seconds to perform the association. SPL is over 19 times faster than the latter, which further confirms that the ordered merge can greatly improve the association performance. And the efficiency is highly increased compared to that in the small data calculation, meaning that the larger the data size becomes, the greater the impact of the ordered merge algorithm on the performance.
The same conclusion can also be drawn from the decrease times of the association query. But compared with the small data calculation test where data is fully cached in memory, the external storage big data calculation spends more time in retrieving data, which makes the non-association query time become longer and explains why the decrease times become even smaller compared to small data test.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese version