7.18 Merge-join tables of same order – big data
Merge-join multiple associated ordered tables, including at least one large table, and perform filtering.
Find customers whose total order amount exceeds 10,000 based on the associated Order table, Detail table and Customer table. Both Order table and Detail table contain a large amount of data that cannot wholly loaded into memory.
The joinx() function is used to perform order-based merge.
SPL script:
A | |
---|---|
1 | =connect(“db”) |
2 | =A1.cursor(“select * from Order order by ID”) |
3 | =A1.cursor(“select * from Detail order by ID”) |
4 | =A1.query@x(“select * from Customer”) |
5 | =A2.switch@i(CustomerID,A4:ID) |
6 | =joinx(A5:Order,ID;A3:Detail,ID) |
7 | =A6.groups(Order.CustomerID.Name; sum(Detail.Amount):Amount).select(Amount>10000) |
A1 Connect to the database.
A2 Create cursor on Order table.
A3 Create cursor on Detail table.
A4 Query Customer table.
A5 Use switch@i function to convert Order table’s CustomerID field values to matching records of Customer table while deleting records that do not have matches.
A6 Use joinx() function to perform order-based merge on Order table’s cursor and Detail table’s cursor.
A7 Group merged records in A6 by customer and get records where the order amount exceeds 10,000.
Execution result:
Name | Amount |
---|---|
ALFKI | 14848.0 |
AROUT | 55492.0 |
… | … |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL