3.31 Union two large tables through MERGE
Union-merge two large data tables of same structure for further query. The two tables have common records.
Both database db1 and database db2 contain sales record tables Sales of same structure. Each table is too large to be loaded into the memory at once. The task is to calculate each customer’s order amount in the year 2014.
OrderID | Customer | SellerId | OrderDate | Amount |
---|---|---|---|---|
10400 | EASTC | 1 | 2014/01/01 | 3063.0 |
10401 | HANAR | 1 | 2014/01/01 | 3868.6 |
10402 | ERNSH | 8 | 2014/01/02 | 2713.5 |
10403 | ERNSH | 4 | 2014/01/03 | 1005.9 |
10404 | MAGAA | 2 | 2014/01/03 | 1675.0 |
… | … | … | … | … |
SPL script:
A | |
---|---|
1 | =connect(“db1”).cursor@x(“select * from Sales where year(OrderDate)=2014 order by OrderID”) |
2 | =connect(“db2”).cursor@x(“select * from Sales where year(OrderDate)=2014 order by OrderID”) |
3 | =[A1,A2].mergex@u(OrderID) |
4 | =A3.groups(Customer; sum(Amount):Amount) |
A1 Retrieve Sales records of 2014 from db1 and sort them by OrderID.
A2 Retrieve Sales records of 2014 from db2 and sort them by OrderID.
A3 Use mergex function to union-merge records of the two cursors by OrderID, during which @u option enables removing duplicate records.
A4 Use groups function to group records and calculate order amount of each customer.
Execution result:
Customer | Amount |
---|---|
ANATR | 1129.75 |
ANTON | 6452.15 |
… | … |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL