3.3 Get union of two sets
The union of set A and set B is made up of members that are in set A or in set B. For instance, the union of [1,2] and [1,3] is [1,2,3]. Note that their common member 1 only appears once in the union.
Here is an example. A business has two sales channels – online and offline stores. Their sales data is stored in SALES_ONLINE table and SALES_STORE table respectively. The task is to get products whose online cumulative sales amount is above 10000 or that are purchased more than 5 times offline in the year 2014. Below is part of data in the two tables:
SALES_ONLINE:
ID | CUSTOMERID | ORDERDATE | PRODUCTID | AMOUNT |
---|---|---|---|---|
1 | HANAR | 2014/01/01 | 17 | 3868.6 |
2 | ERNSH | 2014/01/03 | 42 | 1005.9 |
3 | LINOD | 2014/01/06 | 64 | 400.0 |
4 | OTTIK | 2014/01/07 | 10 | 1194.0 |
5 | OCEAN | 2014/01/09 | 41 | 319.2 |
… | … | … | … | … |
SALES_STORE:
ID | CUSTOMERID | ORDERDATE | SELLERID | PRODUCTID | AMOUNT |
---|---|---|---|---|---|
1 | EASTC | 2014/01/01 | 1 | 27 | 3063.0 |
2 | ERNSH | 2014/01/02 | 8 | 70 | 2713.5 |
3 | MAGAA | 2014/01/03 | 2 | 74 | 1675.0 |
4 | SAVEA | 2014/01/07 | 7 | 45 | 2018.2 |
5 | FOLIG | 2014/01/08 | 8 | 30 | 1622.4 |
… | … | … | … | … | … |
SPL can use character “&” to calculate union of sets. A&B, for instance, calculates union of set A and set B.
SPL script:
A | |
---|---|
1 | =T(“SalesOnline.xlsx”).select(year(ORDERDATE)==2014) |
2 | =T(“SalesStore.xlsx”).select(year(ORDERDATE)==2014) |
3 | =A1.groups(PRODUCTID; sum(AMOUNT):AMOUNT).select(AMOUNT>10000) |
4 | =A2.groups(PRODUCTID; count(~):COUNT).select(COUNT>5) |
5 | =A3.(PRODUCTID)&A4.(PRODUCTID) |
A1: Import SALES_ONLINE table from the original file and select sales records of the year 2014.
A2: Import SALES_STORE table from the original file and select sales records of the year 2014.
A3: Group A1 by PRORDUCTID and calculate the total sales amount for each product in the online store, and select records where the total amount exceeds 10000.
A4: Group A2 by PRORDUCTID and calculate the total sales amount for each product in the offline store, and select records where the product is purchased more than 5 times.
A5: Use operator “&” to calculate the union of A3 and A4.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL