3.4 Get difference of two sets
The difference of set A and set B is made up of members that exist in set A but not in set B. For instance, the difference of [1,2,3] and [1,3,5] is [2].
Here is an example. A business has two sales channels – online and offline stores. Their sales data is stored in SALES_ONLINE table and SALES_STORE table respectively. The task is to find customers whose total order amount in the offline store exceeds 1000 but who do not have orders in the online store in the year 2014. Below is part of data in the two tables:
SALES_ONLINE:
ID | CUSTOMERID | ORDERDATE | PRODUCTID | AMOUNT |
---|---|---|---|---|
1 | HANAR | 2014/01/01 | 17 | 3868.6 |
2 | ERNSH | 2014/01/03 | 42 | 1005.9 |
3 | LINOD | 2014/01/06 | 64 | 400.0 |
4 | OTTIK | 2014/01/07 | 10 | 1194.0 |
5 | OCEAN | 2014/01/09 | 41 | 319.2 |
… | … | … | … | … |
SALES_STORE:
ID | CUSTOMERID | ORDERDATE | SELLERID | PRODUCTID | AMOUNT |
---|---|---|---|---|---|
1 | EASTC | 2014/01/01 | 1 | 27 | 3063.0 |
2 | ERNSH | 2014/01/02 | 8 | 70 | 2713.5 |
3 | MAGAA | 2014/01/03 | 2 | 74 | 1675.0 |
4 | SAVEA | 2014/01/07 | 7 | 45 | 2018.2 |
5 | FOLIG | 2014/01/08 | 8 | 30 | 1622.4 |
… | … | … | … | … | … |
SPL can use character “\” to calculate difference of sets. A\B, for instance, calculates difference of set A and set B.
SPL script:
A | |
---|---|
1 | =T(“SalesOnline.xlsx”).select(year(ORDERDATE)==2014) |
2 | =T(“SalesStore.xlsx”).select(year(ORDERDATE)==2014) |
3 | =A2.groups(CUSTOMERID; count(~):COUNT).select(COUNT>3) |
4 | =A3.id(CUSTOMERID)\A1.id(CUSTOMERID) |
A1: Import SALES_ONLINE table from the original file and select sales records of the year 2014.
A2: Import SALES_STORE table from the original file and select sales records of the year 2014.
A3: Group A2 by CUSTOMERID and count purchase frequencies of each customer, and select records where the number of purchases are greater than 3.
A4: Use operator “\” to calculate difference of A3 and A1.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL