3.2 Non-equivalence grouping
3.2.1 Alignment grouping
Count the number of male employees in the specified departments [‘Administration’, ‘HR’, ‘Marketing’, ‘Sales’] in the employee table.
SPL
A | |
---|---|
1 | =file(“EMPLOYEE.csv”).import@tc() |
2 | =A1.select(GENDER==“M”) |
3 | [Administration, HR, Marketing, Sales] |
4 | =A2.align@a(A3,DEPT).new(A3(#):DEPT,~.len():NUM) |
SPL provides the align@a() function to implement alignment grouping. This function can produce the same result as group function, and both return the grouped subset.
SQL
SELECT d.DEPT, COUNT(GENDER) AS MaleCount
FROM (
SELECT COLUMN_VALUE AS DEPT
FROM TABLE(SYS.ODCIVARCHAR2LIST('Administration', 'HR', 'Marketing', 'Sales'))) d
LEFT JOIN EMPLOYEE e ON d.DEPT = e.DEPT AND e.GENDER = 'M'
GROUP BY d.DEPT;
SQL can only use left join to do this task.
Python
Python does not provide a dedicated method similar to alignment grouping, so it has to hard code.
3.2.2 Enumeration grouping
Calculate the average salary of employees of different generations (generation X: born between 1965 and 1980, generation Y: born between 1980 and 1995, generation Z: born btw 1995 and 2010).
SPL
A | B | |
---|---|---|
1 | =file(“EMPLOYEE.csv”).import@tc() | |
2 | [?>=date(“1965-01-01”)&&?<date(“1980-01-01”),?>=date(“1980-01-01”)&&?<date(“1995-01-01”),?>=date(“1995-01-01”)&&?<date(“2010-01-01”)] | /Grouping conditions |
3 | [X,Y,Z] | |
4 | =A1.run(BIRTHDAY=date(BIRTHDAY)) | |
5 | =A1.enum(A2,BIRTHDAY).new(A3(#):GENERATION,~.avg(SALARY):SALARY) | /Enumeration grouping |
6 | [1965-01-01,1980-01-01,1995-01-01,2010-01-01] | |
7 | =A1.align@a(A3,A3(A6.pseg(BIRTHDAY))).new(A3(#):GENERATION,~.avg(SALARY):SALARY) | /Use the alignment grouping function to do enumeration grouping |
A2: Write the conditions of different generations as string to form a sequence, and use ‘?’ to represent the grouping key value to be passed to calculation.
A5: The enum() function uses the grouping key of each member in the set to be grouped to calculate these conditions in turn. If the result is true, the member is put into the corresponding group.
A6: Put the segmentation years of the generations in a sequence.
A7: Use the align()and pseg() methods to do the enumeration grouping task.
SQL
WITH t AS (SELECT 'X' AS generation, DATE '1965-01-01' AS s,DATE '1980-01-01' AS e FROM DUAL
UNION ALL
SELECT 'Y' AS generation, DATE '1980-01-01' AS s,DATE '1995-01-01' AS e FROM DUAL
UNION ALL
SELECT 'Z' AS generation, DATE '1995-01-01' AS s,DATE '2010-01-01' AS e FROM DUAL)
SELECT generation,AVG(SALARY) FROM t LEFT JOIN EMPLOYEE e on e.BIRTHDAY>=t.s AND e.BIRTHDAY<t.e GROUP BY generation;
Python
Python does not provide a dedicated method similar to enumeration grouping, so it has to hard code.
3.2.3 Overlapped enumeration grouping
To group the employees by age and calculate the number of each group (when grouping conditions overlap, list all the employees satisfying the conditions. The grouping conditions are [year of service<5, 5<=years of service>=10, years of service>=10, years of service>=15]).
SPL
A | |
---|---|
1 | =file(“EMPLOYEE.csv”).import@tc() |
2 | [?<5,?>=5 && ?<10,?>=10,?>=15] |
3 | [Within five years,Five to ten years,More than ten years,Over fifteen years] |
4 | =A1.derive(age(HIREDATE):EMPLOYED) |
5 | =A4.enum@r(A2, EMPLOYED).new(A3(#):EMPLOYEED,~.len():NUM) |
The @r option in the enum@r() function allows members repeatedly existing in different subsets.
SQL
WITH t AS (SELECT 'Within five years' AS employeed, 0 AS s,5 AS e FROM DUAL
UNION ALL
SELECT 'Five to ten years' AS employeed, 5 AS s,10 AS e FROM DUAL
UNION ALL
SELECT 'More than ten years' AS employeed, 10 AS s,100 AS e FROM DUAL
UNION ALL
SELECT 'Over fifteen years' AS employeed, 15 AS s,100 AS e FROM DUAL)
SELECT employeed,COUNT(h.HIRED) AS NUM
FROM t
LEFT JOIN (
SELECT TRUNC(MONTHS_BETWEEN(SYSDATE, HIREDATE)/12) AS HIRED
FROM EMPLOYEE) h
ON h.HIRED>=t.s AND h.HIRED <t.e
GROUP BY employeed;
Python
Python can only implement this task through hard coding.
3.3 Ordered grouping
Example codes for comparing SPL, SQL, and Python
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL