"Select states which more than 50 employees are based and calculate average salary of each depart .."

mars RaqForum 25 No.
245 View • 1 Years ago

6.22 Grouped subsets: filter grouped subsets and group the selected subsets

Select states which more than 50 employees are based and calculate average salary of each department in these states.

ID	NAME	BIRTHDAY	STATE	DEPT	SALARY
1	Rebecca	1974/11/20	California	R&D	7000
2	Ashley	1980/07/19	New York	Finance	11000
3	Rachel	1970/12/17	New Mexico	Sales	9000
4	Emily	1985/03/07	Texas	HR	7000
5	Ashley	1975/05/13	Texas	R&D	16000
…	…	…	…	…	…

A grouping operation itself is complex. Retaining grouped subsets and reusing them can increase computing efficiency. A.group() function retains grouped subsets after grouping so that we can further perform grouping & aggregation on the grouped subsets.

SPL script:

	A
1	=T(“Employee.csv”)
2	=A1.group(STATE).select(~.len()>50)
3	=A2.conj(~.groups(DEPT; avg(SALARY):AVG_SALARY).derive(A2.~.STATE:STATE))

A1 Import Employee table.
A2 Group A1’s table by state and get the grouped subsets containing more than 50 employees.
A3 In each subset of state, group records by department, calculate average salary in each department and concatenate all result sets.

SPL Official Website 👉 https://www.scudata.com

SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL

SPL Learning Material 👉 https://c.scudata.com

SPL Source Code and Package 👉 https://github.com/SPLWare/esProc

Discord 👉 https://discord.gg/cFTcUNs7

Youtube 👉 https://www.youtube.com/@esProc_SPL

spl-cookbook(224)

eBook

mars • 245 View • 1 Years ago