"The transposition function is commonly presented in the front-end such as reports, which transpo .."

Hoo RaqForum 19 No.
1 Reply • 403 View • 2 Years ago

SPL: Dynamic Transposition

The transposition function is commonly presented in the front-end such as reports, which transposes the queried data into a specific display layout. For example, row-to-column transposition, column-to-row transposition, more complex dynamictransposition, etc. Dynamic row-to-column transposition means the fields generated during transposition can not be specified in advance, but can only be determined dynamically according to the values of the original fields. So let’s take a look at how SPL handles the dynamic transposition.

1. Automatically generate a column according to field values

[Example 1] Based on the employee table, calculate the average salary of each department in different areas. Some of the data is shown below:

ID	NAME	SURNAME	STATE	DEPT	SALARY
1	Rebecca	Moore	California	R&D	7000
2	Ashley	Wilson	New York	Finance	11000
3	Rachel	Johnson	New Mexico	Sales	9000
4	Emily	Smith	Texas	HR	7000
5	Ashley	Smith	Texas	R&D	16000
…	…	…	…	…	…

The layout of the target table is expected to be as follows:

DEPT	California	Colorado	Florida	…
Administration			9333.333	…
Finance	8000	5000	10000	…
HR	10000		7000	…
…	…	…	…	…

The row-to-column transposition needs to get its target fields from the original table. The A.pivot() function in SPL can do such a transposition, which will automatically extract names of the target fields when no target fields are specified.

The SPL script is:

	A
1	=T("Employee.csv")
2	=A1.groups(STATE,DEPT;avg(SALARY):AVG_SALARY)
3	=A2.pivot(DEPT; STATE, AVG_SALARY)

A1: Import the employee table.

A2: Group the table by STATE and DEPT to calculate the average salary of each department in each state.

A3: Use the A.pivot() function to transpose rows to columns, and retrieve the names of target fields automatically when they are not specified.

2. Dynamically generate column names through calculation

[Example 2] Based on the income details table, get the income information of each employee with categories generated automatically. Below is part of the source table:

NAME	SOURCE	INCOME
David	Salary	8000
David	Bonus	15000
Daniel	Salary	9000
Andrew	Shares	26000
Andrew	Sales	23000
…	…	…

The layout of the target table is expected to be as follows:

NAME	SOURCE1	INCOME1	SOURCE2	INCOME2	…
Andrew	Shares	26000	Sales	23000	…
Daniel	Salary	9000			…
David	Salary	8000	Bonus	15000	…
Robert	Bonus	13000			…
…	…	…	…	…	…

The target fields are not retrieved from a certain field dynamically but need to be calculated dynamically. Here the A.pivot() function is no longer available. We can generate the target data structure according to the group with the most income categories, and then fill it with data.

The SPL script is:

	A
1	=T("Income.txt").group(NAME)
2	=A1.max(~.len())
3	=create(NAME, ${A2.("SOURCE"/~/", INCOME"/~).concat@c()})
4	>A1.(A3.record(~.NAME \| ~.conj([SOURCE, INCOME])))

A1: Import the income details table, and group it by NAME.

A2: Calculate the maximum number of members in groups, i.e., the largest number of income categories.

A3: Generate column names dynamically according to the result in A3, and create an empty table.

A4: Loop through each group, and fill in the names, income sources, and amounts with the table created in A3.

3. Dynamic row-to-column transposition through table join

[Example 3] Based on the following ORDERS table, ORDER_DETAIL table, and PRODUCT table, generate a table recording information of products purchased each day by each customer in the year 2014. The relationship between ORDERS table and ORDER_DETAIL table is one to many, where each order corresponds to multiple order detail records. The relationship between ORDER_DETAIL table and PRODUCT table is many to one, where the product ID field of ORDER_DETAIL table points to the ID field of PRODUCT table. Below are parts of the three tables:

ORDERS:

ID	CUSTOMERID	EMPLOYEEID	ORDER_DATE	ARRIVAL_DATE
10248	VINET	5	2012/07/04	2012/08/01
10249	TOMSP	6	2012/07/05	2012/08/16
10250	HANAR	4	2012/07/08	2012/08/05
10251	VICTE	3	2012/07/08	2012/08/05
10252	SUPRD	4	2012/07/09	2012/08/06
…	…	…	…	…

ORDER_DETAIL:

ID	ORDER_NUMBER	PRODUCTID	PRICE	COUNT	DISCOUNT
10814	1	48	102.0	8	0.15
10814	2	48	102.0	8	0.15
10814	3	48	306.0	24	0.15
10814	4	48	102.0	8	0.15
10814	5	48	204.0	16	0.15
…	…	…	…	…	…

PRODUCT:

ID	NAME	SUPPLIERID	CATEGORY
1	Apple Juice	2	1
2	Milk	1	1
3	Tomato sauce	1	2
4	Salt	2	2
5	Sesame oil	2	2
…	…	…	…

The layout of the target table is expected to be as follows:

ORDER_DATE	CUSTOMERID	PRODUCT1	COUNT1	PRODUCT2	COUNT2	…
2014/1/5	VICTE	Corn flakes	8
2014/1/23	CONSH	Chicken	3	Cake	9
…	…	…	…	…	…	…

We can first join up the three source tables through their associative relationships and perform the subsequent calculations using the same method in the previous two examples.

The SPL script is:

	A
1	=T("Orders.txt").select(year(ORDER_DATE)==2014)
2	=T("Product.txt")
3	=T("OrderDetail.txt").switch(PRODUCTID,A2:ID).group(ID)
4	=join(A1:ORDERS,ID;A3:DETAIL,ID)
5	=create(DATE,CUSTOMERID,${A4.max(DETAIL.len()).("PRODUCT"/~/","/"COUNT"/~).concat@c()})
6	>A4.run(A5.record([ORDERS.ORDER_DATE,ORDERS.CUSTOMERID]\|DETAIL.([PRODUCTID.NAME,COUNT]).conj()))

A1: Import the Orders table and select records of the year 2014.

A2: Import the Product table.

A3: Import the OrderDetail table, join it with the Product table through the PRODUCTID filed, convert the PRODUCTID filed values to corresponding product records, and then group the joining result by order ID.

A4: Join up Order table with OrderDetail table through order ID.

A5: Generate the target data structure according to the group with the most OrderDetail records, and create an empty table.

A6: Loop through the order data and fill them in the table created in A5 in turn.

In fact, this case is the same as the previous ones essentially with only one more step, that is, joining up the three tables through their associative relationships. The subsequent operations are the same, creating the target data structure as needed and filling it with data.

SPL Official Website 👉 https://www.scudata.com

SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL

SPL Learning Material 👉 https://c.scudata.com

SPL Source Code and Package 👉 https://github.com/SPLWare/esProc

Discord 👉 https://discord.gg/2bkGwqTj

Youtube 👉 https://www.youtube.com/@esProc_SPL

esProc