From SQL to SPL: Add records that meet the criteria before each group after grouping
In a certain view of the PostgreSQL database, the row_index field is an underscore separated string and also a grouping field. Some groups' row_index can be divided into 3 parts, while others can be divided into 2 parts.
row_index |
id |
seq |
value_text |
1007_0_0 |
1007 |
1 |
800 |
1007_0_0 |
1007 |
2 |
1110 |
1007_0_0 |
1007 |
4 |
road, roadwork |
1007_0_0 |
1007 |
5 |
100 |
1007_0_1 |
1007 |
1 |
800 |
1007_0_1 |
1007 |
2 |
1115 |
1007_0_1 |
1007 |
4 |
road, roadwork |
1007_0_1 |
1007 |
5 |
100 |
1007_0_2 |
1007 |
1 |
800 |
1007_0_2 |
1007 |
2 |
1105 |
1007_0_2 |
1007 |
4 |
road, roadwork |
1007_0_2 |
1007 |
5 |
100 |
1007_0_3 |
1007 |
1 |
800 |
1007_0_3 |
1007 |
2 |
1120 |
1007_0_3 |
1007 |
4 |
road, roadwork |
1007_0_3 |
1007 |
5 |
100 |
1007_0 |
1007 |
6 |
Rosedale |
1007_0 |
1007 |
8 |
139 |
Now, before each group of records where row_index can be divided into 3 parts, add the group of records where row_index can be divided into 2 parts, and modify row_index to the row_index of each group.
row_index |
id |
seq |
value_text |
1007_0_0 |
1007 |
6 |
Rosedale |
1007_0_0 |
1007 |
8 |
139 |
1007_0_0 |
1007 |
1 |
800 |
1007_0_0 |
1007 |
2 |
1110 |
1007_0_0 |
1007 |
4 |
road, roadwork |
1007_0_0 |
1007 |
5 |
100 |
1007_0_1 |
1007 |
6 |
Rosedale |
1007_0_1 |
1007 |
8 |
139 |
1007_0_1 |
1007 |
1 |
800 |
1007_0_1 |
1007 |
2 |
1115 |
1007_0_1 |
1007 |
4 |
road, roadwork |
1007_0_1 |
1007 |
5 |
100 |
1007_0_2 |
1007 |
6 |
Rosedale |
1007_0_2 |
1007 |
8 |
139 |
1007_0_2 |
1007 |
1 |
800 |
1007_0_2 |
1007 |
2 |
1105 |
1007_0_2 |
1007 |
4 |
road, roadwork |
1007_0_2 |
1007 |
5 |
100 |
1007_0_3 |
1007 |
6 |
Rosedale |
1007_0_3 |
1007 |
8 |
139 |
1007_0_3 |
1007 |
1 |
800 |
1007_0_3 |
1007 |
2 |
1120 |
1007_0_3 |
1007 |
4 |
road, roadwork |
1007_0_3 |
1007 |
5 |
100 |
The records where row_index can be divided into three parts can be grouped by row_index, and then each group of records can be processed by merging the records where row_index can be divided into two parts before each group of records. But after SQL grouping, it must aggregate immediately, and subsets cannot be kept to continue processing each group of records. This requires a detour to solve, using multi-layer nested window functions to bypass this problem, which is difficult to code.
SPL supports retaining subsets after grouping, allowing for continued processing of each group of records.
1 |
=postgresql.query("select * from view1") |
2 |
=A1.select(row_index.split("_").len()==2) |
3 |
=(A1\A2).group@u(row_index) |
4 |
=A3.conj(A2.new(A3.row_index,id,seq,value_text)|~) |
A1: Query the database through JDBC.
A2: Retrieve the records where row_index can be divided into two parts.
A3: Remove A2 from the complete data, which means getting records that row_index can be divided into three parts. Group these records while keeping the order unchanged. The symbol \ represents the difference set, and group@u indicates keeping the original order after grouping.
A4: Loop each group of data: Create a new two-dimensional table according to A2, change row_index to the row_index of this group, keep other fields unchanged, and merge the new two-dimensional table with the data of this group. Finally, merge the data of each group. The symbol | represents merging set members.
Question source:https://stackoverflow.com/questions/78346354/adding-tuple-rows-to-each-subtuple-group-in-sql
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL