Order-based Aggregation by Intervals
【Question】
The following data table has two fields – a and b.
a |
b |
10 |
3 |
21 |
4 |
33 |
6 |
46 |
7 |
52 |
10 |
This is the result I want:
a<10 |
sum(b) |
a<20 |
sum(b) |
a<30 |
sum(b) |
a<40 |
sum(b) |
a<50 |
sum(b) |
Values of a field can be divided into over 1000 intervals.
【Answer】
To aggregate data by intervals in SQL:
with
cr as
(
select “a<10” x, b b from test3 where a < 10 union all
select “a<20” x, b b from test3 where a < 20 union all
select “a<30” x, b b from test3 where a < 30 union all
select “a<40” x, b b from test3 where a < 40 union all
select “a<50” x, b b from test3 where a < 50
)
select x,sum(b) from cr group by x;
It’s complicated when there are many conditional intervals.
SPL can divide a series of numbers into multiple intervals dynamically to generate a conditional sequence, perform aggregation by each condition, and generate the resulting table sequence. No matter the number of the intervals, the SPL solution is always a piece of serval-line code:
A |
|
1 |
=connect(“db”) |
2 |
=A1.query(“select a,b from test3 order by a”) |
3 |
=(A2.m(-1).a – 1)\10+1 |
4 |
=A3.new(~*10:C,A2.select(a<C).sum(b):D) |
A1: Connect to database db;
A2: Retrieve data from the data table and order data by a field;
A3: Find the last interval number. A2.m() function finds a certain record from A2’s table sequence with the parameter being the record’s sequence number; a negative number represents a reciprocal. Here we get the value of a field in the last record and find the interval where the largest value of a field falls. The last a value – 52– is contained in the interval [51,60], whose sequence number is 6;
A4: new() function generates a new table sequence by calculating field values of A3’s record sequence. Each parameter corresponds to a new field value; the colon introduces a new field name. As the result of A3 is 6, A3.new() loops over the records 6 times; the sign ~ represents the loop number. Values of C field are loop number*10; values of D field are sums of the results of query performed over A2’s table sequence according to the condition a<C.
A2
A3
A4
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL