SPL Programming - 5.3 [Sequence as a whole] Loop functions: advanced
SPL provides the ~ symbol as the loop variable in loop functions, which simplifies a lot of code that would otherwise be written with loop statements. A loop statement over a sequence also gives access to the loop number (by adding # before the loop variable), so is there something similar in loop functions?
Yes: simply use the # symbol.
| | A |
---|---|
1 | =[3,4,3,6,1,4] |
2 | =A1.sum(if(#%2==1,1,-1)*~) |
A2 calculates the sum of the members at odd positions minus the sum of the members at even positions in sequence A1.
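For readers who think in Python, the same idea can be sketched as follows. This is only a rough analogue, not SPL: `enumerate(..., start=1)` plays the role of #, and the function name `alternating_sum` is made up for illustration.

```python
# Rough Python analogue of A1.sum(if(#%2==1,1,-1)*~).
# enumerate(..., start=1) supplies a 1-based position, like # in SPL.
values = [3, 4, 3, 6, 1, 4]

def alternating_sum(seq):
    # Add members at odd positions, subtract members at even positions.
    return sum(v if pos % 2 == 1 else -v for pos, v in enumerate(seq, start=1))

result = alternating_sum(values)
print(result)  # (3+3+1) - (4+6+4) = -7
```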
Using #, we can also add two sequences of the same length member by member (element-wise).
| | A | B |
---|---|---|
1 | =[3,8,2,4,7] | =[9,2,6,1,0] |
2 | =A1.(~+B1(#)) |
A2 will calculate [A1(1)+B1(1),…,A1(5)+B1(5)].
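As a comparison only, here is how the same position-based lookup might look in Python. Note the assumption that differs from SPL: Python positions are 0-based, whereas # starts at 1.

```python
# Rough Python analogue of A1.(~+B1(#)): use the current position
# to fetch the matching member of the second sequence.
a = [3, 8, 2, 4, 7]
b = [9, 2, 6, 1, 0]

# pos is 0-based here, unlike # in SPL, which is 1-based.
pairwise_sum = [x + b[pos] for pos, x in enumerate(a)]
print(pairwise_sum)  # [12, 10, 8, 5, 7]
```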
Suppose a sequence stores a company’s sales from January to December in order. We want to know the maximum monthly growth, that is, the maximum difference between a member and the member before it. This can be done with #.
| | A |
---|---|
1 | =[123,345,321,345,546,542,874,234,543,983,434,897] |
2 | =A1.(if(#>1,~-A1(#-1),0)).max() |
When #>1, that is, for any month other than January, we can calculate the growth of this month over the previous month. The previous month’s value is A1(#-1); subtracting it from the current month’s value ~ gives the growth. January has no previous month, so its growth is taken as 0. Finally, max() returns the largest difference.
In the calculation of loop function, it is very common to refer to an adjacent member. SPL provides a special symbol ~[-1] to represent A1(#-1). The code can be simplified as follows:
| | A |
---|---|
1 | =[123,345,321,345,546,542,874,234,543,983,434,897] |
2 | =A1.(if(#>1,~-~[-1],0)).max() |
~[-i] denotes the i-th member before the current member, and ~[i] denotes the i-th member after it, that is, A1(#+i). Unlike using A(i) to get a sequence member, the [] form does not report an error when the referenced position falls outside the sequence; it returns null instead.
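The out-of-range behaviour can be imitated in Python with a small helper. This is a sketch under assumptions: `neighbor` is a hypothetical function written for this illustration (it is not part of SPL or of any library), positions are 0-based, and Python’s `None` stands in for SPL’s null.

```python
def neighbor(seq, pos, offset):
    """Hypothetical stand-in for ~[offset]: returns None when out of range."""
    target = pos + offset
    return seq[target] if 0 <= target < len(seq) else None

sales = [123, 345, 321, 345, 546, 542, 874, 234, 543, 983, 434, 897]

growth = []
for pos, current in enumerate(sales):
    previous = neighbor(sales, pos, -1)
    # January has no previous month, so its growth is taken as 0.
    growth.append(current - previous if previous is not None else 0)

max_growth = max(growth)
print(max_growth)  # 463 (November 434 -> December 897)
```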
Let’s try an example that refers to a member after the current one: for each month, calculate the average sales of that month and the months immediately before and after it. That is, for the n-th month, average the sales of the (n-1)-th, n-th and (n+1)-th months. This is called a moving average. For January and December, only a 2-month average is calculated.
| | A |
---|---|
1 | =[123,345,321,345,546,542,874,234,543,983,434,897] |
2 | =A1.(avg(~[-1],~,~[1])) |
Note that we can’t simply add the three values and divide by 3. We use the avg() function instead, which handles null values correctly.
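To see why a plain "sum divided by 3" would be wrong at the two ends, here is a Python sketch of the same moving average. The helper `avg_ignoring_none` is a made-up name that imitates the null handling described above; it is an assumption for illustration, not SPL’s implementation.

```python
def avg_ignoring_none(*values):
    # Average only the values that are present, skipping None.
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None

sales = [123, 345, 321, 345, 546, 542, 874, 234, 543, 983, 434, 897]
n = len(sales)

moving_avg = [
    avg_ignoring_none(
        sales[i - 1] if i > 0 else None,      # previous month, None for January
        sales[i],                             # current month
        sales[i + 1] if i < n - 1 else None,  # next month, None for December
    )
    for i in range(n)
]

print(moving_avg[0])   # (123 + 345) / 2 = 234.0
print(moving_avg[1])   # (123 + 345 + 321) / 3 = 263.0
print(moving_avg[-1])  # (434 + 897) / 2 = 665.5
```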
We can also simplify the calculation of Pascal’s triangle by means of ~[]:
| | A | B |
---|---|---|
1 | 5 | =[[1],[1,1]] |
2 | for 3,A1+1 | =B1(A2-1).(~+~[-1]) |
3 | | >B1\|=[B2\|1] |
Here we make use of the convention that ~[-1] will get null when it is out of bounds.
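The same construction can be sketched in Python. Here the missing left neighbour of the first member is treated as 0, which is this sketch’s way of imitating "number plus null leaves the number unchanged"; `pascal_rows` is a hypothetical name.

```python
def pascal_rows(n_rows):
    """Build the first n_rows rows of Pascal's triangle from the previous row."""
    rows = [[1], [1, 1]]
    while len(rows) < n_rows:
        prev = rows[-1]
        # Each member plus its left neighbour; the first member has no
        # left neighbour, so nothing is added to it.
        body = [x + (prev[i - 1] if i > 0 else 0) for i, x in enumerate(prev)]
        rows.append(body + [1])  # append the trailing 1
    return rows[:n_rows]

for row in pascal_rows(6):
    print(row)
```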
In addition, ~[] can get a contiguous subsequence. For example, using the monthly sales sequence above, we can calculate the cumulative sales up to each month as follows:
| | A |
---|---|
1 | =[123,345,321,345,546,542,874,234,543,983,434,897] |
2 | =A1.(~[:0].sum()) |
~[:0] means from the head of the sequence to the current member, which is equivalent to A1.to(#). The full form of this syntax is ~[a:b], which gets A1.to(#+a,#+b). If a is omitted, members are taken from the beginning, namely A1.to(#+b); if b is omitted, members are taken through the end, namely A1.to(#+a,).
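The window rule can be imitated in Python with slices. This is a sketch, assuming inclusive bounds relative to the current position and clipping at both ends of the sequence; `window` is a hypothetical helper and positions are 0-based.

```python
def window(seq, pos, a=None, b=None):
    """Hypothetical stand-in for ~[a:b] at 0-based position pos.

    a and b are offsets relative to pos, both inclusive.
    Omitting a starts at the head; omitting b runs through the end.
    """
    start = 0 if a is None else max(pos + a, 0)
    stop = len(seq) if b is None else max(min(pos + b + 1, len(seq)), 0)
    return seq[start:stop]

sales = [123, 345, 321, 345, 546, 542, 874, 234, 543, 983, 434, 897]

# Analogue of A1.(~[:0].sum()): cumulative sales up to each month.
cumulative = [sum(window(sales, i, b=0)) for i in range(len(sales))]
print(cumulative[:3])  # [123, 468, 789]
print(cumulative[-1])  # 6187
```

The same helper covers the three-month window: `window(sales, i, -1, 1)` returns two members at either end and three elsewhere.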
The method of getting a subsequence can also be used for the previous example of getting the moving average:
| | A |
---|---|
1 | =[123,345,321,345,546,542,874,234,543,983,434,897] |
2 | =A1.(~[-1:1].avg()) |
Like loop statements, loop functions can be nested in multiple layers. For example, let’s calculate the sum of the products of every pair of members, one taken from each of two sequences.
This is not hard to write with loop statements:
| | A | B | C |
---|---|---|---|
1 | =[4,3,2,8,7] | =[9,2,6,1,0] | =0 |
2 | for A1 | for B1 | >C1+=A2*B2 |
But if we write with a loop function, there will be an obstacle:
| | A | B |
---|---|---|
1 | =[4,3,2,8,7] | =[9,2,6,1,0] |
2 | =A1.sum(B1.sum(~*~)) |
This is obviously wrong: the innermost ~*~ only squares the current member of B1. We wanted to multiply the ~ of the outer A1 by the ~ of the inner B1, but there is only one ~ symbol, so the two cannot be distinguished.
This problem can be solved by introducing an intermediate temporary variable:
| | A | B |
---|---|---|
1 | =[4,3,2,8,7] | =[9,2,6,1,0] |
2 | =A1.sum((a=~,B1.sum(a*~))) |
In addition, SPL stipulates that prefixing ~ with a sequence’s variable name refers to the ~ of the loop over that sequence, while a ~ without a variable name refers to the innermost loop. This makes the expression easy to write:
| | A | B |
---|---|---|
1 | =[4,3,2,8,7] | =[9,2,6,1,0] |
2 | =A1.sum(B1.sum(A1.~*~)) |
A1.~ denotes the ~ of the outer loop over A1, and the ~ without a leading variable denotes the ~ of the inner loop over B1.
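In Python the ambiguity does not arise, because every loop names its own variable. The sketch below is only a comparison: the explicit names `a` and `b` play the roles of A1.~ and the bare ~.

```python
a_seq = [4, 3, 2, 8, 7]
b_seq = [9, 2, 6, 1, 0]

# Outer loop variable a corresponds to A1.~, inner loop variable b to ~.
total = sum(sum(a * b for b in b_seq) for a in a_seq)
print(total)  # 432, which equals sum(a_seq) * sum(b_seq) = 24 * 18
```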
If the outer loop function runs over a sequence that has no variable name, its ~ cannot be referenced from the inner layer. In that case, we need to assign the sequence to a variable first so that it has a name.
| | A | B |
---|---|---|
1 | =to(9) | =A1.sum(to(~,9).sum(A1.~*~)) |
2 | =9.sum(to(~,9).sum(~*~)) | |
To calculate the sum of all the products in the 9*9 multiplication table, we need to give to(9) a name (A1, used in B1) in order to distinguish the layers of the nested loop function. Writing 9.sum(…) directly, as in A2, cannot distinguish the inner and outer layers, because 9.~ is not a legal expression.
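As a cross-check of what B1 is meant to compute, here is a Python sketch of the same nested sum. It assumes, as in B1, that the inner loop runs from the outer value up to 9, so each product is counted once.

```python
# Outer i runs over 1..9; inner j runs from i to 9, like to(~,9) in B1.
total = sum(sum(i * j for j in range(i, 10)) for i in range(1, 10))
print(total)  # 1155
```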
Sometimes the inner and outer layers loop over the same sequence. Even if it is assigned to a different variable, the layers still cannot be distinguished:
| | A | B |
---|---|---|
1 | =[4,3,2,8,7] | =A1.sum(A1.sum(A1.~*~)) |
2 | =A1 | =A2.sum(A1.sum(A2.~*~)) |
3 | =A1.(~) | =A3.sum(A1.sum(A3.~*~)) |
B1 gives the same result as writing sum(~*~) directly, so the inner and outer layers cannot be distinguished. B2 looks as if it could distinguish them, but A1 and A2 are actually the same object (review the content of the previous chapter). During the calculation of a loop function, the current ~ is recorded on the sequence itself, so using different variable names for the same object still cannot separate the inner and outer layers. B3 works: A3 is a new sequence copied from A1, so the ~ of A3 and the ~ of A1 are different, and the correct result is calculated.
The same rule applies to #. Try to work out what each of the following examples calculates:
| | A | B |
---|---|---|
1 | =[3,8,2,4,7] | =[9,2,6,1,0] |
2 | =A1.max(B1.max(A1.~-~)) | |
3 | =A1.max(B1.max(A1(A1.#)-B1(#))) | |
4 | =A1.max(B1.max(A1(#)-B1(A1.#))) | |
SPL also has a function A.run(x), which is similar to A.(x). It also calculates x for each member in turn, but it returns A itself rather than the sequence of x values.
What’s the use of this?
x can be an arbitrary expression, and the assignment = is an operator too. So in A.run(x) we can use ~=… to change the current member itself. For example, A.run(~=~*~) changes A into the sequence of the squares of its members, which is different from A.(~*~): the latter returns a new sequence, while the former modifies the original sequence.
That still doesn’t seem like much of a difference. For subsequent calculations, generating a new sequence is no different from modifying the original one.
When only ~ is used in the calculation, there is indeed no big difference (there will be a big difference when we talk about records and table sequences later). But if we refer to adjacent members with the [] symbol, the results will differ.
For example, A.run(~=~[-1]+~) changes A into the sequence of its cumulative values. A.(~[-1]+~) is different: it does not calculate cumulative values, only the sum of each member and the one before it. In the latter, a new sequence is generated, and ~[-1] and ~ both come from the original sequence A, which does not change during the calculation. The former produces no new sequence, so ~[-1] has already been changed by the previous round, which yields the cumulative effect.
A.run(~=~[-1]+~) and A=A.(~[:0].sum()) give the same result, the cumulative values, but the former needs far less calculation: it builds on the result of the previous round, while the latter sums from the beginning every time.
You can try it with code.
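If no SPL environment is at hand, the contrast can also be seen in a small Python sketch. It is only an analogue: a list comprehension reading from the untouched original stands in for A.(x), and an in-place loop stands in for A.run(x). The sample values are made up for illustration.

```python
original = [1, 2, 3, 4]

# Like A.(~[-1]+~): a new list is built, and every lookup reads the
# unchanged original, so we only get sums of adjacent members.
adjacent_sums = [(original[i - 1] if i > 0 else 0) + v
                 for i, v in enumerate(original)]
print(adjacent_sums)  # [1, 3, 5, 7]

# Like A.run(~=~[-1]+~): members are overwritten in place, so the
# previous member already holds the running total.
running = list(original)
for i in range(1, len(running)):
    running[i] = running[i - 1] + running[i]
print(running)  # [1, 3, 6, 10]
```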
With this mechanism, we can simplify the calculation of e, abandon the use of a temporary variable, and use only one expression:
=1+20.run(~=~*if(#>1,~[-1],1)).sum(1/~)
20.run(~=~*if(#>1,~[-1],1)) gets the sequence of factorial values [1!,2!,…,20!]. At the start of each round, ~[-1] already holds the factorial from the previous round, and multiplying it by ~ gives the factorial for the current round. The first round needs the if, because SPL specifies that any number multiplied by null results in null; this differs from addition and requires special handling.
With the sequence of factorial values, it is easy to get the result with another step of doing sum.
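The two steps can be checked with a Python sketch of the same idea: overwrite the list 1..20 in place so that each member becomes a factorial, then sum the reciprocals. The variable names are assumptions for illustration.

```python
import math

# Start from [1, 2, ..., 20], like the implicit sequence in 20.run(...).
factorials = list(range(1, 21))

# In-place update: each member is multiplied by the already-updated
# previous member; the first member (1! = 1) is left as it is.
for i in range(1, len(factorials)):
    factorials[i] = factorials[i] * factorials[i - 1]

e_estimate = 1 + sum(1 / f for f in factorials)
print(e_estimate, math.e)
```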
The power of loop functions is huge, and you can write very simple and elegant code if you use them properly.