Macros in SPL
In addition to common static code, sometimes dynamic code is also needed to solve problems, such as generating code (or part of code) based on parameters and dynamically executing it. For programming languages that lack dynamic coding mechanisms, it is usually necessary to write the variable parts of the code in string form. For example, when referencing dataset field names in Python, it is necessary to write them as strings to achieve the effect of dynamic code. However, this will make it inconvenient to read and write more common static code. SQL, on the other hand, can directly write field names (as well as filter conditions, grouping expressions, etc.) in the code without having to write them into strings, making it easier to read and write static code, but it is difficult to handle dynamic code.
SPL inherits the SQL style of static code, allowing for direct writing of code parts, such as field names, without the need to be written as strings. In addition, SPL also provides macros to achieve dynamic code effects.
Example 1: Dynamically sort the order table based on the parameter pSortList, which contains an indefinite number of sorting fields separated by commas.
This dynamic code can be implemented using SPL macros: T("Orders.txt").sort(${pSortList})
Format: ${Calculation expression that returns a string of fields, which can be a variable or constant}
When the SPL engine parses a statement, it first checks whether there are macros. If there are macros, the expressions inside the macros are calculated and concatenated into the original statement, forming a new statement before execution.
More examples
Example 2: Dynamically group the fact table OrdersFact.csv based on multiple dimension fields (and calculated columns) in the parameter pFields, and sum the Amount field. PFields is in the form of 'year (OrderDate),Product,Client'.
SPL code:T("d:/OrdersFact.csv").groups(${pFields};sum(Amount))
Example 3: According to the conditions in the parameter pCondition, dynamically query the order table Orders.xls and select the order records that meet the conditions. PCondition form: year(OrderDate)==2009 && Amount>1000 && Amount<3000
SPL code:T("d:/Orders.xls").select(${pCondition})
Example 4: First, filter the fact table OrdersFact.csv based on the start and end date parameters, and then dynamically group according to the dimension parameter pGroup, pivot by month, and calculate the amount of each month. If pGroup="Product", the calculation result should be as follows:
If pGroup="Area", the calculation result should be as follows:
SPL code:
1 |
=T("d:/OrdersFact.csv") |
2 |
=A1.select(OrderDate>=pBegin && OrderDate<=pEnd) |
3 |
=A2.pivot(${pGroup};year(OrderDate)*100+month(OrderDate),Amount) |
Other uses of macros
Sometimes it is necessary to repeatedly generate code with similar forms, although it is not difficult, it is tedious and laborious, and this situation is also suitable for using macros.
Example 5: A file records the voltages of many devices at various time points. Theoretically, they are floating point values starting from column 2, but there are actually Dirty data with quotation marks. It is necessary to convert these Dirty data to normal floating point values. Partial data:
point |
N_2102 |
N_2104 |
T1_903 |
N_2203 |
… |
00:00:01 |
"221.4" |
220.3 |
224.5 |
221.0 |
… |
00:00:02 |
218.0 |
217.5 |
229.8 |
233.1 |
… |
00:00:03 |
197.4 |
219.8 |
220.0 |
224.4 |
… |
00:00:04 |
207.1 |
201.8 |
230.3 |
207.3 |
… |
00:00:05 |
201.5 |
"227.0" |
224.5 |
203.4 |
… |
It is not difficult to remove the quotation marks from a certain column, such as column 2: A.run(flaot(#2):#2). But if there are many fields, you need to write a lot of repetitive code: A.run(float(#2):#2, float(#3):#3...), which is very laborious. Macros can easily generate duplicate code, making it much easier to implement.
A |
B |
|
1 |
=file("d:/2020-02-01.txt").import@tq() |
Read file,@q removes the quotation mark |
2 |
=to(2,A1.fname().len()) |
List of sequence numbers for the 2nd to Nth fields |
3 |
=A2.("float(#"/ ~/ "):#"/ ~).concat@c() |
Concatenate into a string float(#2):#2, float(#3):#3... |
4 |
=A1.run(${A3}) |
Convert in batches |
5 |
=file("d:/2020-02-01.txt").export@t(A1) |
Write into file |
Matters needing attention
The macro that appears in each statement will only be parsed and replaced one time when it is executed. If the macro expression is still changing during the execution of the statement, it will not be parsed and replaced again, which is different from the variables in the statement.
Example 6: Use another method to implement Example 5, which involves first traversing the field names with a large loop, and then cleaning each field separately with a small loop.
A |
B |
|
1 |
=file("d:/2020-02-01.txt").import@tq() |
|
2 |
=A1.fname().to(2,) |
List of field names |
3 |
=A2.run(A1.run( float(${A2.~}):${A2.~})) |
|
4 |
=file("d:/2020-02-01.txt").export@t(A1) |
The code in A3 above is incorrect because A2.~ will only be parsed once, and only the first field will be converted.
Replace A3 with the following:
A |
B |
|
3 |
for A2 |
=A1.run(float(${A3}):${A3}) |
Although it is still in the loop, B3 will be executed multiple times, and the macro inside will be parsed again each time.
If you want to write it as one statement and the macro expression in the loop is parsed every time, you can use the eval function to write A3 as: A2.run(eval("A1.run(float(" / A2.~ / "):" / A2.~ / ")"))
The performance of the eval function is not as good as that of macros, so it should be used with caution.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL