Structuring Text Data before Filtering
【Question】
I have a long list made up of text like this:
Email: example@example.com
Language Spoken: Sample
Points: 52600
Lifetime points: 100000
Country: US
Number: 1234
Gender: Male
Status: Activated
=============================================
I need a way to filter this list so that only students with more than 52600 points are shown. Can this be done in a .bat file or with a similar solution? I have tried Excel, but with no luck.
【Answer】
Group the lines in sets of 8, keep the groups in which the value after the colon on the 3rd line is greater than 52600, and then concatenate the lines of the remaining groups. The algorithm combines grouping, order-based operations, and structured data computation, all of which SPL handles directly. Here’s the SPL script:
|   | A |
|---|---|
| 1 | =file("d:\\data.txt").import@i() |
| 2 | =A1.group((#-1)\8) |
| 3 | =A2.select(int(substr(~(3),"Points:"))>52600) |
| 4 | =A3.conj() |
A1: Read the text file data.txt and return a sequence in which each line is a member;
A2: Group A1’s sequence every 8 members, using the integer quotient (#-1)\8 of each member’s position as the group key;
A3: Take the text after "Points:" in the 3rd member of each group in A2, convert it to an integer, and select the groups where that value is greater than 52600;
A4: Concatenate the members of A3’s groups into a single sequence.
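For readers without SPL, the same four steps can be sketched in Python. This is an illustration, not part of the SPL solution: the function name `filter_records`, the constant `RECORD_SIZE`, and the command-line usage are hypothetical. Unlike the script above, the sketch drops blank lines and lines made up only of `=` before grouping, which is an assumption about how the separators appear in the file.

```python
import sys
from typing import List

RECORD_SIZE = 8  # lines per record, matching the (#-1)\8 grouping in A2


def filter_records(lines: List[str], threshold: int = 52600) -> List[str]:
    """Return the lines of every record whose Points value exceeds threshold."""
    # A1: one member per line. Assumption: blank lines and "=====" separator
    # lines are not part of a record, so they are dropped here.
    data = [ln.strip() for ln in lines if ln.strip() and set(ln.strip()) != {"="}]
    # A2: group every RECORD_SIZE lines.
    groups = [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]
    # A3: the 3rd line of a group is "Points: N"; compare N with the threshold.
    selected = [
        g for g in groups
        if len(g) >= 3 and int(g[2].split(":", 1)[1]) > threshold
    ]
    # A4: concatenate the lines of the selected groups.
    return [ln for g in selected for ln in g]


if __name__ == "__main__":
    # Hypothetical usage: python filter_points.py data.txt
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as f:
            print("\n".join(filter_records(f.read().splitlines())))
```

Like the SPL script, the sketch relies on every record having exactly 8 lines with Points on the 3rd; a file with missing or reordered fields would need the records parsed by field name instead of by position.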