Removing Empty Lines in Parsing Structured Data
【Question】
I have text data like this:
name = abc
id = 123
Place = xyz
Details = some texts with two line
name = aaa
id = 54657
Place = dfd
Details = some texts with some lines
I need to put them into a table or CSV file, and my output should look like this:
name id Place Details
abc 123 xyz Some texts with two line
aaa 54657 dfd Some texts with some lines
【Answer】
I don’t know how many empty lines your text data contains, but the approach is the same either way: delete the empty lines, take the data to the right of each equal sign, group it every 4 lines, and populate an empty two-dimensional table with one group per row. Since this is cumbersome to hardcode in Java, you can write it in esProc SPL and then integrate the SPL script via JDBC. Here’s the SPL script:
| | A |
| 1 | =file("D:\\source.txt").import@i() |
| 2 | =A1.select(~).(~.split@t("=")(2)) |
| 3 | =A2.group((#-1)\4) |
| 4 | =A3.new(~(1):name,~(2):id,~(3):place,~(4):details) |
A1: Import the text data as a sequence whose members are all the lines;
A2: Remove the empty lines from A1, split each remaining member into a sequence by the separator "=", trim the spaces at both ends of the two resulting members, and return the second member, that is, the data to the right of the equal sign;
A3: Group A2’s sequence every four rows;
A4: Create a new table sequence with the fields name, id, place and details from A3’s groups to hold the final result set.
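For readers who want to see the four steps outside SPL, here is a minimal Python sketch of the same logic. It is not the SPL implementation: the function name `parse_records` and the inline `sample` text are hypothetical, and the sketch assumes every non-empty line contains an equal sign and every record has exactly four lines.

```python
def parse_records(text, fields=("name", "id", "place", "details")):
    """Turn 'key = value' lines into one dict per group of len(fields) lines."""
    # Steps A1 and A2: drop empty lines, keep the trimmed text right of the
    # first "=" (assumption: every non-empty line contains an "=")
    values = [
        line.split("=", 1)[1].strip()
        for line in text.splitlines()
        if line.strip()
    ]
    # Step A3: group the values every len(fields) entries
    size = len(fields)
    groups = [values[i:i + size] for i in range(0, len(values), size)]
    # Step A4: name the members of each group to form a row
    return [dict(zip(fields, group)) for group in groups]


# Hypothetical sample with empty lines scattered between the data lines
sample = """name = abc
id = 123

Place = xyz
Details = some texts with two line

name = aaa
id = 54657
Place = dfd
Details = some texts with some lines
"""

rows = parse_records(sample)
for row in rows:
    print(row)
```

Because the empty lines are filtered out before grouping, it does not matter how many there are or where they appear.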
You can export A4’s result set to a text file directly:
A5=file("D:\\result.txt").export@t(A4)
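As a rough illustration of what such an export produces, the Python sketch below writes a header row followed by tab-separated values. It assumes that is the layout `export@t` generates; the helper name `export_tsv` is hypothetical, and the output goes to an in-memory buffer instead of D:\\result.txt so the example runs anywhere.

```python
import csv
import io


def export_tsv(rows, fields, out):
    """Write a header line and one tab-separated line per row to `out`."""
    writer = csv.DictWriter(out, fieldnames=fields, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# The two records from the question, already parsed into rows
rows = [
    {"name": "abc", "id": "123", "place": "xyz",
     "details": "some texts with two line"},
    {"name": "aaa", "id": "54657", "place": "dfd",
     "details": "some texts with some lines"},
]

buffer = io.StringIO()
export_tsv(rows, ["name", "id", "place", "details"], buffer)
print(buffer.getvalue())
```

To write a real file, pass a handle from `open(path, "w", newline="")` in place of the buffer.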
Or update the result set to the database:
A6=myDB1.update@i(A4, tableName,name:name,id:id,place:place,details:details;id)