Concatenate A Set of Indefinite Number of White-space-separated Rows into a Comma-separated String
Problem description & analysis
Below is text file sample.txt:
CHM 110 - 10
9:30 am - 10:45 am
Thursday
CSC 150 - 04
10:30 am - 11:45 am
Monday
11:30 am - 12:45 am
Wednesday
BIO 132 - 01
The above file contains data separated by white spaces. Different parts may have different number of rows. We are trying to join up the rows in each part as a string using the comma and output the result to a CSV file, as shown below:
CHM 110 - 10,9:30 am - 10:45 am,Thursday
CSC 150 - 04,10:30 am - 11:45 am,Monday,11:30 am - 12:45 am,Wednesday
BIO 132 - 01
Solution
Method 1: String
Write the following script p1.dfx in esProc:
A |
|
1 |
=file("sample.txt").read() |
2 |
=A1.split("\n").group@i(~[-1]=="").(~.select(~!="")).concat@nc() |
3 |
=file("result.csv").write(A2) |
Explanation
A1 Read in the txt file as a string.
A2 Split A1’s string into a sequence through the line break, group it by the white space and then remove white spaces, concatenate the sequence of sequences into a large string by comma and line break (When splitting a string into a sequence using the line break, usually \n is used under Unix and MacOS X and \r\n is used for Windows. But, SPL import@i function makes the selection unnecessary).
A3 Write A2’s string into result.csv.
Method 2 (Extension): Table sequence
Write the following script p1.dfx in esProc:
A |
|
1 |
=file("sample.txt").import@i() |
2 |
=A1.group@i(~[-1]==null).(~.select(~)) |
3 |
=A2.(~.len()).max() |
4 |
=create(${to(A3).concat@c()}) |
5 |
=A2.(~.pad(null,A3)).conj() |
6 |
=A4.record(A5) |
7 |
=file("result.csv").export@c(A4) |
Explanation
A1 Read in the txt file as a sequence.
A2 Group the sequence by white space and then remove white spaces.
A3 Get the maximum number of columns based on A2’s sequence.
A4 Create an empty table sequence according to A3’s maximum value.
A5 Pad data to A2’s sequence (because records will be populated to the empty table sequence in order using record function).
A6 Populate data in.
A7 Output data in A4’s table sequence to result.csv.
Read How to Call an SPL Script in Java to learn how to integrate the script code into a Java program.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL