Join Up Multiple Same-format 2D Tables
【Question】
Is there a Java library that would allow me to parse CSV files that have headers defined on certain lines? Here's an example for such a CSV:
$ID$,$Customer$
Cust1, Jack
Cust2 , Rose
$Name$,$Location$
Sherlock,London
Clouseau,Paris
The "$" symbol indicates the presence of headers on that line, and the values in subsequent rows map to these headers.
【Answer】
In your file, each block is a two-column table with the same number of rows, and the first row of each block (the one marked with "$") holds the headers. The task is to join these blocks side by side into one wider, standard table, so that in your sample the Cust1/Jack row is combined with the Sherlock/London row. The algorithm is: split the rows into blocks, starting a new block at every row that begins with "$"; create an empty 2D table whose headers are the first rows of all blocks; then, beginning from the 2nd row, take the rows at the same position in every block, concatenate them in order, and write them into the empty table.
The algorithm involves grouping, order-based operations and a dynamically structured 2D table, which takes a fair amount of code in Java. It is short in SPL (Structured Process Language):
  | A
1 | =file("d:\\source.csv").import@c()
2 | =A1.group@i(left(#1,1)=="$")
3 | =create(${A2.conj(~(1).array()).concat@c()})
4 | =to(2,A2(1).len()).conj((t=~,A2.(~(t).array()).conj()))
5 | =A3.record(A4)
A1: Import source.csv as a 2D table.
A2: Group the rows. Each row whose first field begins with "$" starts a new group, and the group runs up to (but does not include) the next such row.
A3: Create an empty 2D table whose column headers are the values of the first row of every group, concatenated in order.
A4: Beginning from the 2nd row, take the rows at the same position in every group and concatenate their values in order into a single sequence.
A5: Fill A3's table sequence with the members of A4's sequence in order, row by row.
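For comparison, here is a minimal plain-Java sketch of the same steps, in case you want to see what the hand-written version might look like. It is not a translation of the SPL script: the class name HeaderBlockJoiner and its two methods are hypothetical, it assumes every block has the same number of rows, it trims whitespace around each value, it keeps the "$" markers in the header cells, and it splits on plain commas without handling quoted fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
class HeaderBlockJoiner {

    // Split the lines into blocks; a line starting with "$" opens a new block.
    public static List<List<String[]>> splitIntoBlocks(List<String> lines) {
        List<List<String[]>> blocks = new ArrayList<>();
        for (String line : lines) {
            if (line.isBlank()) {
                continue; // ignore empty lines
            }
            if (line.startsWith("$") || blocks.isEmpty()) {
                blocks.add(new ArrayList<>());
            }
            // Assumption: simple comma split, no quoted fields.
            String[] cells = line.split(",", -1);
            for (int i = 0; i < cells.length; i++) {
                cells[i] = cells[i].trim();
            }
            blocks.get(blocks.size() - 1).add(cells);
        }
        return blocks;
    }

    // Join the blocks side by side: row i of the result is row i of
    // every block, concatenated in block order (row 0 is the header).
    public static List<List<String>> joinBlocks(List<List<String[]>> blocks) {
        List<List<String>> result = new ArrayList<>();
        if (blocks.isEmpty()) {
            return result;
        }
        int rowCount = blocks.get(0).size();
        for (List<String[]> block : blocks) {
            if (block.size() != rowCount) {
                throw new IllegalArgumentException("Blocks have different row counts");
            }
        }
        for (int i = 0; i < rowCount; i++) {
            List<String> row = new ArrayList<>();
            for (List<String[]> block : blocks) {
                row.addAll(Arrays.asList(block.get(i)));
            }
            result.add(row);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "$ID$,$Customer$",
                "Cust1, Jack",
                "Cust2 , Rose",
                "$Name$,$Location$",
                "Sherlock,London",
                "Clouseau,Paris");
        for (List<String> row : joinBlocks(splitIntoBlocks(lines))) {
            System.out.println(String.join(",", row));
        }
    }
}
```

With the sample from the question, the first data row comes out as Cust1,Jack,Sherlock,London. To read a real file, you could pass the result of java.nio.file.Files.readAllLines to splitIntoBlocks.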
The SPL script is integration-friendly. See How to Call an SPL Script in Java to learn how to integrate it with a Java application.