Split a large csv file into smaller files
A csv file is far larger than 5 MB. Below is part of its data:
OrderID,Client,SellerID,Amount,OrderDate
1,SPLI,219,9173,01/17/2022
2,HU,110,6192,10/01/2020
3,SPL,173,5659,04/23/2020
4,OFS,7,3811,02/05/2023
5,ARO,146,3752,08/27/2021
Use Java to do this: split the file into smaller files of about 5 MB each, with ordinal numbers in the file names, such as Orders1.csv and Orders2.csv. Each record must be written to exactly one file and must not be cut across two files.
Write the following SPL code:
  | A
1 | =file("d:/OrdersBig.csv")
2 | =A1.size()\(1000*1000*5)+1
3 | =A2.(T("d:/Orders" / ~ / ".csv",A1.cursor@ts(;~:A2)))
A2: Compute the number of smaller files (N) into which the csv file will be divided. The \ operator performs integer division and keeps only the integer part of the quotient; adding 1 makes each smaller file slightly smaller than 5 MB.
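The arithmetic in A2 can be checked in plain Java. This is only a sketch of the calculation, not SPL code: the class name `PartCount`, its method names, and the example file size are all made up for illustration.

```java
// Hypothetical helper, not part of SPL: mirrors the arithmetic of cell A2.
final class PartCount {

    /** Number of parts: integer quotient of fileSize / target, plus one. */
    static long partCount(long fileSizeBytes, long targetBytes) {
        return fileSizeBytes / targetBytes + 1;
    }

    /** Average part size when the file is cut evenly into partCount parts. */
    static long averagePartSize(long fileSizeBytes, long targetBytes) {
        return fileSizeBytes / partCount(fileSizeBytes, targetBytes);
    }

    public static void main(String[] args) {
        long target = 1000L * 1000 * 5;  // 5 MB, the same constant as in A2
        long size = 12_345_678L;         // assumed example file size in bytes
        System.out.println(partCount(size, target));        // prints 3
        System.out.println(averagePartSize(size, target));  // prints 4115226
    }
}
```

Because the quotient is rounded down before 1 is added, N is always larger than size / 5 MB, so the average part comes out below the 5 MB target.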
A3: Loop from 1 to N: divide the large file into N parts of approximately equal size, retrieve the i-th part on each pass and write it to a new file. Record boundaries are handled automatically, so every record stays complete.
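For comparison, the same idea can be sketched in plain Java without SPL. This is a streaming variant, not SPL's segmented cursor: it reads the file line by line and starts a new output file once the current one has received about size / N bytes of records. The class name `CsvSplitter` and its parameters are hypothetical. The sketch assumes UTF-8 text, a header line that should be repeated in every part, and no line breaks inside quoted fields.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: a plain-Java sketch, not SPL's implementation.
final class CsvSplitter {

    /** Writes prefix1.csv, prefix2.csv, ... into outDir and returns their paths. */
    static List<Path> split(Path source, Path outDir, String prefix, long targetBytes)
            throws IOException {
        long size = Files.size(source);
        long parts = size / targetBytes + 1;          // same arithmetic as A2
        long threshold = (size + parts - 1) / parts;  // bytes per part, rounded up
        List<Path> written = new ArrayList<>();

        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            String header = in.readLine();
            if (header == null) {
                return written;                       // empty input: nothing to write
            }
            BufferedWriter out = null;
            long bytesInPart = 0;                     // record bytes only, header excluded
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    // Roll over only between records, so no record is ever cut.
                    if (out == null || bytesInPart >= threshold) {
                        if (out != null) {
                            out.close();
                        }
                        Path part = outDir.resolve(prefix + (written.size() + 1) + ".csv");
                        out = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                        written.add(part);
                        out.write(header);
                        out.write('\n');
                        bytesInPart = 0;
                    }
                    out.write(line);
                    out.write('\n');
                    bytesInPart += line.getBytes(StandardCharsets.UTF_8).length + 1;
                }
            } finally {
                if (out != null) {
                    out.close();
                }
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Paths taken from the SPL script above; 5 MB target as in A2.
        split(Paths.get("d:/OrdersBig.csv"), Paths.get("d:/"), "Orders", 1000L * 1000 * 5);
    }
}
```

Since a part is closed only after it has reached the threshold, a part can exceed size / N by up to one record, and the sketch may produce fewer than N files. The SPL version hides these details behind the segment parameters of the cursor.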
Read How to Call a SPL Script in Java to learn how to integrate SPL into a Java application.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL