Java, perform COUNT on each group of a large csv file
data.csv is a large csv file that cannot fit into memory. Its 3rd column is the grouping column, as shown below:
Date,Time,Sub User,Access Method
10-10-2023,00:03:06,JL,cli
10-10-2023,00:02:20,TW2JL,app
10-10-2023,00:03:26,JL,cli
10-10-2023,00:03:34,JL,cli
10-10-2023,00:03:35,JL,cli
10-10-2023,00:03:46,JL,cli
10-10-2023,00:04:09,JL,cli
10-10-2023,00:04:51,JL,cli
10-10-2023,00:04:56,JL,cli
10-10-2023,00:05:58,JL,cli
10-10-2023,00:06:29,JL,cli
10-10-2023,00:06:42,JL,cli
10-10-2023,00:26:35,TW2JL,app
10-10-2023,00:30:01,TW2JL,app
10-10-2023,00:30:02,TW2JL,app
10-10-2023,00:30:05,TW2JL,app
10-10-2023,00:33:42,TW2JL,app
10-10-2023,00:36:36,TW2JL,app
10-10-2023,00:45:10,TW2JL,app
10-10-2023,00:53:01,TW2JL,app
10-10-2023,00:53:24,TW2JL,app
10-10-2023,01:03:14,TW2JL,app
10-10-2023,01:03:18,TW2JL,app
10-10-2023,01:03:20,TW2JL,app
Task: Use Java to group the rows by the 3rd column and count the records in each group. Below is the expected result:
Sub User cnt
JL 11
TW2JL 13
Write the following SPL statement:
=T@c("data.csv").groups('Sub User';count(1):cnt)
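The T() function parses the csv file, and the @c option makes it read the file in cursor mode, so the whole file is never loaded into memory at once. The groups() function performs the grouping and aggregation: it groups the rows by Sub User and stores count(1) for each group in a column named cnt.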
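For comparison, here is a minimal plain-Java sketch of what the statement above does conceptually: read the file row by row and keep one counter per group. It is not how SPL implements T@c() or groups(). The class name GroupCount and the method countByColumn are hypothetical, and the sketch assumes a header row and simple fields with no quoted commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, for illustration only.
class GroupCount {

    // Reads rows one at a time, so only the per-group counters stay in memory.
    static Map<String, Long> countByColumn(BufferedReader reader, int columnIndex) {
        Map<String, Long> counts = new TreeMap<>(); // sorted by group key
        reader.lines()
              .skip(1)                              // skip the header row
              .filter(line -> !line.isBlank())      // ignore empty lines
              .forEach(line -> {
                  // Assumption: no quoted fields that contain commas.
                  String[] fields = line.split(",", -1);
                  if (fields.length > columnIndex) {
                      counts.merge(fields[columnIndex], 1L, Long::sum);
                  }
              });
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Index 2 is the 3rd column (Sub User).
        try (BufferedReader reader = Files.newBufferedReader(Path.of("data.csv"))) {
            Map<String, Long> counts = countByColumn(reader, 2);
            System.out.println("Sub User,cnt");
            counts.forEach((user, cnt) -> System.out.println(user + "," + cnt));
        }
    }
}
```

Run against the sample rows shown earlier, this counts 11 records for JL and 13 for TW2JL. Memory use grows with the number of distinct groups rather than with the file size, which is the same reason a cursor-style read suits this task.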
Read "How to Call a SPL Script in Java" to learn how to integrate SPL into a Java application.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL