Use Java to perform wordcount
The CSV file below has three columns:
title,text,date
exampleTitle,This is is is an example example, April 2022
The 2nd Title,This this is an an example 4 K k,April 2023
The task, to be done in Java: for each row, first output the 1st column, then split the 2nd column into words and output how many times each word appears. Numbers and single letters are not counted, and words that differ only in case are treated as the same word. Below is the expected result:
exampleTitle
an 1
example 2
is 3
this 1
The 2nd Title
an 2
example 1
is 1
this 2
Write the following SPL script:
|   | A | B |
|---|---|---|
| 1 | for T("data.csv") | >output(A1.title) |
| 2 |   | =A1.text.words().(lower(~)) |
| 3 |   | =B2.groups(~;count(1)) |
| 4 |   | =B3.select(len(#1)>1) |
| 5 |   | >output(B4.export()) |
A1: Parse the CSV file as a two-dimensional table and loop through each row; B1 outputs the title of the current row.
B2: Split the text column to get words and convert them to lowercase.
B3: Count the appearances of every word.
B4: Get words having more than one character.
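For comparison, the same steps can be sketched in plain Java. This is only an illustration under a few assumptions that are not in the original: the class name `WordCount` and the helpers `countWords` and `report` are hypothetical, a "word" is taken to be any run of ASCII letters, and the CSV is assumed to contain no quoted fields or commas inside the text column. The sample rows are embedded so that the sketch runs without a file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical class name; a sketch, not the author's implementation.
class WordCount {

    // Mirrors B2-B4: split into words, lowercase them, drop single letters, count.
    // Assumption: a word is a run of ASCII letters, so digits act as separators.
    // TreeMap keeps the words in alphabetical order.
    static Map<String, Long> countWords(String text) {
        return Arrays.stream(text.split("[^A-Za-z]+"))
                .map(String::toLowerCase)
                .filter(w -> w.length() > 1)
                .collect(Collectors.groupingBy(w -> w, TreeMap::new, Collectors.counting()));
    }

    // Mirrors A1, B1 and B5: loop over the data rows, output the title,
    // then output one "word count" line per word.
    // Assumption: no quoted fields, so a plain split on commas is enough.
    static String report(List<String> lines) {
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < lines.size(); i++) {      // index 0 is the header row
            String[] cols = lines.get(i).split(",", 3);
            if (cols.length < 2) {
                continue;                             // skip malformed rows
            }
            out.append(cols[0]).append('\n');
            countWords(cols[1]).forEach(
                    (word, n) -> out.append(word).append(' ').append(n).append('\n'));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample rows copied from the CSV shown earlier.
        List<String> sample = List.of(
                "title,text,date",
                "exampleTitle,This is is is an example example, April 2022",
                "The 2nd Title,This this is an an example 4 K k,April 2023");
        System.out.print(report(sample));
    }
}
```

To run it against the real file, the sample list could be replaced with `Files.readAllLines(Path.of("data.csv"))`. A CSV with quoted fields would need a proper CSV parser instead of `split`.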
Read "How to Call a SPL Script in Java" to learn how to integrate SPL into a Java application.
Source: https://stackoverflow.com/questions/71804040/how-do-i-count-word-occurrences-in-a-csv-file