mars RaqForum 25 No.
269 View • 2 Years ago

6.18 Order-based grouping: by the neighboring condition – big data

Evaluate a conditional expression on each record of a big data table, in order, and start a new group whenever the result is true.
We have a large log file whose records are written in datetime order. The task is to find the date with the longest run of consecutive ERROR-level records.

Date	Time	Level	IP	…
2020/1/1	0:00:01	INFO	166.253.153.234	…
2020/1/1	0:00:02	INFO	99.72.133.239	…
2020/1/1	0:00:04	WARN	99.11.105.39	…
2020/1/1	0:00:05	INFO	117.69.80.195	…
2020/1/1	0:00:11	INFO	79.195.137.228	…
…	…	…	…	…

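SPL provides the @i option for the cs.group() function to group a huge number of records through a cursor. With @i, the grouping expression is a condition evaluated on each record against its neighbor, and a new group starts whenever the condition is true; here, that is whenever Date or Level differs from the previous record.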
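To make the grouping rule concrete, here is a minimal Python sketch of condition-triggered grouping over an ordered stream. It is not SPL and not how cs.group() is implemented; the helper name `group_when` and the sample rows are made up for illustration.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def group_when(
    records: Iterable[T],
    starts_new_group: Callable[[T, T], bool],
) -> Iterator[List[T]]:
    """Yield runs of adjacent records.

    A new run starts whenever starts_new_group(previous, current) is true.
    Only the current run is held in memory, so the input can be a stream.
    """
    current: List[T] = []
    for rec in records:
        if current and starts_new_group(current[-1], rec):
            yield current
            current = []
        current.append(rec)
    if current:
        yield current


# Made-up (Date, Level) pairs, already in log order.
sample = [
    ("2020/1/1", "INFO"),
    ("2020/1/1", "ERROR"),
    ("2020/1/1", "ERROR"),
    ("2020/1/2", "ERROR"),
    ("2020/1/2", "INFO"),
]

# New group whenever Date or Level differs from the previous record.
runs = list(group_when(sample, lambda prev, cur: prev != cur))
run_lengths = [(run[0][0], run[0][1], len(run)) for run in runs]
print(run_lengths)
```

Note that the two ERROR records on 2020/1/1 and the one on 2020/1/2 land in different groups, because the date changes between them even though the level does not.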

SPL script:

	A
1	=file("ServerLog.txt").cursor@t()
2	=A1.group@i(Date[-1]!=Date||Level[-1]!=Level;Date,Level,count(~):ErrorCount)
3	=A2.select(Level=="ERROR")
4	=A3.top(-1;ErrorCount)

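A1 Create a cursor on the log file; the @t option reads the first line as field names.
A2 Use the @i option of the cs.group() function to start a new group whenever the condition is true, that is, whenever Date or Level differs from the previous record, and count the records in each group.
A3 Select the groups whose log level is ERROR.
A4 Get the group with the largest number of consecutive ERROR records.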
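For readers more familiar with Python, the four steps can be approximated as below. This is a rough sketch under stated assumptions, not a translation of the SPL engine: it assumes a tab-separated file with a header row containing `Date` and `Level`, the function name `longest_error_run` is hypothetical, and the demo rows are invented.

```python
import csv
import io
from itertools import groupby
from typing import Optional, TextIO, Tuple


def longest_error_run(log: TextIO) -> Optional[Tuple[str, int]]:
    """Return (date, count) for the longest run of consecutive ERROR rows.

    Returns None when the log contains no ERROR rows.
    """
    # Rows are read lazily, one at a time, similar in spirit to a cursor (A1).
    reader = csv.DictReader(log, delimiter="\t")
    best: Optional[Tuple[str, int]] = None
    # groupby merges only adjacent rows with the same key (A2).
    for (date, level), rows in groupby(
        reader, key=lambda r: (r["Date"], r["Level"])
    ):
        count = sum(1 for _ in rows)
        # Keep ERROR groups only (A3) and track the largest one (A4).
        if level == "ERROR" and (best is None or count > best[1]):
            best = (date, count)
    return best


# Invented rows for illustration; a real run would pass an open file instead.
demo = io.StringIO(
    "Date\tTime\tLevel\tIP\n"
    "2020/1/1\t0:00:01\tINFO\t192.0.2.1\n"
    "2020/1/1\t0:00:02\tERROR\t192.0.2.2\n"
    "2020/1/1\t0:00:03\tERROR\t192.0.2.3\n"
    "2020/1/1\t0:00:04\tWARN\t192.0.2.4\n"
    "2020/1/2\t0:00:01\tERROR\t192.0.2.5\n"
    "2020/1/2\t0:00:02\tERROR\t192.0.2.6\n"
    "2020/1/2\t0:00:03\tERROR\t192.0.2.7\n"
    "2020/1/2\t0:00:04\tINFO\t192.0.2.8\n"
)
result = longest_error_run(demo)
print(result)
```

Because only one group is counted at a time, memory use stays constant regardless of file size, which is the same reason the SPL script works on a cursor rather than loading the whole file.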

Execution result:

Date	ErrorCount
2020/01/02	4

