How to Count Distinct Users in esProc SPL?

 

Hi everyone,

I am working with esProc SPL and SQL to handle complex data processing tasks. I have a dataset from a weblogs table and I need to count the distinct users who visited specific pages within a certain date range.

userID logDate page
0070e9c1-0815-4151-8f9c-091176d223fc 14/01/2023 blog
00819d78-7688-4fd2-979c-71537abae9c8 21/01/2023 contact
00b3022a-6e4e-44d8-aded-34dd106f0fc9 26/01/2023 faq
00c66ad5-ce83-432b-af2f-2c07b35eb61b 20/01/2023 support
00cd9ca9-c57f-4fe6-80a4-71a49b19506c 29/01/2023 support
01880a65-f040-429e-9a28-99f651b2d6ff 22/01/2023 services
019579a3-4758-4552-97ce-98423f66da22 31/01/2023 products
01ae2eb6-59e6-4a3b-ae86-51ad12529325 04/01/2023 products
01cabbcf-4dab-4773-97b4-04c735f8c782 25/01/2023 terms
01cc5956-5436-47cb-b02c-36c8f9139530 19/01/2023 privacy
021dfab6-de8a-45cc-831b-bf36774fa117 17/01/2023 terms
02289b5a-2d79-4cce-821f-7cbbe8181048 25/01/2023 home

SQL Approach:

Copy code
SELECT page, COUNT(DISTINCT userID) AS unique_visitors
FROM weblogs
WHERE logDate BETWEEN '2023-01-01' AND '2023-01-31'
GROUP BY page;

This SQL query filters the logs by date, groups the results by page, and counts the distinct user IDs for each page. I am trying to achieve the same result using esProc SPL.

imagepng

My current esProc SPL code:

= connect("com.mysql.jdbc.Driver", "jdbc:mysql://127.0.0.1:3306/world?user=root&password=5683")
= A1.query("SELECT * FROM `weblogs`")
= A2.select(date(logDate) >= date("2023-01-01") && date(logDate) <= date("2023-01-31"))
= A3.group(page; ~.userID.distinct().count():unique_visitors)

However, I’m not sure if this is the correct way to count distinct users for each page. Can someone help me verify or correct this code?

imagepng

Thanks in advance for your help!