Group & Summarize a Big Excel File Every N Rows
Title: Python / Excel 15 Minute Timestamps to 1 Hour and Sum Vlues
【Question】
I have a Dataset with 15 Minute Timestamps and Values
I need it hourly like this (summed values all 4 rows):
How to do this?
My Dataset has 127000 rows…3 Years, so manually is not an option.
Thanks for your help.
【Answer】
Instead of grouping rows by same column values, you give a special grouping condition. You want to group data every fixed number of rows (N) and then perform aggregation.
Excel doesn’t have a direct method to do this. It uses the combination of offset function and index function or the like. But those functions are irrelevant with handling fixed number of rows. So the solution is hard to understand.
I noticed that you want to do it in Python. As both the source Excel file and the result Excel file are big, they may not fit into the memory. It would be the best to process the file in a circular way. You read 4N rows each time rather than the whole data at once, group and summarize the rows, append the result to the target result Excel, and then read in the next 4N rows to process. But Python hasn’t a direct external computation mechanism. It’s complex to write code manually.
esProc SPL is also able to process Excel data. It is particularly good at handling various unusual grouping scenarios. It also provides the cursor mechanism to produce simple code. For your question, the SPL script is like this:
A |
B |
C |
|
1 |
=file("D:\data.xlsx").xlsimport@c() |
||
2 |
for A1,4*1000 |
=A2.group@i(#%4==1).new(~(4)._1,~.sum(_2)) |
/ Read in and process 4000 rows at a time |
3 |
=file("D:\new.xlsx").xlsexport@a(B2) |
/ Append result to the target Excel file |
SPL is good at handling operations on xls or csv files. Find more examples and learn more about SPL in http://www.raqsoft.com/p/script-over-csv-xls.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_Desktop/
SPL Learning Material 👉 https://c.scudata.com
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProcDesktop
Linkedin Group 👉 https://www.linkedin.com/groups/14419406/