In addition to POI, what open source package is suitable for processing Excel data in Java?
Java programmers sometimes have to programmatically perform tasks of reading or writing Excel, usually by writing some custom programs to handle them. The open-source Apache POI package is a very useful tool for completing such tasks. POI provides two modes, HSSF and XSSF, to read, create, modify, and write XLS tables. If user requirements involve reading and changing the details of Excel objects, such as batch modification of reading formulas, HSSF can be used to complete detailed manipulation of Excel. If user requirements are more concerned about the efficiency of data reading, the XSSF mode is more efficient. Although POI is powerful, the learning curve is not very flat. For example, suppose you encounter complex file formats, such as merged grids, complex multi-line headers, footers, data records scattered across multiple lines, cross-tabs, etc. The program code to read the data will become longer and more complicated.
Even with a powerful open-source package such as POI, Java is still very troublesome to parse Excel. Usually, after parsing Excel, the data is imported into the database, and the application program then uses the database to complete subsequent calculations. For scenarios that require repeated use of Excel data for calculation, importing a database is cost-effective, but often importing a database is only for calculating data using SQL. In this case, you may wish to use an open-source library that can parse Excel data and do structured calculations. Open-esProc is this type of open-source package, but Open-esProc is different from general Java packages. It encapsulates the data types and calculation methods in a scripting language called SPL and then calls the SPL script in the Java program and returns the ResultSet object.
Take a simple example to illustrate, first use SPL to read the Excel file, and then do a conditional query.
A |
|
1 |
=file("simple.xls").xlsimport@t() |
2 |
=A1.select(amount>500 && amount<=2000) |
The option @t means that the first row is the column heading, and then the select function is used to complete the conditional query.
This code can be debugged and executed in esProc IDE, then saved as a script file (such as condition.dfx), and called in JAVA through the JDBC. The specific code is as follows:
package Test;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
public class test1 {
public static void main(String[] args)throws Exception {
Class.forName("com.esproc.jdbc.InternalDriver");
Connection connection =DriverManager.getConnection("jdbc:esproc:local://");
Statement statement = connection.createStatement();
ResultSet result = statement.executeQuery("call condition()");
printResult(result);
if(connection != null) connection.close();
}
…
}
The above usage is similar to the stored procedure. SPL also supports the use similar to SQL, that is, without script files, directly embed SPL script in JAVA. The code is as follows:
…
ResultSet result = statement.executeQuery("
=file(\"D:\\data\\simple.xls\").xlsimport@t().select(amount>500 && amount<=3000)");
…
In fact, Open-esProc also encapsulates Apache POI but provides a way to read Excel files more easily. Through SPL professional data processing syntax, not only can read and write Excel very concisely and perfectly support various subsequent calculations. In addition, the SPL calculation process is independent, and it is very convenient to change when the requirement changes.
To give two more examples, sometimes the header format of an Excel file is not simple and often consists of multiple lines. When parsing a file in this format, you need to skip the complicated header and specify to start reading from the position of the data line. Then select the column name of each column of the structured data. For example: In the project cost file itemPrices.xlsx, calculate the total cost of the project, and some of the data are as follows:
A |
Annotations |
|
1 |
=file("e:/excel/itemPrices.xlsx").xlsimport(;1,5) |
Parameter "1,5" means to read the firstsheet, start reading from line 5 and continue to sheet end |
2 |
=A1.rename(#1:No,#2:ItemCode,#3:ItemName,#4:Unit,#5:Quantity,#6:Price,# 7:Sum) |
Edit the column name of each column |
3 |
=A2.sum(Sum) |
Calculate the total cost of the project |
For another example, the data in two data tables need to be associated and calculated. The sales order information and product information are stored in two Excel files, respectively, to calculate the sales of each order and export the results. The data structure of the two files is as follows:
SPL can be directly calculated and exported:
A |
|
1 |
=T(“e:/orders/sales.xlsx”) |
2 |
=T(“e:/orders/product.xlsx”).keys(ID) |
3 |
=A1.join(ProductID,A2,Name,Price) |
4 |
=A3.derive(Quantity*Price:amount) |
5 |
=file("D:/orders/amount.xlsx").xlsexport(A4) |
For more examples of using SPL to process Excel, you can refer to:
Sample Programs of Structuralizing Excel Files
Samples of Generating Various Excel Files
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL