Which open-source package is best for parsing and processing JSON in Java?
Java has many JSON libraries. Most open-source packages, such as Jackson, GSON, Genson, FastJson, and org.json, deserialize the whole JSON document into Java objects and then read the property values of interest from those objects. A second kind of library, such as JsonPath, provides a DSL: Java code passes in an XPath-like path expression that traverses the JSON document, and the path can end with a function such as maximum, average, or sum to aggregate the matched values. If the requirement is to compute over JSON data, Open-esProc is more convenient. Open-esProc differs from typical Java packages, though: it wraps its data types and calculation methods in a scripting language called SPL, and a Java program calls the SPL script and gets a ResultSet object back.
As a simple example, the file EO.json stores a batch of employee records, each with the orders that belong to that employee. Part of the data looks like this:
[{
"_id": {"$oid": "6074f6c7e85e8d46400dc4a7"},
"EId": 7,"State": "Illinois","Dept": "Sales","Name":"Alexis",
"Gender": "F","Salary": 9000,"Birthday": "1972-08-16",
"Orders": [
{"OrderID": 70,"Client": "DSG","SellerId":7,
"Amount": 288,"OrderDate": "2009-09-30"},
{"OrderID": 131,"Client": "FOL","SellerId":7,
"Amount": 103.2,"OrderDate": "2009-12-10"}
]
},
{
"_id": {"$oid": "6074f6c7e85e8d46400dc4a8"},
"EId": 8,"State": "California", ...
}]
An SPL script handles a conditional query as follows:
| | A |
|---|---|
| 1 | =json(file("D:\\data\\EO.json").read()) |
| 2 | =A1.conj(Orders) |
| 3 | =A2.select(Amount>500 && Amount<=2000 && like@c(Client,"*bro*")) |
SPL reads the JSON data in as a multilevel table sequence object, concatenates all orders with the conj function, and performs the conditional query with the select function.
The SPL script can be debugged or executed in the esProc IDE and saved as a script file (such as condition.dfx), which a Java program then invokes through the JDBC interface. The invocation code is as follows:
package Test;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class test1 {
    public static void main(String[] args) throws Exception {
        // Register the esProc JDBC driver.
        Class.forName("com.esproc.jdbc.InternalDriver");
        // try-with-resources closes the result set, statement, and connection,
        // even if the query or the printing step throws.
        try (Connection connection = DriverManager.getConnection("jdbc:esproc:local://");
             Statement statement = connection.createStatement();
             // Run the script file condition.dfx, much like calling a stored procedure.
             ResultSet result = statement.executeQuery("call condition()")) {
            printResult(result);
        }
    }
    …
}
This works much like calling a stored procedure. Alternatively, the SPL code can be embedded directly in the Java program, the way a SQL statement would be, so no script file is needed. The embedded form looks like this:
…
ResultSet result = statement.executeQuery(
    "=json(file(\"D:\\data\\EO.json\").read())"
    + ".conj(Orders).select(Amount>500 && Amount<=2000 && like@c(Client,\"*bro*\"))");
…
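For comparison, the conj and select steps of the conditional query can be approximated in plain Java with streams. This is only a sketch under stated assumptions: the Order and Employee records, the class and method names, and the employee with EId 99 are hypothetical and exist only for illustration, and like@c is assumed to be a case-insensitive wildcard match, so "*bro*" is treated as a case-insensitive substring test. It requires Java 16 or later for records and leaves JSON parsing out entirely.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class ConjSelectSketch {
    // Hypothetical Java mirrors of the structures in EO.json (subset of fields).
    record Order(int orderId, String client, int sellerId, double amount, String orderDate) {}
    record Employee(int eId, String name, List<Order> orders) {}

    // Rough counterpart of A1.conj(Orders): concatenate every employee's orders.
    static List<Order> conj(List<Employee> employees) {
        return employees.stream()
                .flatMap(e -> e.orders().stream())
                .collect(Collectors.toList());
    }

    // Rough counterpart of select(Amount>min && Amount<=max && like@c(Client,"*fragment*")).
    // Assumption: like@c is a case-insensitive match.
    static List<Order> select(List<Order> orders, double minExclusive,
                              double maxInclusive, String fragment) {
        String needle = fragment.toLowerCase(Locale.ROOT);
        return orders.stream()
                .filter(o -> o.amount() > minExclusive && o.amount() <= maxInclusive)
                .filter(o -> o.client().toLowerCase(Locale.ROOT).contains(needle))
                .collect(Collectors.toList());
    }

    static List<Employee> sampleEmployees() {
        return List.of(
            // Taken from the EO.json excerpt.
            new Employee(7, "Alexis", List.of(
                new Order(70, "DSG", 7, 288, "2009-09-30"),
                new Order(131, "FOL", 7, 103.2, "2009-12-10"))),
            // Hypothetical rows, invented so that the filter has something to match.
            new Employee(99, "Example", List.of(
                new Order(900, "BroadStar", 99, 1200, "2010-01-15"),
                new Order(901, "Ambrosia", 99, 450, "2010-02-01"))));
    }

    public static void main(String[] args) {
        List<Order> matches = select(conj(sampleEmployees()), 500, 2000, "bro");
        matches.forEach(System.out::println);
    }
}
```

With this sample data only the hypothetical order 900 passes both conditions: the two orders from the excerpt fall outside the amount range, and order 901 matches the client pattern but not the amount range.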
SPL handles grouping and aggregation, as well as joins, as follows:
| | A | B |
|---|---|---|
| 1 | =json(file("D:\\data\\EO.json").read()) | |
| 2 | =A1.conj(Orders) | |
| 3 | =A2.select(Amount>1000 && Amount<=3000 && like@c(Client,"*s*")) | /Conditional query |
| 4 | =A2.groups(year(OrderDate);sum(Amount)) | /Grouping & aggregation |
| 5 | =A1.new(Name,Gender,Dept,Orders.OrderID,Orders.Client,Orders.SellerId,Orders.Amount,Orders.OrderDate) | /Join operation |
As the code above shows, SPL's syntax is expressive enough to handle these common operations in concise, readable code that is easy to integrate into a Java program. Its operators can also reference fields of multilevel data directly during a join, which shortens the code further.
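To make the grouping and join steps concrete, here is a rough plain-Java counterpart built on streams. It is a sketch, not a description of how SPL works internally: the records, the class and method names, and the second employee with its 2010 order are hypothetical, and only a subset of the fields is modelled. It requires Java 16 or later.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupJoinSketch {
    // Hypothetical mirrors of the EO.json structures (subset of fields).
    record Order(int orderId, String client, double amount, String orderDate) {}
    record Employee(String name, String gender, String dept, List<Order> orders) {}
    // One flat row per (employee, order) pair, similar in spirit to the A5 result.
    record EmployeeOrder(String name, String gender, String dept,
                         int orderId, String client, double amount, String orderDate) {}

    // Rough counterpart of groups(year(OrderDate); sum(Amount)), sorted by year.
    static Map<Integer, Double> totalByYear(List<Employee> employees) {
        return employees.stream()
                .flatMap(e -> e.orders().stream())
                .collect(Collectors.groupingBy(
                        o -> LocalDate.parse(o.orderDate()).getYear(),
                        TreeMap::new,
                        Collectors.summingDouble(Order::amount)));
    }

    // Pair every order with the fields of the employee that owns it.
    static List<EmployeeOrder> flatten(List<Employee> employees) {
        return employees.stream()
                .flatMap(e -> e.orders().stream().map(o -> new EmployeeOrder(
                        e.name(), e.gender(), e.dept(),
                        o.orderId(), o.client(), o.amount(), o.orderDate())))
                .collect(Collectors.toList());
    }

    static List<Employee> sampleEmployees() {
        return List.of(
            // Taken from the EO.json excerpt.
            new Employee("Alexis", "F", "Sales", List.of(
                new Order(70, "DSG", 288, "2009-09-30"),
                new Order(131, "FOL", 103.2, "2009-12-10"))),
            // Hypothetical employee and order, added so that there is a second year.
            new Employee("Example", "M", "R&D", List.of(
                new Order(900, "XYZ", 1500, "2010-03-01"))));
    }

    public static void main(String[] args) {
        totalByYear(sampleEmployees())
                .forEach((year, total) -> System.out.println(year + ": " + total));
        flatten(sampleEmployees()).forEach(System.out::println);
    }
}
```

The Java version needs an explicit row type and an explicit flatMap to pair each order with its employee, which is the work that the single new(...) expression in cell A5 expresses by referencing Orders fields directly.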
The same expressiveness simplifies computations on multilevel JSON data. As an example, in JSONstr.json the runners field is a subdocument with three fields: horseId, ownerColours, and trainer. The trainer field has a subfield, trainerId, and ownerColours holds a comma-separated list of values. Part of the source data is shown below:
[
{
"race": {
"raceId":"1.33.1141109.2",
"meetingId":"1.33.1141109"
},
...
"numberOfRunners": 2,
"runners": [
{ "horseId":"1.00387464",
"trainer": {
"trainerId":"1.00034060"
},
"ownerColours":"Maroon,pink,dark blue."
},
{ "horseId":"1.00373620",
"trainer": {
"trainerId":"1.00010997"
},
"ownerColours":"Black,Maroon,green,pink."
}
]
},
...
]
The task is to group the data by trainerId and count the members of ownerColours in each group. The SPL script for this is:
| | A |
|---|---|
| 1 | =json(file("/workspace/JSONstr.json").read()) |
| 2 | =A1(1).runners |
| 3 | =A2.groups(trainer.trainerId; ownerColours.array().count():times) |
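The same task can be sketched in plain Java to show what the grouping computes. The Runner record and the class and method names are hypothetical, and the sketch assumes that the colour count is the number of comma-separated entries, summed over all runners of a trainer; the trailing period stays attached to the last colour and does not affect the count. It requires Java 16 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class TrainerColoursSketch {
    // Hypothetical flat view of one entry of the runners array.
    record Runner(String horseId, String trainerId, String ownerColours) {}

    // Count the non-empty comma-separated entries in an ownerColours string.
    static int countColours(String ownerColours) {
        return (int) Arrays.stream(ownerColours.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .count();
    }

    // Group runners by trainerId and add up the colour counts per trainer.
    static Map<String, Integer> coloursPerTrainer(List<Runner> runners) {
        return runners.stream().collect(Collectors.groupingBy(
                Runner::trainerId,
                TreeMap::new,
                Collectors.summingInt(r -> countColours(r.ownerColours()))));
    }

    // The two runners from the JSONstr.json excerpt.
    static List<Runner> sampleRunners() {
        return List.of(
            new Runner("1.00387464", "1.00034060", "Maroon,pink,dark blue."),
            new Runner("1.00373620", "1.00010997", "Black,Maroon,green,pink."));
    }

    public static void main(String[] args) {
        coloursPerTrainer(sampleRunners())
                .forEach((trainer, times) -> System.out.println(trainer + " -> " + times));
    }
}
```

For the two runners in the excerpt this prints `1.00010997 -> 4` followed by `1.00034060 -> 3`.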
SPL supports a wide range of data sources. It has dedicated functions for retrieving JSON data from files, MongoDB, Elasticsearch, WebService, and other sources.
Reading and writing JSON data is one of SPL's basic features, so no special deployment is needed (unless the data comes from certain data sources, such as MongoDB).
To connect to MongoDB and compute on JSON data, see "How to perform SQL-like queries on MongoDB in Java?"
For more JSON calculation examples, see "Json data calculation and importing into database.pdf".
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL