Looking for the Best Report Data Source Hot-Swapping Technique
Reporting applications change often, and frequent changes to the report data source (the data preparation process) are one reason. It is crucial that a reporting application can hot-swap reports, that is, pick up changes without being stopped and restarted. In this essay, we look at several techniques for hot-swapping the report data source and compare them.
SQL
It is fairly easy to achieve hot-swap if the report data source is prepared with SQL in the database. Most reporting tools execute their templates by interpretation. Developers edit the data preparation SQL statement (usually packaged in the report template) and update the template, and the report changes in real time: that is the hot-swap. Stored procedures are more complicated. In most cases, changing a stored procedure makes the database recompile it automatically (at the first access). In special cases, manual compilation is needed and hot-swap cannot be achieved.
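Why an interpreted template hot-swaps so easily can be illustrated with a minimal Java sketch. It assumes the data preparation SQL lives in a plain text file; the class name SqlTemplate and its method are hypothetical and do not belong to any reporting tool. The SQL text is looked up every time the report runs, so an edited file takes effect on the next run without a restart.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

/** Hypothetical helper (an assumption for illustration, not a reporting tool API). */
final class SqlTemplate {
    private final Path file;       // template file that holds the data preparation SQL
    private FileTime loadedStamp;  // modification time of the version currently cached
    private String sql;            // cached SQL text

    SqlTemplate(Path file) {
        this.file = file;
    }

    /** Returns the SQL to run now, rereading the file if it changed since the last call. */
    synchronized String currentSql() {
        try {
            FileTime stamp = Files.getLastModifiedTime(file);
            if (sql == null || !stamp.equals(loadedStamp)) {
                sql = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
                loadedStamp = stamp;
            }
            return sql;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The check relies on the file modification time, so two edits made within the timestamp resolution of the file system could be missed; a content hash would be a more robust (and slower) alternative.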
SQL also has limitations that narrow its applicability. Most databases can only handle internal data (certain databases support external tables, but with many limitations, such as no index support, read-only access, and unsatisfactory performance) and cannot handle scenarios where data from diverse or multiple sources must be computed together. Complicated computations are difficult to code in SQL, and databases differ in how well they support stored procedures; some do not support them at all.
Sometimes, to gain more independent computing ability and to make the application easier to scale and migrate, we avoid doing computations in SQL, particularly in stored procedures. Instead, we move the data processing logic up to the application level, which here means the reporting application discussed in this essay. SQL is therefore suitable for report data source hot-swap only on certain occasions; it is not a universal solution.
Reporting tools
Some reporting tools have data preparation abilities of their own, such as a script data source, which lets developers write scripts for data preparation. The script is generally written in JavaScript (JS) syntax. It is hot-swappable because a modified script takes effect in real time. Yet JS lacks dedicated class libraries for processing structured data, so complicated report data source preparations are hard to implement in it and Java is needed instead.
Take BIRT as an example. The reporting tool provides a script data source type. Users create a script data source, perform setup operations such as introducing a class library and defining variables in the open script, and implement data preparation in the fetch script. The user-defined script can handle data coming from diverse or multiple sources and is thus more open than SQL.
After the data is prepared, we create a data set. The data set is configured manually and must have the same structure as the source table: both the column names and the corresponding data types need to be consistent. This is inconvenient when the data structure is dynamic.
(Figure: Maintain a consistent structure in the data set and the script data source)
You can also introduce a Java class into the script data source and use Java’s computing ability to handle more complicated data preparations. But Java is a compiled language and does not hot-swap readily (we discuss Java hot-swap later).
Although the script data source, unlike the SQL solution, can take care of diverse or multiple data sources, its insufficient computing ability makes it suitable only for specific (simple) scenarios. Moreover, not all reporting tools offer a script data source, so users of the others must turn to different methods to achieve data source hot-swap.
Java
Most reporting tools supply a data source extension interface that lets users define their own data processing logic. Taking the Java interface as an example, we now look at hot-swapping techniques for report data sources.
As a compiled language, Java must be compiled before execution. Java hot-swap has been discussed repeatedly since the JDK 1.4 era and still has no general solution. Here we look at three popular hot-swapping techniques for Java.
Java does not hot-swap by default. A Java class is loaded through a ClassLoader before execution and is uniquely identified by its name within that ClassLoader’s namespace. One ClassLoader cannot load a class with the same identifier twice, so a restart is needed before it can load a new version of the class. To work around this, we define multiple ClassLoaders: monitor the class file for changes, load each new version through a new ClassLoader, put the new version into use, and unload the previous ClassLoader. Tomcat’s dynamic deployment works this way: it monitors the war for changes, calls StandardContext.reload() to load the war through a new WebappClassLoader instance, and then initializes the servlets. OSGi takes the same approach.
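The multiple-ClassLoader approach can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not how Tomcat or OSGi implement it: the class name ReloadingDataSource, the choice of Supplier<String> as the data source contract, and the compile helper (which needs a JDK and exists only so the demo can produce two versions of a class) are all hypothetical. File monitoring is left out; the caller decides when to reload.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

/** Hypothetical sketch: one URLClassLoader per version of a data source class. */
final class ReloadingDataSource {
    private final Path classDir;     // directory holding the compiled class, outside the app classpath
    private final String className;  // assumed to be public and to implement Supplier<String>
    private URLClassLoader current;  // loader of the version currently in use

    ReloadingDataSource(Path classDir, String className) {
        this.classDir = classDir;
        this.className = className;
    }

    /** Loads the class through a fresh loader, then closes the previous loader. */
    @SuppressWarnings("unchecked")
    synchronized Supplier<String> reload() {
        try {
            URL[] urls = { classDir.toUri().toURL() };
            URLClassLoader fresh =
                    new URLClassLoader(urls, ReloadingDataSource.class.getClassLoader());
            Supplier<String> source = (Supplier<String>) Class.forName(className, true, fresh)
                    .getDeclaredConstructor().newInstance();
            if (current != null) {
                current.close(); // the old class can be collected once nothing references it
            }
            current = fresh;
            return source;
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load " + className, e);
        }
    }

    /** Demo helper: compiles one version of the source into classDir (requires a JDK). */
    static void compile(Path classDir, String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("a JDK (not a JRE) is required");
        }
        try {
            Path file = classDir.resolve(className + ".java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));
            int status = compiler.run(null, null, null, "-d", classDir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed for " + className);
            }
        } catch (IOException e) {
            throw new IllegalStateException("cannot write source for " + className, e);
        }
    }
}
```

Each call to reload() yields an object whose class belongs to a different loader, which is why the same class name can be loaded again. The sketch also hints at the cost: objects of the old version stay alive as long as anything references them, and the application has to manage the switch itself.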
The second way to achieve Java hot-swap is to use JRebel or spring-loaded. This type of technique loads classes through a ClassLoader and modifies their bytecode through bytecode instrumentation. That is, the framework’s runtime replaces the already loaded class, and hot-swap is achieved. The problem is that JRebel is not convenient to deploy and use (see https://www.jrebel.com/).
The third solution changes the abstraction at the VM level so that the system supports dynamic classes at the root. It uses DCEVM (Dynamic Code Evolution VM), a patch for the Java HotSpot VM that allows loaded classes to be redefined without restriction to achieve hot-swap. The hot-swap agent has a rather high user threshold (refer to https://github.com/dcevm/dcevm for details).
Overall, achieving Java hot-swap is complicated, whether through a self-written hot-loading program or through third-party tools. Such a heavy hot-swapping process is out of proportion to the data preparation task and is a burden on developers. Fortunately, there is a simpler and more convenient alternative.
esProc SPL
esProc is an open-source, lightweight computing engine specializing in structured data computation. It offers a rich set of computing class libraries and sits between the data source and the report to handle complicated report data source preparations. Its role and place in the report development architecture are similar to those of the script data source and the Java user-defined data source. It supports a variety of data sources (including RDB, NoSQL, JSON, CSV, and web services) and can therefore perform mixed computations across them.
esProc is based on SPL (Structured Process Language), an independent computing syntax that supports a stepwise data preparation process. Because SPL is executed by interpretation, it is well suited to report data source hot-swap.
SPL has succinct syntax, as the following computation shows:
We want to find the stocks that have risen for at least five consecutive days, count their consecutive rising days (an unchanged price counts as rising), and display the results in a report.
Below is the SPL script (stock.dfx):
|   | A |   |
|---|---|---|
| 1 | =connect@l("orcl").query@x("select * from stock_record order by ddate") |   |
| 2 | =A1.group(code) |   |
| 3 | =A2.new(code,~.group@i(price<price[-1]).max(~.len())-1:maxrisedays) | Count the consecutive rising dates for each stock |
| 4 | =A3.select(maxrisedays>=5) | Get the eligible records |
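For comparison, the same logic can be sketched in plain Java. This is only an illustration of what the script computes, under assumptions: the class RisingStocks and its Quote type are hypothetical, the rows are assumed to arrive already ordered by date as in the query, and only the code and price columns are modeled. A run restarts whenever the price falls below the previous one, and the number of rising days in a run is its length minus one.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical Java counterpart of the SPL script; names are assumptions. */
final class RisingStocks {
    /** One row of stock_record, reduced to the two columns the computation needs. */
    static final class Quote {
        final String code;
        final double price;

        Quote(String code, double price) {
            this.code = code;
            this.price = price;
        }
    }

    /**
     * Returns, for each stock, its longest streak of non-falling days,
     * keeping only the stocks whose streak reaches minDays.
     * The input must already be ordered by date.
     */
    static Map<String, Integer> risingAtLeast(List<Quote> quotesByDate, int minDays) {
        Map<String, Double> lastPrice = new HashMap<>();
        Map<String, Integer> currentRun = new HashMap<>();
        Map<String, Integer> longestRun = new LinkedHashMap<>();
        for (Quote q : quotesByDate) {
            Double previous = lastPrice.put(q.code, q.price);
            // An equal price counts as rising; a lower price starts a new run.
            int run = (previous != null && q.price >= previous)
                    ? currentRun.get(q.code) + 1
                    : 0;
            currentRun.put(q.code, run);
            longestRun.merge(q.code, run, Math::max);
        }
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : longestRun.entrySet()) {
            if (e.getValue() >= minDays) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```

Calling RisingStocks.risingAtLeast(rows, 5) corresponds to the final selection step of the script. The Java version has to keep the per-stock state by hand, which is the kind of bookkeeping the grouping functions in SPL hide.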
It is easy to integrate the esProc SPL script into the reporting tool. Users just need to introduce the following three jars:
esproc-bin-xxxx.jar // esProc computing engine and JDBC driver jar
icu4j-60.3.jar //Handle internationalization
jdom-1.1.3.jar //Parse configuration file
Then establish an esProc JDBC data source connection with the following settings:
JDBC Driver: com.esproc.jdbc.InternalDriver
JDBC URL: jdbc:esproc:local://
Finally, we call the SPL script from the reporting tool’s data set, in the same way a stored procedure is called (passing in parameters is supported):
{call stock()}
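Outside a reporting tool, the same call can be tried from plain Java through standard JDBC. The sketch below is an assumption-laden illustration: the class EsprocReportData is hypothetical, only the driver class and URL come from the settings above, and it presumes that the three jars are on the classpath and that the driver accepts the call text exactly as the reporting tool data set does.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: runs an SPL script through the esProc JDBC driver. */
final class EsprocReportData {
    static final String DRIVER = "com.esproc.jdbc.InternalDriver"; // from the settings above
    static final String URL = "jdbc:esproc:local://";              // from the settings above

    /** Executes a call such as "{call stock()}" and returns the rows as ordered maps. */
    static List<Map<String, Object>> call(String callSql)
            throws SQLException, ClassNotFoundException {
        Class.forName(DRIVER); // fails unless the esProc jars are on the classpath
        List<Map<String, Object>> rows = new ArrayList<>();
        try (Connection con = DriverManager.getConnection(URL);
             CallableStatement st = con.prepareCall(callSql)) {
            st.execute();
            try (ResultSet rs = st.getResultSet()) {
                if (rs == null) {
                    return rows; // the script returned no result set
                }
                ResultSetMetaData meta = rs.getMetaData();
                while (rs.next()) {
                    Map<String, Object> row = new LinkedHashMap<>();
                    for (int i = 1; i <= meta.getColumnCount(); i++) {
                        row.put(meta.getColumnLabel(i), rs.getObject(i));
                    }
                    rows.add(row);
                }
            }
        }
        return rows;
    }
}
```

Because the script file is interpreted at each call, editing stock.dfx should change what the next call returns without recompiling or restarting the Java side, which is the hot-swap behavior this essay is after.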
As application architecture grows more complicated, the database becomes more of a data store than a computing engine, and this affects every hot-swapping technique discussed here. In a nutshell, SQL (including stored procedures) is general but narrowly applicable. So is the script data source, because not all reporting tools offer it and its computing ability is limited. Java is the most universal choice, as it can handle every data computing scenario. Yet hot-swap is hard to implement in this compiled high-level language, and coding data preparation in it is laborious because it lacks class libraries for set-oriented operations. esProc SPL is executed by interpretation and so supports hot-swap naturally. Its syntax is succinct, its computing capability for report data source preparation matches Java’s, and its user threshold is low. esProc is therefore the most general and convenient tool for achieving report data source hot-swap.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL