SQL Query over File Examples
【Abstract】
This article illustrates scenarios of querying files directly in SQL and provides sample programs written with esProc SPL.
It’s convenient and efficient to manipulate database data in SQL. It’s natural to load data into the database and handle it in SQL when we trying to compute files. The problem is that data loading itself is not simple. It would be extremely convenient if there was a way to perform SQL query directly on files. Here I’ll bring in such a tool, esProc SPL, list various SQL file query scenarios and offers esProc example programs. esProc is a specialized data computing engine, and SPL, abbreviation of Structured Process Language used by esProc, boasts a complete set of SQL file query methods.
This article uses text files in all examples. In fact the methods also apply in Excel files.
1.Filtering
Select records from a text file on a specified condition in SQL.
Example: Select scores of students in class 10 from students score table Students_scores.txt. In the file, the first row contains column names and detailed data begins from the second row:
A |
|
1 |
$select * from E:/txt/Students_scores.txt where CLASS=10 |
2. Aggregation
Summarize data in a text file in SQL.
Example: Calculate the average Chinese score, the highest math score, and the total of English scores based on the student scores table.
A |
|
1 |
$select avg(Chinese),max(Math),sum(English) from E:/txt/Students_scores.txt |
3. Inter-column calculations
Perform inter-column calculations in a text file in SQL.
Example: Calculate the total score of each student in the student score table.
A |
|
1 |
$select *,English+Chinese+Math as total_score from E:/txt/students_scores.txt |
Below is A1’s result, where a computed column total_score is added.
4.CASE statement
We can perform complicated conditional queries in SQL using CASE statement.
Example: Find whether or not the English result of each student in the student scores table is a Pass.
A |
|
1 |
$select *, case when English>=60 then 'Pass' else 'Fail' end as English_evaluation from E:/txt/students_scores.txt |
Below is A1’s result, where a computed column English_evaluation is added.
5.Sorting
Sort data in a text file in ascending (or descending) order in SQL.
Example: Sort the student scores table by class in ascending order and by total score in descending order.
A |
|
1 |
$select * from E:/txt/students_scores.txt order by CLASS,English+Chinese+Math desc |
6.TOP-N
Get Top-N over a text file in SQL.
Example: Get the records of the three students who have the highest English scores.
A |
|
1 |
$select top 3 * from E:/txt/students_scores.txt order by English desc |
7. Grouping & aggregation
Perform grouping & aggregation on data in a text file in SQL.
Example: Find the lowest English score, the highest Chinese score and the total math score in each class.
A |
|
1 |
$select CLASS,min(English),max(Chinese),sum(Math) from E:/txt/students_scores.txt group by CLASS |
8. Post-grouping filtering
Group and summarize data in a text file and then perform filtering in SQL.
Example: Find the class where the average English score is below 70.
A |
|
1 |
$select CLASS,avg(English) as avg_En from E:/txt/students_scores.txt group by CLASS having avg(English)<70 |
A1’s result.
9. Distinct
Perform distinct operation over data in a text file in SQL.
Example: Get the IDs of all classes.
A |
|
1 |
$select distinct CLASS from E:/txt/students_scores.txt |
10. Distinct Count
Perform count after distinct operation on data in a text file in SQL.
Example: Based on the product information file, count the number of different kinds of products. Below is part of the file:
A |
|
1 |
$select count(distinct PID) from E:/txt/PRODUCT_SALE.txt |
11. Grouping & Count Distinct
Group data in text file and perform count distinct on data in text file in SQL.
Example: Based on sales table, count the days when there are sales records for each kind of product.
A |
|
1 |
$select PID,count(distinct DATE) as no_sdate from E:/txt/PRODUCT_SALE.txt group by PID |
12. Two-table join query
Perform a join query over two text files in SQL.
Example: The product information and the sales information are stored in two text files respectively. We want to calculate the total sales amount of the products whose quantity per order is less than 10. Below are the files’ data structures and their relationship:
A |
|
1 |
$select sum(S.quantity*P.Price) as total |
13. Multi-table join query
Perform a join query over more than two text files in SQL.
Example: The state information, department information and employee information are stored in 3 different text files. We want to find the information of employees of California in HR department.
A |
|
1 |
$select e.NAME as NAME |
14. Multi-table, multilevel join query
Perform a multilevel join query over more than two text files in SQL.
Example: The state information, department information and employee information are stored in 3 different text files. We want to find the information of employees of New York state whose managers come from California.
A |
|
1 |
$select e.NAME as ENAME |
15. Nested subquery
Use very complicated SQL in a nested subquery.
Example: The employee information and the department information are stored in two text files. We want to find the department that has the youngest manager. Below are parts of the two files:
A |
|
1 |
$select emp.BIRTHDAY as BIRTHDAY,emp.DEPT as DEPT |
16. Common table expression
Compute data in a text file using SQL WITH clause.
Example: Based on the same text files in the previous example, find the specified departments (HR, R&D, Sales) from the department table, and count the female employees and calculate the average salary in each of these departments.
A |
|
1 |
$with A as |
Find more examples of agile computations in SPL CookBook.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_Desktop/
SPL Learning Material 👉 https://c.scudata.com
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProcDesktop
Linkedin Group 👉 https://www.linkedin.com/groups/14419406/
Chinese version
Example data file URL: https://img.raqsoft.com.cn/file/2024/07/ba072654ff3d4f4493bcac8d6af5c66e_2.zip