Inner Join of an EXCEPT Subquery on Column Names That Do Not Appear in the Output
Question
I have a table of students and classes. I would like to find what classes were dropped from one semester to another (and a similar query for classes added).
Student Class Semester
==============================
Alice English 11
Alice Geometry 11
Alice English 12
Bob Spanish 11
Bob Spanish 12
My approach is to use an except (same as minus):
select distinct Class
from table
where table.Student = 'Alice'
and table.Semester = 11
except
select distinct Class
from table
where table.Student = 'Alice'
and table.Semester = 12
This works correctly and returns Geometry. However, I need to use this as a subquery like this:
select Student, string_agg(X.Class, ',') as 'Deleted_Classes',
count(X) as 'Num_deleted',
SemesterTable.Semester as semester,
lag(Semester, 1)
over (partition by StudentTable.Student
order by SemesterTable.Semester) as Prev_Semester,
from
StudentTable
SemesterTable
inner join (
<<<Same query from above>>>
) X on _______
where X.Num_deleted > 0
My problems is with the____section - inner joins can only be joined on columns that appear in the output. But, my except query doesn't return values of previous and current semesters (it might even return nothing at all if no classes were dropped). So how do I join the subquery into the main table? My desired output is:
Student Semester Prev Semester Deleted_Classes
========================================================
Alice 12 11 Geometry
Alice appears because she had a change in her schedule, but Bobis omitted because there was no change in his schedule.
Answer
Group rows ordered by Student and Semester according to Student, and in each group, group rows by Semester and perform inter-row calculations. The difference between the set of Classes in the spring semester and that in the fall semester is the newly added classes, and the difference between the fall semester and the spring semester contains the cancelled classes. It is a hassle to code this in SQL because you will need the window function, CROSSAPPLY and OUTERAPPLY. The SQL statement will be lengthy and difficult to understand. An alternative is to move data out of database and process it in Python or SPL. SPL, the open-source Java package, is easy to be integrated into a Java program and generate simple code. It can get this done with only two lines of code
A |
|
1 |
=MSSQL.query@x("select * from Classes order by 1,3") |
2 |
=A1.group@o(#1).conj(~.group@o(#1,#3;~.(#2)).new(Student,#2[+1]:Semester,#2:Prev_Semester,(#3\#3[+1]).concat@c():Deleted_Classes,(#3[+1]\#3).concat@c():Added_Classes).m(:-2)) |
View SPL source code.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese version