Check if Values Are the Same for Each Row over Criteria in Different Other Columns
Question
CREATETABLEcampaigns
(
idSERIALPRIMARYKEY,
campaignVARCHAR,
supplierVARCHAR,
plan_quantityDECIMAL
);
INSERTINTOcampaigns(campaign,supplier,plan_quantity)
VALUES('C001','supplier_a','500'),
('C001','supplier_a','500'),
('C001','supplier_b','500'),
('C002','supplier_a','600'),
('C002','supplier_b','700'),
('C003','supplier_c','100'),
('C003','supplier_c','100'),
('C004','supplier_a','900'),
('C004','supplier_c','800'),
('C004','supplier_d','250'),
('C004','supplier_d','250'),
('C005','supplier_b','380'),
('C005','supplier_b','270'),
('C005','supplier_d','590');
Expected result:
campaign|supplier|plan_quantity|check|
----------|--------------|-------------------|--------------------|-------
C001|supplier_a|500|same|
C001|supplier_a|500|same|
C001|supplier_b|500|non-relevant|
----------|--------------|-------------------|--------------------|-------
C002|supplier_a|600|non-relevant|
C002|supplier_b|700|non-relevant|
----------|--------------|-------------------|--------------------|-------
C003|supplier_c|100|same|
C003|supplier_c|100|same|
----------|--------------|-------------------|--------------------|-------
C004|supplier_a|900|non-relevant|
C004|supplier_c|800|non-relevant|
C004|supplier_d|250|same|
C004|supplier_d|250|same|
----------|--------------|-------------------|--------------------|-------
C005|supplier_b|380|different|
C005|supplier_b|270|different|
C005|supplier_d|590|non-relevant|
In case a supplier appears multiple times per campaign, I want to see in column check if the plan_quantity for this supplier is the same in every row.
I am getting close to the result when I modify the query for this question:
SELECT
campaignAScampaign,
supplierASsupplier,
plan_quantityASplan_quantity,
(CASE
WHENMIN(plan_quantity)OVER(PARTITIONBYsupplier,campaign)=MAX(plan_quantity)OVER(PARTITIONBYsupplier,campaign)
THEN'same'
ELSE'different'
END)AScheck
FROM
campaigns
ORDERBY
1,2,3;
However, I have no clue how I can add the description non-relevant to the query in case a supplier does not appear multiple times per campaign.
Do you have any idea?
Answer
Try to work out a solution according to our natural of thinking. To get this done, we just need to add a check column when its value is non-relevant to the ordered records in column 2, column 3 and column 4. First, group records by column 2 and column 3. If a group has more than one records, change the value of check column to different and group records in this group by column 4. Then, if a subgroup has more than one records, change the value of check column to same. It is roundabout to code the process in SQL. In similar cases, we need a multilevel case statement to work with multiple window functions. The final SQL statement is difficult to understand. The usual way is to fetch the original data out of the database and process it in Python or SPL. SPL, the open-source Java package, is easier to be integrated into a Java program and generate more concise code. It finishes the task with only two lines of code:
A |
|
1 |
=PG.query("SELECT *,'non-relevant'as check FROM CAMPAIGNS ORDER BY 2,3,4") |
2 |
>A1.group@o(#2,#3).(if(~.len()>1,~.run(check="different"),~).group@o(#4).(if(~.len()>1,~.run(check="same")))) |
View SPL source code.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese version