Database Column Shuffle & Update
【Question】
What is a best way to shuffle a database column using PL/SQL? E.g. First name with around 70000 of rows.
Here’s a solution:
DECLARE
SUBTYPE Name IS VARCHAR2(200);
TYPE Names_tab IS TABLE OF Name;
--
tabOrigin Names_tab;
tabShuffled Names_tab;
vSwap Name;
idx BINARY_INTEGER;
BEGIN
\-\- Do simple play WITH DUAL instead of real data
WITH rs AS (
SELECT DBMS_RANDOM.STRING('A', 1) name FROM dual
UNION ALL
SELECT DBMS_RANDOM.STRING('A', 1) name FROM dual
UNION ALL
SELECT DBMS_RANDOM.STRING('A', 1) name FROM dual
UNION ALL
SELECT DBMS_RANDOM.STRING('A', 1) name FROM dual
UNION ALL
SELECT DBMS_RANDOM.STRING('A', 1) name FROM dual
UNION ALL
SELECT DBMS_RANDOM.STRING('A', 1) name FROM dual
)
SELECT name origin, name shuffle
BULK COLLECT INTO tabOrigin, tabShuffled
FROM rs
;
--
\-\- As mentioned in 'Operational Notes' for package DBMS_RANDOM
\-\- invocation of DBMS_RANDOM.SEED is necessary only for obtaining
\-\- repeatable sequence of values, therefore omit this call
--
FOR i IN 1 .. tabOrigin.COUNT LOOP
idx := DBMS_RANDOM.VALUE(1, tabOrigin.COUNT);
vSwap := tabShuffled(i);
tabShuffled(i) := tabShuffled(idx);
tabShuffled(idx) := vSwap;
END LOOP;
\-\- Lookup result after shuffling
FOR i IN 1 .. tabOrigin.COUNT LOOP
dbms\_output.put\_line(
'Origin:' || tabOrigin(i) || 'Shuffled to:' || tabShuffled(i)
);
END LOOP;
END;
/
Output for this script is:
Origin: D Shuffled to: N
Origin: U Shuffled to: n
Origin: j Shuffled to: K
Origin: N Shuffled to: D
Origin: K Shuffled to: j
Origin: n Shuffled to: U
【Answer】
With the above solution, you can generate shuffled column in SQL. But you still need further computation to write shuffled values back to SHUFFLED_VALUE field. The whole process is very complicated. In this case, I think you can try SPL. The Structured Process Language (SPL) generates concise and easy-to-understand code:
A |
|
1 |
=connect("myDB1").query("select id,ORIGINAL_VALUE from REF_VALUES") |
2 |
=A1.sort(rand()) |
3 |
=join@p(A1.(ID);A2.(ORIGINAL_VALUE)) |
4 |
=connect("myDB1").update@u(A3,REF_VALUES,ID:_1,SHUFFLED_VALUE:_2 ;ID) |
A1: Retrieve data in SQL;
A2: Perform a random sort over the sequence of column values;
A3: Join A1’s ID field with A2’s ORIGINAL_VALUE field by sequence numbers and return a new sequence;
A4: REF_VALUES 中 Update A3’s column values into the database table according to key ID.
For detailed explanations about functions used in the SPL script, see Function Reference.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL