Stratified split
The proportion of classes 0 and 1 in the target variable is approximately the same in the training set and the test set, so both sets mirror the class balance of the full data.
     A
1    =file("D://titanic.csv").import@qtc()
2    =A1.group@p(Survived)
3    =A2(1).group(rand()<=0.3)
4    =A2(2).group(rand()<=0.3)
5    =(A3(1)|A4(1)).sort()
6    =(A3(2)|A4(2)).sort()
7    =train=A1(A5)
8    =test=A1(A6)
A2 Group the sample positions into two groups according to the Survived value (0 or 1)
A3 Randomly split the first group into two subgroups at a ratio of roughly 7:3
A4 Randomly split the second group into two subgroups at a ratio of roughly 7:3
A5 Combine the 70% subgroups of target classes 0 and 1, in sorted order, to form the training set positions
A6 Combine the 30% subgroups of target classes 0 and 1, in sorted order, to form the test set positions
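
For readers who want to check the idea outside SPL, the steps above can be sketched in plain Python. This is only an illustrative analogue, not the SPL implementation: the function names `stratified_split` and `positive_rate`, the fixed seed, and the synthetic rows (a stand-in for titanic.csv with about 40% of labels equal to 1) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, test_fraction=0.3, seed=0):
    """Split rows into (train, test), sampling each class separately."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs

    # Like A2: group row positions by class label.
    positions_by_class = defaultdict(list)
    for pos, row in enumerate(rows):
        positions_by_class[label_of(row)].append(pos)

    # Like A3/A4: inside each class, a position goes to the test side
    # with probability test_fraction, otherwise to the training side.
    train_pos, test_pos = [], []
    for positions in positions_by_class.values():
        for pos in positions:
            if rng.random() <= test_fraction:
                test_pos.append(pos)
            else:
                train_pos.append(pos)

    # Like A5/A6: merge the per-class parts and restore the original order.
    train_pos.sort()
    test_pos.sort()

    # Like A7/A8: fetch the rows at those positions.
    train = [rows[p] for p in train_pos]
    test = [rows[p] for p in test_pos]
    return train, test


def positive_rate(rows, label_of):
    """Share of rows whose label is 1."""
    return sum(label_of(r) for r in rows) / len(rows)


def get_label(row):
    return row["Survived"]


# Synthetic stand-in data (NOT the Titanic file): 40% of labels are 1.
rows = [{"id": i, "Survived": 1 if i % 5 < 2 else 0} for i in range(5000)]

train, test = stratified_split(rows, get_label)

print("test share of all rows:", len(test) / len(rows))
print("class-1 rate in train :", positive_rate(train, get_label))
print("class-1 rate in test  :", positive_rate(test, get_label))
```

Because every row is assigned by an independent random draw, the 7:3 ratio and the class proportions hold only approximately, which is also true of the `rand()<=0.3` condition used in A3 and A4.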