AUC, GINI, KS
AUC, GINI, and KS are commonly used to evaluate the overall performance of a model.
Calculate AUC, GINI, and KS on the prediction data “titanic_export.csv”. The calculation uses two variables: “Survived_1_percentage” (the model's predicted score) and “Survived” (the actual label).
SPL code:
| | A | B |
|---|---|---|
| 1 | =T("D://titanic_export.csv") | |
| 2 | =A1.(Survived_1_percentage).ranks() | |
| 3 | =A1.derive(A2(#):rank) | |
| 4 | =A3.groups(Survived;sum(rank):sum_rank,count(~):count) | |
| 5 | =(A4(2).sum_rank-A4(2).count*(1+A4(2).count)/2)/(A4(2).count*A4(1).count) | /auc |
| 6 | =2*A5-1 | /gini |
| 7 | =A1.sort@z(Survived_1_percentage) | |
| 8 | =A7.len()\10+1 | |
| 9 | =A7.derive(#\A8:decile) | |
| 10 | =A9.groups(decile;count(Survived==1):event,count(Survived==0):non_event) | |
| 11 | =A10.derive(event+cum_event[-1]:cum_event,non_event+cum_non_event[-1]:cum_non_event) | |
| 12 | =A11.derive(cum_event/A4(2).count-cum_non_event/A4(1).count:ks) | |
| 13 | =A12.max(ks) | /ks |
A5 returns the AUC.
A6 returns the GINI coefficient.
A13 returns the KS statistic.
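As a compact reference, the expressions in A5, A6, and A12 to A13 can be written as the formulas below. The symbols are notation introduced here for illustration and do not appear in the original code: R_1 is the sum of the ranks of the rows with Survived = 1 (A4(2).sum_rank), n_1 and n_0 are the numbers of rows with Survived = 1 and Survived = 0 (A4(2).count and A4(1).count), and C_1(k), C_0(k) are the cumulative event and non-event counts up to group k after sorting by score in descending order (cum_event and cum_non_event in A11).

```latex
\begin{aligned}
\mathrm{AUC}  &= \frac{R_1 - \dfrac{n_1 (n_1 + 1)}{2}}{n_1 \, n_0} \\[6pt]
\mathrm{GINI} &= 2 \cdot \mathrm{AUC} - 1 \\[6pt]
\mathrm{KS}   &= \max_{k} \left( \frac{C_1(k)}{n_1} - \frac{C_0(k)}{n_0} \right)
\end{aligned}
```

The AUC expression is the rank-sum (Mann-Whitney) form, which is why the code only needs the ranks from A2 and the per-class totals from A4.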
The calculation methods behind these three indicators are fairly involved. Interested readers can consult the relevant literature; this book only provides the calculation code for readers to use.
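For readers who want to cross-check the SPL results in another environment, here is a minimal Python sketch of the same steps. It is not part of the original material. The function names (rank_auc, gini_from_auc, binned_ks) are hypothetical, and tied scores are given their average rank, which is an assumption that may differ from how ranks() treats ties in the grid. The group size mirrors A8 (row count integer-divided by 10, plus 1).

```python
def rank_auc(scores, labels):
    """AUC from the rank sum of the positive class (as in A2 to A5).

    Assumption: tied scores share their average rank.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ascending by score
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next score equals the current one.
        while end + 1 < n and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # average of the 1-based ranks in the run
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    n1 = sum(1 for y in labels if y == 1)  # events
    n0 = n - n1                            # non-events
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n1 * (n1 + 1) / 2) / (n1 * n0)


def gini_from_auc(auc):
    """GINI = 2 * AUC - 1 (as in A6)."""
    return 2 * auc - 1


def binned_ks(scores, labels, bins=10):
    """KS over score-sorted groups (as in A7 to A13)."""
    rows = sorted(zip(scores, labels), key=lambda r: r[0], reverse=True)
    n = len(rows)
    size = n // bins + 1                   # group size, mirrors A8
    n1 = sum(1 for _, y in rows if y == 1)
    n0 = n - n1
    cum_event = cum_non_event = 0
    best = float("-inf")
    for pos, (_, y) in enumerate(rows, start=1):
        cum_event += 1 if y == 1 else 0
        cum_non_event += 1 if y == 0 else 0
        # A row closes its group when the next position falls into a new
        # group id (pos // size, mirroring #\A8) or when it is the last row.
        if pos == n or (pos + 1) // size != pos // size:
            best = max(best, cum_event / n1 - cum_non_event / n0)
    return best
```

Each function takes plain Python lists, so the two columns “Survived_1_percentage” and “Survived” can be passed in after loading the CSV with any reader.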
For how to use each indicator, see the model evaluation section of this course:
Data Mining Course (raqsoft.com)