How to Calculate Some Specific Data Function from the Data of a Large CSV File
Question
I'm trying to work out the most expensive county in which to rent a building, using data from a CSV file. The data from each column I need has been put into a list. The price range is set by the user, so the outermost for loop and if statement ensure that only buildings within that price range are considered.
The price of a building is also slightly complicated, because the price is the minimum stay multiplied by the listed price.
In the code below I am trying to get the average property value of one county, just so I can get the basic structure right before I carry on, but I'm kind of lost at this point. Any help would be much appreciated.
public int sampleMethod()
{
    ArrayList<String> county = new ArrayList<String>();
    ArrayList<Integer> costOfBuildings = new ArrayList<Integer>();
    ArrayList<Integer> minimumStay = new ArrayList<Integer>();
    ArrayList<Integer> minimumBuildingCost = new ArrayList<Integer>();
    try {
        // Code to read data from the CSV and put the data in the lists.
    }
    catch (IOException | URISyntaxException e) {
        // Some code.
    }
    int count = 0;
    int avgCountyPrice = 0;
    int countyCount = 0;
    for (int cost : costOfBuildings) {
        if (costOfBuildings.get(count) >= controller.getMin() && costOfBuildings.get(count) <= controller.getMax()) {
            for (String currentCounty : county) {
                for (int currentMinimumStay : minimumStay) {
                    if (currentCounty.equals("samplecounty")) {
                        countyCount++;
                        int temp = nightsPermitted * cost;
                        avgCountyPrice = avgCountyPrice + temp / countyCount;
                    }
                }
            }
        }
        count++;
    }
    return avgCountyPrice;
}
Here is a sample table to depict what the CSV looks like. Also, the CSV file has more than 50,000 rows.
name | county | price | minStay |
Morgan | lydney | 135 | 5 |
John | sedury | 34 | 1 |
Patrick | newport | 9901 | 7 |
Answer
The task breaks down into three steps: group the CSV rows by county, calculate the average price in each group, and find the county with the highest average price. Written in plain Java, the code for this is fairly long.
The task is simpler in SPL, an open-source Java package, where it takes only one line of code:
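For comparison, here is a rough plain-Java sketch of those three steps using streams. It is an illustration under stated assumptions, not the asker's or SPL's implementation: the class and method names (`MostExpensiveCounty`, `Listing`, `mostExpensiveCounty`) and the `min`/`max` parameters are hypothetical, the file is assumed to be comma-separated with a header row and no quoted fields, and the price-range filter is applied to the listed price as in the question's code. Unlike the SPL line, which averages `price`, this sketch averages price multiplied by minimum stay, following the question's description of the cost.

```java
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MostExpensiveCounty {

    // One CSV row; the columns follow the sample table in the question.
    static final class Listing {
        final String county;
        final int price;
        final int minStay;

        Listing(String county, int price, int minStay) {
            this.county = county;
            this.price = price;
            this.minStay = minStay;
        }

        String county() { return county; }

        // Cost as described in the question: listed price x minimum stay.
        long totalCost() { return (long) price * minStay; }
    }

    // Assumption: plain comma-separated line in the order name,county,price,minStay.
    static Listing parse(String line) {
        String[] f = line.split(",");
        return new Listing(f[1].trim(),
                Integer.parseInt(f[2].trim()),
                Integer.parseInt(f[3].trim()));
    }

    // Returns the county with the highest average total cost,
    // or an empty Optional if no row falls inside [min, max].
    static Optional<String> mostExpensiveCounty(Stream<String> lines, int min, int max) {
        Map<String, Double> avgByCounty = lines
                .skip(1)                                   // skip the header row
                .map(MostExpensiveCounty::parse)
                .filter(l -> l.price >= min && l.price <= max)
                .collect(Collectors.groupingBy(Listing::county,
                        Collectors.averagingLong(Listing::totalCost)));

        return avgByCounty.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        Stream<String> sample = Stream.of(
                "name,county,price,minStay",
                "Morgan,lydney,135,5",
                "John,sedury,34,1",
                "Patrick,newport,9901,7");
        System.out.println(mostExpensiveCounty(sample, 0, 10_000).orElse("none"));
    }
}
```

For the real file, the lines could come from `Files.lines(Path.of("data.csv"))` inside a try-with-resources block, which reads the file lazily rather than loading all 50,000+ rows into separate lists first.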
  | A |
1 | =file("data.csv").import@ct().groups(county;avg(price):price_avg).top(-1;price_avg).county |
SPL provides a JDBC driver that Java can call. Save the above SPL script as mostExpensiveCounty.splx and invoke it from Java the same way you would call a stored procedure:
…
Class.forName("com.esproc.jdbc.InternalDriver");
con = DriverManager.getConnection("jdbc:esproc:local://");
st = con.prepareCall("call mostExpensiveCounty()");
st.execute();
…
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL