Loop Computations

 

A loop computation computes members of a set according to a specific order. During the loop, we can perform a series of computations from simple ones, like accessing the current member and assigning values to a member, to complex ones, such as the inter-row computation, nested loop, and iterative computation. This essay introduces esProc, a convenient and fast tool of handling loop computations, through a set of sample programs. Looking Loop Computations for details.

 

1.     Compare tables row by row to find the equals

Perform judgements by loop, adding a new member to an existing sequence each time.

Example 1Compare two files that have same number of rows to count the rows with same data. Below is part of the source data:

ID

Predicted_Y

Original_Y

10

0.012388464367608093

0.0

11

0.01519899123978988

0.0

13

0.0007920238885061248

0.0

19

0.0012656367468159102

0.0

21

0.009460545997473379

0.0

23

0.024176791871681664

0.0

SPL script


A

B

C

1

=file("p_old.csv").import@ct()


/ Read in the file first imported

2

=file("p_new.csv").import@ct()


/ Read in the second file imported

3

for A1.len()

=cmp(A1(A3),A2(A3))

/ Compare the two files row by row by   loop

4


=@|B3

/ Union each comparison result with   B4’s value

5

=B4.count(~==0)


/ Count the rows having same data

   Below is A5’s result:

Value

11302

 

2.     Assign values by loop

Loop through members of a set to compute each and assign value to it.

Example 2According to the following sales table, give a  reward of sales amount’s 5% to each salesperson whose performance in 2014 rank the top 10%. Below is part of the source table:

OrderID

Customer

SellerId

OrderDate

Amount

10400

EASTC

1

2014/01/01

3063.0

10401

HANAR

1

2014/01/01

3868.6

10402

ERNSH

8

2014/01/02

2713.5

10403

ERNSH

4

2014/01/03

1005.9

10404

MAGAA

2

2014/01/03

1675.0

SPL script


A

B

1

=connect("db").query("select     * from sales")

/ Connect to the data source to read   in sales table

2

=A1.select(year(OrderDate)==2014)

/ Get data of 2014

3

=A2.groups(SellerId;sum(Amount):Amount)

/ Group A2 by seller and calculate the   total sales amount in 2014

4

=A3.sort@z(Amount).to(A3.len()*0.1)

/ Sort A3 by sales amount in   descending order and get records where the amount ranks top 10%

5

=A4.run(Amount*=1.05)

/ Use A.run() function to loop through   A4’s selected records and give a reward of the sale’s amount’s 5% to each   seller

   Below is A5’ s result:

SellerId

Amount

4

150433.185

3

127878.04

1

102756.759

8

87965.346

 

3.     Complex inter-row computation

After data is grouped, summarize specific columns to get different aggregates while performing inter-row computations.

Example 3According to the user payment details table below, calculate the monthly sum payable in 2014 for each user. Below is part of the source table:

ID

customID

name

amount_payable

due_date

amount_paid

pay_date

112101

C013

CA

12800

2014-02-21

12800

2014-12-19

112102

C013

CA

3500

2014-06-15

3500

2014-12-15

112103

C013

CA

2600

2015-03-21

6900

2015-10-17

The target is to output the payable sum per month in the year of 2014. If no data exists for the current month, the payable sum will be the one in the previous month:

name

1

2

3

4

5

6

7

8

9

10

11

12

CA


12800

12800

12800

12800

16300

16300

16300

16300

16300

16300

16300














A

B

C

1

=file("Payment.txt").import@t().select(year(due_date)==2014)


/   Import data of 2014 from the file

2

=create(name,${12.().concat@c()})

=A1.group(customID)

/A2:   Generate an empty table of 12 months. A3: Group A1 by customer ID

3

for   B2

=12.(null)

/   Generate null values for the 12 months

4


>A3.run(B3(month(due_date))=     amount_payable)

/   Set the payable sum for the corresponding month

5


>B3.run(~+=~[-1])

/   Set null as payable sum of the previous month and calculate the cumulative   sum month by month

6


=A2.record(B2.name|B3)

/   Insert records to the result table

SPL script

   Below is A2’s result:

name

1

2

3

4

5

6

7

8

9

10

11

12

CA


12800

12800

12800

12800

16300

16300

16300

16300

16300

16300

16300













 

4.     Get the largest number of consecutively rising days

Count the frequency of continuous rising of a specific column value during the loop.

Example 4According to the SSE Composite Index records below, get the closing price’s largest number of consecutive rising days in 2019. Below is the part of the source table:

Date

Open

Close

Amount

2019/12/31

3036.3858

3050.124

2.27E11

2019/12/30

2998.1689

3040.0239

2.67E11

2019/12/27

3006.8517

3005.0355

2.58E11

2019/12/26

2981.2485

3007.3546

1.96E11

2019/12/25

2980.4276

2981.8805

1.9E11

SPL script


A

B

1

=file("000001.csv").import@ct()

/ Import the source data file

2

=A1.select(year(Date)==2019).sort(Date)

/ Get records of 2019 and sort them by   date

3

=n=0,A2.max(if(Close>Close[-1],n+=1,n=0))

/ Loop through closing prices to   compare each one with the previous one and add 1 if the current price is   higher, and then get the largest count

    Below is A3’s result:

Value

6

5.     Nested loop

Use loop functions in a nested way.

Example 5The hundred fowls problem: If a rooster is worth five coins, a hen three coins, and three chicks together are worth one coin, how many roosters, hens, and chicks totaling 100 can be bought for 100 coins?

 

SPL script


A

B

1

=to(100/5)

/ Number of roosters that can be   bought

2

=to(100/3)

/ Number of hens that can be bought

3

=33.(~*3)

/ Number of chicks that can be bought

4

=create(Cock,Hen,Chick)

/ Create an empty table to store the numbers   of roosters, hens, and chicks

5

>A1.run(A2.run(A3.run(if(A1.~+A2.~+A3.~==100     && A1.~*5+A2.~*3+A3.~/3==100,A4.insert(0,A1.~,A2.~,A3.~)))))

/ Loop through the number of roosters,   hens, and chicks respectively and insert the result records to A4’s table if   the specific requirement is satisfied. The tilde sign is used to represent   the current member

    Below is A4’s result:

Cock

Hen

Chick

4

18

78

8

11

81

12

4

84

 

6.     Call loop number during the loop

Query a certain file by loop to generate desired information during which the loop number is displayed.

Example 6Query file 2 to output specific information according to key words in file 1.

SPL script


A

B

1

=file("file1.txt").read@n()

/ Read in file 1

2

=file("file2.txt").read@n()

/ Read in file 2

3

=A1.conj(("Q"+string(#)+"."+~)|A2.select(pos(~,A1.~)).(~.words()(1)))

/ Loop through file1’s strings to find   them in each string file 2 and get the first word if it is matched. A2.select   uses “~” to represent the current member of file 2; A1.~ is the current   member of A1. Precede each query result with “Q + sequence number of the   current member of A1”; the sequence number is obtained through “#”

Below is A3’s result:

Member

Q1.   like parks

I

Shelly

Harry

Q2.   went out

Shelly

Q3. go   out

I

Ben

Harry

 

7.     Summarize data in skewness intervals starting from specific positions by loop

Calculate the average within skewness intervals starting from specific positions by loop.

Example 7According to the stock exchange table below, list the average closing price within 20 days for each transaction date during 1.1 – 1.10 in the year of 2020. Below is part of the source table:

Date

Open

Close

Amount

2019/12/31

3036.3858

3050.124

2.27E11

2019/12/30

2998.1689

3040.0239

2.67E11

2019/12/27

3006.8517

3005.0355

2.58E11

2019/12/26

2981.2485

3007.3546

1.96E11

2019/12/25

2980.4276

2981.8805

1.9E11

SPL script


A

B

1

=connect("db")

/Connect to the data source

2

=A1.query("select Date, Close   from   Stock where Code='600036'order by Date")

/ Select records of the targeted stock   and sort them by date

3

=A2.pselect@a(Date>=date("2020/01/01")     && Date<=date("2020/01/10"))

/ Use A.pselect() function to get the   sequence numbers of records from 1.1-1.10 in 2020

4

=A2(A3).derive(A2.calc(A3(#),avg(Close[-19:0])):ma20)

/ Use calc()function to calculate and   return the average closing price within 20 days for each of the specified ten   days. Close[-19:0] gets closing prices from 19 days ago to the current date

Below is A4’s result:

Date

Close

ma20

2020/01/02

38.88

37.35

2020/01/03

39.4

37.50

2020/01/06

39.24

37.64

2020/01/07

39.15

37.79

2020/01/08

38.41

37.90

2020/01/09

38.9

38.03

2020/01/10

39.04

38.16

 

8.     Iterative accumulation

Perform iterative accumulation during the loop and then filtering according to the cumulative total.

Example 8According to the sales table below, count the days needed to bag 20 orders per month in the year of 2014. Below is part of the source table:

OrderID

Customer

SellerId

OrderDate

Amount

10400

EASTC

1

2014/01/01

3063.0

10401

HANAR

1

2014/01/01

3868.6

10402

ERNSH

8

2014/01/02

2713.5

10403

ERNSH

4

2014/01/03

1005.9

10404

MAGAA

2

2014/01/03

1675.0

SPL script


A

B

1

=connect("db").query("select     * from sales")

/ Connect to the data source to read   in the sales table

2

=A1.select(year(OrderDate)==2014)

/ Get records of 2014

3

=A2.sort(OrderDate)

/ Sort the selected records by order   date

4

=A3.select(seq(month(OrderDate))==20)

/ Use seq() function to get sequence   numbers of orders in each month, and then select the record whose sequence   number is 20

Below is A4’s result:

Month

Day

1

20

2

20

3

20

4

18

 

9.     Post-grouping ranking

Get the rank in each group by loop.

Example 9According to the employee table below, get the rank of each employee’s salary in their department. Below is part of the source table:

ID

NAME

DEPT

SALARY

1

Rebecca

R&D

7000

2

Ashley

Finance

11000

3

Rachel

Sales

9000

4

Emily

HR

7000

5

Ashley

R&D

16000

SPL script


A

B

1

=connect("db")     .query("select * from Employee order by DEPT, SALARY DESC")

/ Connect to the data source, read in   the employee table, and sort it by department and salary

2

=A1.derive(rank(SALARY;DEPT):DeptRank)

/ Use rank() function to number the   ordered departments and salaries, and calculate department ranks

Below is A2’s result:

ID

NAME

DEPT

SALARY

DeptRank

2

Ashley

Finance

11000

1

32

Andrew

Finance

11000

1

230

Hannah

Finance

10000

3

24

Chloe

Finance

10000

3

 

10.   Post-grouping tight ranking

Get the tight ranks for specific fields in each group by loop.

Example 10According to the scores table below, get the ranks of all subject scores for the student whose ID is 8 in class one. Below is part of the source table:

CLASS

STUDENTID

SUBJECT

SCORE

Class   one

1

English

84

Class   one

1

Math

77

Class   one

1

PE

69

Class   one

2

English

81

Class   one

2

Math

80

SPL script


A

B

1

=connect("db")   .query("select   * from SCORES where CLASS='Class one'order by SUBJECT,   SCORE   DESC")

/ Connect to the data source, read in   the scores table, and sort it by subject and score

2

=A1.derive(ranki(SCORE;SUBJECT):Rank)

/ Use ranki() function to number the   ordered departments and salaries, and calculate the tight rank for each score

3

=A2.select(STUDENTID==8)

/ Get information of student whose ID   is 8

4

=create(${A3.(SUBJECT).concat@c()}).record(A3.(Rank))

/ Join up tight ranks of scores of all   subjects according to A3’ information

Below is A4’s result:

English

Math

PE

10

4

14

 

11.   Iterative sum

Calculate iterative sums by loop.

Example 11According to the SSE Composite Index records below, calculate the cumulative amount total for each transaction date in 2019. Below is part of the source table:

Date

Open

Close

Amount

2019/12/31

3036.3858

3050.124

2.27E11

2019/12/30

2998.1689

3040.0239

2.67E11

2019/12/27

3006.8517

3005.0355

2.58E11

2019/12/26

2981.2485

3007.3546

1.96E11

2019/12/25

2980.4276

2981.8805

1.9E11

SPL script


A

B

1

=file("000001.csv").import@ct()

/ Import the source file

2

=A1.select(year(Date)==2019).sort(Date)

/ Select records of 2019 and sort them   by date

3

=A2.derive(cum(Amount):CUM)

/ Use cum() function to calculate the   cumulative amount total

Below is A3’s result:

Date

Open

Close

Amount

CUM

2019/01/02

2497.8805

2465.291

9.759E10

9.759E10

2019/01/03

2461.7829

2464.3628

1.07E11

2.046E11

2019/01/04

2446.0193

2514.8682

1.39E11

3.436E11

2019/01/07

2528.6987

2533.0887

1.46E11

4.896E11

2019/01/08

2530.3001

2526.4622

1.23E11

6.126E11

 

12.   User-defined iterative computation

Users define the computing expression and the desired termination condition for the iteration computation performed within the loop.

Example 12According to the sales table below, get the date when the sales target of 150,000 is achieved for the first quarter. Below is part of the source table:

OrderID

Customer

SellerId

OrderDate

Amount

10400

EASTC

1

2014/01/01

3063.0

10401

HANAR

1

2014/01/01

3868.6

10402

ERNSH

8

2014/01/02

2713.5

10403

ERNSH

4

2014/01/03

1005.9

10404

MAGAA

2

2014/01/03

1675.0

SPL script


A

B

1

=connect("db").query("select     * from sales")

/ Connect to the data source and read   in the sales table

2

=A1.select(year(OrderDate)==2014)

/ Select data of 2014

3

=A2.iterate((@+=Amount,     ~~=OrderDate),0,@>150000)

/ iterate() function performs the   iterative computation with the initial value as 0 and returns the desired   target date. Sum sales amounts to the current cell and until the total   reaches 150,000.

Below is A3’s result:

Value

2014/03/25

 

There are more examples in SPL CookBook.