Learn performance optimization skills from TPCH tests - Q11

 

I Query Requirement

Q11 queries the value of parts supplied by a certain country in the inventory.

Q11 is characterized by multi-table join query operation with grouping, sorting, aggregation and sub-query operation. Subqueries are located in the HAVING condition of grouping operations.

II Oracle Execution

The query SQL written in Oracle is as follows:

select  /*+ parallel(n) */

         ps_partkey,

         sum(ps_supplycost * ps_availqty) as value

from

         partsupp,

         supplier,

         nation

where

         ps_suppkey = s_suppkey

         and s_nationkey = n_nationkey

         and n_name = 'CHINA'

group by

         ps_partkey

having

         sum(ps_supplycost * ps_availqty) > (

                   select

                            sum(ps_supplycost * ps_availqty) * 0.000001

                   from

                            partsupp,

                            supplier,

                            nation

                   where

                            ps_suppkey = s_suppkey

                            and s_nationkey = n_nationkey

                            and n_name = 'CHINA'

         )

order by

         value desc;  

Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.

Script execution time, Unit: seconds

 

Number of parallel

1

2

4

8

12

Oracle

71

52

43

36

33

 

III SPL Optimization

Let's regard the following sub-query as view V:

                   select

                            ps_partkey,

                            sum(ps_supplycost * ps_availqty) as value

                   from

                            partsupp,

                            supplier,

                            nation

                   where

                            ps_suppkey = s_suppkey

                            and s_nationkey = n_nationkey

                            and n_name = 'CHINA'

Then the original primary query equals to

                   select

                            ps_partkey,

                            value

                   from V

                   where value>0.000001*(select sum(value) from V)

 

This V is already a grouped result with a relative small amount, so the calculation of traversing V is much less than that of traversing partsuppand directly.

And we also know that the partsupp table is ordered by the primary key ps_partkey and ps_suppkey, that is, ordered to ps_partkey. The ordered grouping method can be used here to group by this field to improve the performance of computing V.

 

The SPL script is as follows

 


A

1

=now()

2

>name="CHINA"

3

>percent=0.000001

4

=file("nation.btx").import@b().select(N_NAME==   name).derive@o().keys@i(N_NATIONKEY)

5

=file("supplier.ctx").open().cursor@m(S_SUPPKEY;A4.find(S_NATIONKEY)).fetch().keys@i(S_SUPPKEY)

6

=file("partsupp.ctx").open().cursor@m(PS_PARTKEY,PS_AVAILQTY,PS_SUPPLYCOST;A5.find(PS_SUPPKEY))

7

=A6.groups@o(PS_PARTKEY;sum(PS_SUPPLYCOST*PS_AVAILQTY):value)

8

=A7.sum(value)*percent

9

=A7.select(value>A8).sort@z(value)

10

return interval@ms(A1,now())

 

A7 performs ordered grouping with groups@o, which is equivalent to calculating view V. Then A8 and A9 traverse A7 twice to calculate the result.

 

Script execution time, Unit: seconds

 

Number of parallel

1

2

4

8

12

Oracle

71

52

43

36

33

SPL composite table

24

15

9

6

5