4.4 Numberized dimension table
A dimension table where primary key values are natural numbers as ordinals, we can directly use ordinal numbers to achieve the association without creating an index. This avoids computing and comparing HASH values and has better performance.
With a dimension table where primary key values are not natural numbers as ordinals, we can convert them to ordinal numbers in advance, and accordingly, the corresponding dimension field values in the fact table to ordinal numbers, too. Then we can locate data through ordinal numbers.
4.4.1 Dimension table where primary key values are natural numbers as ordinals
SQL
SELECT Employees.Region,sum(Orders.Amount)
FROM Orders
LEFT JOIN Employees ON Orders.EmployeeID=Employees.EmployeeID
WHERE OrderDate>='2021-01-01' and OrderDate<='2021-12-31'
GROUP BY Employees.Region
SPL
A | |
---|---|
1 | >st=date(“2021-01-01”), et=date(“2021-12-31”), start=days@o(st), end=days@o(et) |
2 | >Employees=file(“Employees.btx”).import@b() |
3 | =file(“Orders_Time.ctx”).open().cursor@mx(EmployeeID,Amount;OrderDate>=start && OrderDate<=end) |
4 | =A3.switch(EmployeeID,Employees:#) |
5 | =A4.groups(EmployeeID.Region;sum(Amount):Amount) |
A2 Note: As EmployeeID itself consists of natural numbers as ordinals, data is sorted by EmployeID when dumped to Employees.btx and index on primary key becomes unnecessary at retrieval.
A4 Associate with dimension table Employees through ordinal numbers.
4.4.2 Dimension table where primary key values are not natural numbers as ordinals
SQL
SELECT Products.ProductName, sum(Orders.Quantity)
FROM Orders
LEFT JOIN Products ON Orders.ProductID=Products.ProductID
WHERE OrderDate>='2021-01-01' and OrderDate<='2021-12-31'
GROUP BY Products.ProductName
SPL
Following is code for preparing Orders_Time.ctx, where ProductID is converted to ordinal numbers:
A | |
---|---|
1 | >Products=file(“Products.btx”).import@b().keys@i(ProductID) |
2 | =file(“Orders_Time.ctx”).open().cursor@mx() |
3 | =A2.run(ProductID=Products.pfind(ProductID)) |
4 | =file(“Orders_Time_P.ctx”).create@y(#OrderDate, CustomerID, ProductID, Quantity, Unit, Price, Amount, EmployeeID,EmployeeName,ShipVia) |
5 | =A4.append(A3) |
6 | >A4.close() |
A1 Load dimension table Products to into memory, and set primary key and create index on it.
A3 Convert Orders table’s ProductID field values to corresponding ordinal numbers in the dimension table.
A4 Create a new composite table (with fields in the same order as that in the original composite table).
A5 Write Orders table data to A4’s composite table file.
Aggregation:
A | |
---|---|
1 | >st=date(“2021-01-01”), et=date(“2021-12-31”), start=days@o(st), end=days@o(et) |
2 | >Products=file(“Products.btx”).import@b() |
3 | =file(“Orders_Time_P.ctx”).open().cursor@x(ProductID,Quantity; OrderDate >=start && OrderDate <=end) |
4 | =A3.switch(ProductID,Products:#) |
5 | =A4.groups(ProductID.ProductName; sum(Quantity):Quantity) |
A2 Do not need to create index on the primary key after loading data from dimension table Products.
A4 Perform association through ordinal numbers.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL