The path for programmers to learn SPL

 

Concept and Syntax

Please start with SPL Programming.
Chapters 1-4 introduce basic program logic and are aimed mainly at beginners without programming experience. Professional programmers can skim them in a few minutes to pick up the basic style and syntax of SPL programming. Pay particular attention to Section 4.4, on understanding objects.
Chapter 5 is worthwhile even for experienced programmers. It presents SPL’s set-oriented way of thinking, which is very different from other languages. Once you understand and master it, you can write elegant code, and you will usually lose interest in other languages for any task that SPL can handle. Note Section 5.7 in particular: Lambda syntax is optional for beginners but mandatory for professional programmers.
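For readers coming from other languages, the general idea of applying a lambda expression to a whole set, rather than writing explicit loops, can be loosely illustrated as follows. This is a minimal sketch in plain Python, not SPL syntax, and it does not reproduce SPL's own set functions; the `orders` data and its field names are hypothetical and exist only for illustration.

```python
# Plain Python, NOT SPL: a rough analogue of set-oriented, lambda-based style.
# The sample data below is hypothetical.
orders = [
    {"id": 1, "customer": "A", "amount": 120.0},
    {"id": 2, "customer": "B", "amount": 80.0},
    {"id": 3, "customer": "A", "amount": 45.5},
    {"id": 4, "customer": "C", "amount": 200.0},
]

# Filter the whole collection with a lambda instead of an explicit loop.
large = list(filter(lambda o: o["amount"] >= 100, orders))

# Map each member of the filtered set to a new value.
large_ids = list(map(lambda o: o["id"], large))

# Aggregate over the filtered set.
total = sum(map(lambda o: o["amount"], large))

# Ordinary set operations: customers who never placed a large order.
customers_all = {o["customer"] for o in orders}
customers_large = {o["customer"] for o in large}
only_small = customers_all - customers_large

print(large_ids, total, only_small)
```

The point of the sketch is only the shape of the code: each step treats the collection as a single value and describes the per-member logic as a small expression. Chapter 5 of the book covers how SPL itself expresses this.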
Chapters 8-11 focus on structured data and approach structured data operations from a different perspective than SQL, which makes them equally worthwhile for professional programmers. From the perspective of SPL, SQL’s understanding of structured data is somewhat simplistic. The world is complex, and the structured data knowledge taught in database courses is neither broad nor deep enough, so you need to relearn it!

SPL concepts for beginners and SPL operations for beginners summarize some concepts and operations in SPL programming. Experienced programmers can use them to quickly grasp the characteristics of SPL and how its concepts differ from those of SQL.
The Understanding Discreteness blog articles explain the design philosophy of SPL: why SPL was invented when SQL and Java already existed, and the theoretical basis for SPL’s advantages.

Computation Logic

The book focuses on the principles behind the concepts and does not cover many functions. To use SPL proficiently for data calculation problems, you need to know the commonly used functions, which are documented in the SPL Function Reference.

Of course, familiarity with commonly used functions comes from practice, not from dictionary-like references. The book mentioned above includes practice exercises at the end of each chapter. More comprehensive exercises are available in Exercise, with example datasets available for download in the comments.

General Data Table Operations in SPL and General Operations on Cursors in SPL use text files as data sources to illustrate common SPL functions.
Furthermore, more sample code is available in Calculation Logic for practice and reference. Within it, File Operations lists some common tasks for data files.
The Code Reference lists SPL code for common operations, including the SPL equivalents of common SQL statements.
Over a hundred examples are classified and organized in CookBook.
The Application computing section continuously collects related questions found online, together with their answers. All of these can be used as code references.

Application Environment

SPL integrates well with other environments. You can find the integration solution for your application environment in Invocation & Integration.
SPL supports a wide range of data sources. You can find sample code for accessing your data source in Data Source. Once you have mastered these, multi-source mixed computing tasks become easy to solve.

Note that accessing non-native data sources (data sources other than files, databases, and HTTP) requires importing external libraries in advance, as outlined in the External Library Guide, which introduces the types and connection methods of external libraries. The External Library Function Reference explains the relevant functions for accessing external data sources.

Performance Optimization

This is a challenging topic that requires learning high-performance algorithms.
For experienced programmers, it takes two to three hours to get started with the preceding content. After that, you can look up information and examples as problems arise. High-performance algorithms are not that simple, however: learning and practicing them systematically takes several days, or even one or two weeks!

It is recommended to start practicing with historical data that no longer changes, setting aside the case where the data is still changing. Also, don’t worry about the framework; perform calculations on local file data only. Once you understand these algorithms and can apply them, move on to the issues of changing data and framework integration.

How to use SPL storage for beginners summarizes the storage mechanism of SPL. Get a general idea from it first, then deepen your understanding through practice. This is the foundation of performance optimization.
General operations on SPL files uses SPL files as data sources to introduce the SPL implementation of common operations. You can practice storing data in SPL files and implementing basic operations on them, which clearly demonstrates the performance difference compared with databases.

The high-performance algorithms of SPL are all described in SPL Performance Optimization.
The content of this book is relatively brief. For more detail on some algorithms, refer to High-performance Algorithms and Technical Subject.
These skills are difficult to master in depth without practice. Start with a rough overview to gain a preliminary understanding, then reread the material repeatedly as you practice.

The key is to do exercises. SPL Performance Optimization Practice lets you practice the most common performance optimization techniques through a set of examples, and it includes the example datasets.
Then you can try Performance Optimization Exercises Using TPCH. The coverage of TPCH is not comprehensive, but it is reasonably representative of the tasks that SQL can describe. TPCDS has too many examples and they lack representativeness, so it is not recommended.
Afterwards, you can try the comprehensive cases in High-performance practical cases, which are a great help in understanding high-performance algorithms.
SPL high-performance practice routine explains the general steps for practicing performance optimization with SPL, but it requires a good understanding of the commonly used algorithms and storage mechanisms. Once you are familiar with it, achieving a severalfold performance improvement is not difficult.