A correct method for complete novices to self-study programming

Programming is a hot topic these days, and it is common to see people on the Internet asking how to learn it. Most of them have little programming experience, and many know nothing about it at all.

To find a correct method for learning programming, you first have to ask yourself: why should I learn programming? What is my purpose?

The reason is simple: there are many programming languages, and programming knowledge is so broad that even professionals cannot know everything. You should therefore choose a language based on your purpose.

 

Since this post is mainly for those who do not intend to become professional programmers, let's first focus on the learning purpose of this group. Most people who ask such questions probably belong to it, at least for now (they may decide to become professional programmers after learning).

The main purpose of this group is to improve work efficiency through programming and to automate more of their daily work. Indeed, quite a few daily tasks are easy to solve with a program but very tedious to do by hand, such as merging multiple Excel tables or generating employee cards from a name register. Many training courses out there trumpet how much faster you will work after learning language XX, which of course sways many office workers.

For this purpose, you need to learn two levels of knowledge:

1. The basic logic of programs

This is what almost all programming languages share: variables, branches, loops and so on. Without understanding this logic, you can hardly write any program. However, this level of knowledge is neither complicated nor extensive; anyone who can write basic Excel formulas can learn it with a little effort. Moreover, almost all programming languages have similar capabilities in this regard, so once you have learned one language, the others come easily. Most languages also use similar keywords and syntax rules, so what you learn in one carries over to another.

Nevertheless, if you learn only this level, you can solve only primary- and secondary-school arithmetic problems, such as the chicken-and-rabbit problem or prime factorization. That trains your brain, but it is almost useless for your daily work.
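To make the first level concrete, here is a minimal sketch of the chicken-and-rabbit problem in Python. The language choice and the function name `solve_chicken_rabbit` are assumptions made for illustration only; the same variables, loop and branch could be written in almost any language.

```python
# Chicken-and-rabbit problem: a cage holds chickens (2 legs) and rabbits (4 legs).
# Given the total number of heads and legs, find how many of each there are.

def solve_chicken_rabbit(heads, legs):
    """Return (chickens, rabbits), or None if no whole-number answer exists."""
    # Loop: try every possible number of chickens from 0 up to `heads`.
    for chickens in range(heads + 1):
        rabbits = heads - chickens  # variable derived from the current guess
        # Branch: accept the guess only if the leg count matches.
        if 2 * chickens + 4 * rabbits == legs:
            return chickens, rabbits
    return None


# Classic instance: 35 heads and 94 legs.
print(solve_chicken_rabbit(35, 94))  # prints (23, 12)
```

The whole program uses nothing beyond variables, one loop and one branch, which is exactly why this level alone does little for daily office work.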

Unfortunately, many training books and even training classes teach only this level of knowledge (or rather, they can only teach this level, for the reason described below).

To apply what you have learned, you must also learn the second level of knowledge:

2. Structured data and its operations

The daily work you hope to solve through programming is really about processing the data at hand. Most of this data already sits in Excel tables or could be filled into one; the professional term for such data is structured data. You must learn the concepts of structured data (tables, records, fields) and its common operations (grouping, joining, etc.). Only by understanding these can you truly cope with your daily work. Structured data normally comes in batches (a table usually has many rows), so you also need to be familiar with set-related concepts and operations.
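As a rough illustration of these concepts, the sketch below models a table as a list of records with named fields, then filters and groups it. It uses only the Python standard library; the `orders` data and all field names are made up for the example.

```python
from collections import defaultdict

# A "table" is a set of records; every record has the same fields.
# The data below is invented purely for illustration.
orders = [
    {"customer": "Alice", "product": "pen",     "amount": 12.0},
    {"customer": "Bob",   "product": "paper",   "amount": 30.0},
    {"customer": "Alice", "product": "stapler", "amount": 8.5},
    {"customer": "Carol", "product": "pen",     "amount": 6.0},
    {"customer": "Bob",   "product": "pen",     "amount": 4.0},
]

# Filtering: a set operation that keeps the records matching a condition.
pen_orders = [row for row in orders if row["product"] == "pen"]

# Grouping and aggregation: total amount per customer.
total_by_customer = defaultdict(float)
for row in orders:
    total_by_customer[row["customer"]] += row["amount"]

print(len(pen_orders))          # prints 3
print(dict(total_by_customer))  # prints {'Alice': 20.5, 'Bob': 34.0, 'Carol': 6.0}
```

Note that the operations act on the whole set of rows at once rather than on a single value, which is the main shift in thinking compared with the first level.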

To see what everyday table-processing tasks look like, see the recommended link [esProc Desktop and Excel processing (2021)](https://c.scudata.com/article/1617693922993).

Unfortunately, I have to say it again: while countless courses focus on the first level of knowledge, courses for non-professionals rarely touch the second. Probably only people who do database development learn the second level on purpose, and perhaps only database courses introduce it systematically. Yet structured data is the basis of daily work, and it is not so difficult for non-professionals to master. For example, you may often use Excel to filter, aggregate or even join data (you may not know the word "join"; it is essentially what VLOOKUP does), but if you have not studied structured data systematically, you will be confused when you meet more complicated situations.
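To show what "join" means in terms of VLOOKUP, here is a small sketch in plain Python. The two tables, their field names and the `dept_by_id` index are hypothetical and exist only for this example.

```python
# Made-up data for illustration: two tables that share the "dept_id" field.
employees = [
    {"name": "Alice", "dept_id": 10},
    {"name": "Bob",   "dept_id": 20},
    {"name": "Carol", "dept_id": 10},
    {"name": "Dave",  "dept_id": 30},  # no matching department
]
departments = [
    {"dept_id": 10, "dept_name": "Sales"},
    {"dept_id": 20, "dept_name": "Finance"},
]

# Build a lookup index on the key field, much like the lookup range of VLOOKUP.
dept_by_id = {d["dept_id"]: d["dept_name"] for d in departments}

# Left join: every employee is kept; a missing match yields None
# (VLOOKUP would show #N/A in that situation).
joined = [
    {"name": e["name"], "dept_name": dept_by_id.get(e["dept_id"])}
    for e in employees
]

for row in joined:
    print(row)
```

The "more complicated situations" start where this sketch stops, for example when the key consists of several fields or when one key matches many rows.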

Once they have mastered these two levels of knowledge, non-professionals can handle daily work with ease, and their efficiency will improve rapidly.

Having talked about non-professionals, let's briefly talk about those who intend to become professional programmers. Two groups of professionals program mainly on the basis of the two levels above (chiefly the second; the first is the basis of all programs, so there is no need to discuss it here).

1. Data scientists

Most of the data this group faces is also structured data. Of course, the more important task for data scientists is designing algorithms and building models. These tasks also use structured data, but the operations involved differ from the table operations mentioned above and are generally not called structured data operations. However, algorithms and models usually account for only 20-30% of the workload; data scientists spend most of their time preparing data, and that preparation relies mainly on the second level of knowledge.

Many data scientists are familiar only with algorithms and models, not with conventional operations on structured data. As a result, they prepare data inefficiently (slow to code and slow to run), and a lot of time is lost. Mastering structured data will not make your model better, but it gives you more time to study algorithms.

2. Programmers developing industry information systems

This is probably the largest group of programmers. They deal with databases every day, especially those who work on reports and statistics, and what they process is exactly structured data. With a solid grasp of structured data, designing computation and processing methods becomes much easier. However, this group has to study databases seriously anyway, so they will naturally pick up this level of knowledge themselves.

Are there programmers who don't need to master the second level of knowledge?

Yes. Perhaps surprisingly, highly specialized system programmers (working on operating systems or network transmission) and algorithm engineers in specific fields (video, audio, etc.) need less knowledge of structured data. These people, however, are heavy-duty programmers who will not read this post at all, so there is no need to consider them.

Therefore, almost everyone who reads this post needs these two levels of knowledge.

Once you know what to learn, the question is how to learn it. First, ask yourself: which programming language should I learn?

For the first level of knowledge, there are in theory many choices; you could pick any language, because this level exists in every programming language. So the only thing that matters is finding a language that is good at processing structured data, which makes the second level easier to learn and lets you apply what you have learned.

True as that is, don't forget that the learners know nothing about programming. The environment must not be too complicated to set up, or they will get lost. For such learners, an out-of-the-box language is best, preferably one that needs no installation.

In the early days (more than 30 years ago), machines generally came with BASIC, so it was available directly without any installation. I don't know why BASIC is no longer preinstalled, or whether this is progress or regression.

Admittedly, some languages today need no installation, mainly two: JavaScript, which comes with the browser, and VBA, which comes with Excel (and is available in all Office components). However, JavaScript's uses are rather specialized, so learning it does not help much with daily work; VBA is much better, but its ability to process structured data is limited. More importantly, although both are installation-free, to use them you have to understand many concepts inside the browser or Excel (called objects in professional terms), which are harder than the program logic itself. They are therefore not suitable for beginners.

Sometimes I miss the BASIC of those days.

How about Python, with training courses everywhere? It certainly looks attractive.

If you only want to use Python to learn the first level of knowledge, there is basically no problem. Installing just the base package is not difficult, and neither is writing and running code in its development environment.

But if you want to learn structured data in Python, I can tell most non-professionals for sure: you will never manage it, let alone apply what you have learned.

There are three reasons:

1. Processing structured data in Python requires an open-source package called pandas. pandas is not included in the Python installer; you have to download and install it yourself. The trouble is that installation is not simple for a novice, since you have to configure many things that leave you completely confused. You can install it with the help of third-party programs, but installing such a program is a chore in itself, and when it starts up, many engineering-environment settings (originally designed for large applications) have to be configured, which is overwhelming.

2. The key problem is that pandas was not originally designed for structured data: its data object is not the table we are familiar with (a set made up of many rows of data) but a matrix. Python can handle simple operations such as filtering and merging, but more complex operations such as grouping and order-related calculations will confuse beginners. Moreover, its design is inconsistent; for example, there are several kinds of sets, each with a different syntax, and you can basically only memorize them by rote. Because it is hard to generalize from one case to another, you end up searching for examples everywhere. Click here to learn more about why Python is not suitable for beginners. The table of contents of the previously recommended link also lists quite a number of problems that are difficult to code in Python.

3. The last reason is debugging. Since you cannot get code right in one go, debugging is also a particularly useful tool for learning a programming language. Debugging in Python's development environment is not very good to begin with, and pandas is not a native component of Python, which makes debugging even harder.

Python is not a language designed for non-professionals at all. For non-professionals, the power and convenience of Python exist only in training courses. Training institutions first lure people into learning Python, and then the class ends once the first level of knowledge has been covered. Those who actually use Python in their daily work are basically professionals, and you will rarely see anyone around you using Python to process Excel (and those who do are processing simply formatted Excel files in simple ways). The real users of Python are heavy-duty professionals (mainly those who do AI).

If learning Python does not help, what else is there?

Java or C/C++? Definitely not recommended, for the following reasons: i) object orientation (in Java and C++) is an advanced concept that beginners struggle to understand; ii) their ability to process structured data is almost zero, so learning them is of little use; iii) the development environment is very complex. After all, these languages are for professionals developing large software, where the complexity makes sense.

It is hard to believe, but some middle schools teach Java as an introductory computer course.

What about SQL? In fact, SQL is a good choice for processing structured data, and it is an exception here: you can skip straight to the second level of knowledge without learning the first (that is why we said "almost all" programming languages earlier, not "all"), and you can write a fairly complicated query without understanding concepts such as variables and loops.
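As a sketch of what such a query looks like, the example below filters, groups, aggregates and sorts in a single SQL statement that contains no variables or loops. Python's built-in `sqlite3` module and an in-memory database are used here only to keep the example self-contained; that host, the `orders` table and its data are assumptions for illustration.

```python
import sqlite3

# An in-memory database, used only as a convenient host for the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Alice", "pen", 12.0),
        ("Bob", "paper", 30.0),
        ("Alice", "stapler", 8.5),
        ("Carol", "pen", 6.0),
        ("Bob", "pen", 4.0),
    ],
)

# The query itself: group by customer, keep groups whose total exceeds 10,
# and sort by total in descending order. No variables or loops are involved.
query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 10
    ORDER BY total DESC
"""
result = conn.execute(query).fetchall()
conn.close()

print(result)  # prints [('Bob', 34.0), ('Alice', 20.5)]
```

Everything outside the `query` string is setup; the structured data logic lives entirely in the SQL.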

However, the problem with SQL is that it basically runs only in a database, and installing and configuring a database is a job for professionals, which will get you into trouble. Moreover, even once you have learned it, whenever you want to work with Excel files you first have to import the data into the database, which is annoying.

Having said all this, none of the languages mentioned is suitable. Is there really no suitable one?

To be honest, if you look only at the mainstream programming languages, that really is the case: the useful ones are too difficult to learn, while the simple ones are useless.

For this reason, however much programming is hyped, complete novices still cannot learn it and put it to use, which is why people keep asking what to learn. Opening more training courses does not help, because training courses cannot improve or invent programming languages. In essence, the problem is whether there is any clay, not how to make bricks.

 

SPL is probably the only programming language in the world suitable for a non-professional complete novice.

As a programming language, SPL has complete basic program logic (the first level of knowledge). Its full name is Structured Process Language, and it was invented to process structured data. Its structured data processing is arguably the most complete among current programming languages, far exceeding Python and SQL. Grouping, joining, order-related operations and even big data are no problem in SPL (Python is helpless when it comes to big data, and coding order-related operations in SQL will leave you completely confused). SPL also has the following characteristics: a well-designed syntax that is easy to learn and understand; direct calculation on data in Excel files, and the ability to execute SQL on files (SPL provides most of the capabilities of SQL, so you can learn SQL without installing a database); one-click installation; distinctive cell-style code that is easy to debug; and, more importantly, it is open source and free!

SPL was not invented specifically for beginners, however, but to solve the problems of SQL being hard to code and slow to run. Yet when you actually use it, you will find that it is simple and easy to use, and well suited to beginners learning to program. More importantly, what you learn in SPL can really be put into practice.

The SPL programming book contains some practical code examples, and you can find more through the first recommended link in this article, [esProc Desktop and Excel processing (2021)](https://c.scudata.com/article/1617693922993).

After learning SPL, most daily work will no longer be a problem, and you can find many examples of using SPL for everyday data tasks on Raqforum.

If you want more control over Excel, you can learn some VBA. After all, VBA is Excel's native language, and some of its capabilities are beyond the reach of any external programming language. You can call SPL from VBA to make up for VBA's weakness in processing structured data.

At this point, that is enough for non-professionals.

Finally, let's talk about possible directions for those who want to become professionals:

1. Continue to use SPL.

SPL is also suitable for calculations in industry application software (most of which involve structured data; such complex operations are inconvenient to code in SQL, and sometimes no SQL is available), as well as for big data computing (which requires learning many high-performance algorithms and is admittedly a little difficult).

2. Python

Python is a language for professionals, with a large number of AI algorithm libraries. In this regard, basically no other language is comparable to Python except the expensive MATLAB or SAS. If you are determined to become a data scientist, you probably need to learn Python for now.

3. SQL

Database development is still inseparable from SQL. Although its computing power is not strong enough, it has good compatibility and is available everywhere. After learning SPL, it is easy to understand SQL operations.

4. Java/C#

At present, most industry applications are developed in Java and C# (more in Java). Here you need to understand object orientation, which SPL does not have and which needs to be studied carefully.

5. JavaScript

Development inside the browser depends mainly on JavaScript. Its syntax is very simple; the difficulty lies in understanding the many mechanisms of the browser and the web itself.

6. C/C++

This is the language senior programmers use for system-level development, and it is far beyond the scope of this post.