How to Split Numbers Away from a String and Sort Them
Usually we sort a column of text data by comparing whole values. At times values consist of letters and numbers, like names plus numbers shown below:
Mike105
Rose11
Rose1
Rose5
Mike6
And we need to sort the column by names first and then by numbers. So, in the following expected result, Mike6 appears before Mike105:
Mike6
Mike105
Rose1
Rose5
Rose11
To write a program directly based on the text data, we need to first split the data and then concatenate and sort it. It’s inconvenient.
When data is stored in the database, we usually split it into the temporary table TData using the stored procedure and perform the sort in SQL. For example, SQL Server2012 will first define a user-defined scalar valued function dbo.words for splitting data:
CREATE FUNCTION dbo.words(@word VARCHAR(100) , @part INT)RETURNS VARCHAR(100)
AS
BEGIN
DECLARE @sub VARCHAR(100)
DECLARE @tmp VARCHAR(10)
DECLARE @numIndex INT
DECLARE @len INT
DECLARE @i INT
SET @i = 1
SET @len = Len(@word)
WHILE @i<=@len
BEGIN
SET @tmp = Substring(@word,@i,1)
IF(ISNUMERIC(@tmp)=1)
BEGIN
SET @numIndex = @i
BREAK
END
SET @i=@i+1
END
IF(@part=1)
BEGIN
SET @sub=Substring(@word,1,@numIndex-1)
END
ELSE
BEGIN
SET @sub=Substring(@word,@numIndex,@len-@numIndex+1)
END
RETURN @sub
END
In the above code parameter @part specifies the characters to be split. 1 represents the name part and 2 means the number part.
Note that the split-away numbers should be first converted into integers before they are sorted. Otherwise the result would be the same as that of sorting the strings as a whole. Perform the sort using the following SQL:
SELECT * FROM TData ORDER BY dbo.words(STR,1),cast(dbo.words(STR,2) as int)
This method of creating a user-defined database function to split data, however, isn’t universal and can’t be migrated to other databases.
The computation would be made rather convenient if we could use esProc SPL to handle it. Only a one-liner is needed:
=file("d:/data.txt").import().sort(#1.words@w())
esProc is a computing middleware. It retrieves data from a data source and then handles data in a universal logic. This makes the code capable of migrating. You can use a SPL script to sort, group, filter a structured text file and join text files. More examples can be found in Structured Text Computing with esProc.
SPL is integration-friendly with a Java program. Read How to Call an SPL Script in Java to learn details.
About how to work with esProc, read Getting Started with esProc.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL