Get a Certain String from Each Row
【Question】
From a tab-delimited text file with a variable number of columns per row, I would like to extract the value that meets a specific condition. The text file looks like this:
S1=dhs Sb=skf S3=ghw QS=ghr
S1=dhf QS=thg S3=eiq
QS=bhf S3=ruq Gq=qpq GW=tut
Sb=ruw QS=ooe Gq=qfj GW=uvd
I would like to have a result like:
QS=ghr
QS=thg
QS=bhf
QS=ooe
Please excuse my naive question, but I am a beginner trying to learn some basic bash scripting techniques for text manipulation.
【Answer】
A shell script can do this, but the code tends to be hard to read. SPL (Structured Process Language) handles it easily with set-based operations. Here is the SPL script:
| | A |
|---|---|
| 1 | =file("/file.txt").import() |
| 2 | =A1.(~.array()).union() |
| 3 | =A2.select(pos(~,"QS")) |
A1: Import the text file;
A2: Convert the field values of each record into a sequence, then union those sequences into a single sequence;
A3: Select the members of that sequence that contain the string “QS”.
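For readers who want to see the same three steps outside SPL, here is a minimal Python sketch. It is an illustration, not part of the original answer: the function name `extract_qs` and the in-memory `sample` list are hypothetical, and it assumes the fields are separated by tabs as the question states. It differs from the SPL script in two ways: it matches fields that start with `QS=` rather than any field containing "QS", and it keeps every match rather than taking a union.

```python
def extract_qs(lines, key="QS"):
    """Return every tab-separated field that starts with '<key>='.

    Hypothetical helper for illustration; assumes tab-delimited input.
    """
    prefix = key + "="
    results = []
    for line in lines:
        # Step A1/A2: split each row into its fields
        fields = line.rstrip("\n").split("\t")
        # Step A3: keep only the fields for the requested key
        results.extend(f for f in fields if f.startswith(prefix))
    return results


# Sample rows from the question, written with explicit tab separators
sample = [
    "S1=dhs\tSb=skf\tS3=ghw\tQS=ghr",
    "S1=dhf\tQS=thg\tS3=eiq",
    "QS=bhf\tS3=ruq\tGq=qpq\tGW=tut",
    "Sb=ruw\tQS=ooe\tGq=qfj\tGW=uvd",
]

if __name__ == "__main__":
    print("\n".join(extract_qs(sample)))
```

To run it against a real file, an open file object can be passed in place of `sample`, since iterating over a file yields its lines.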