blackduckie RaqForum 28 No.
489 View • 5 Years ago

Compile a Certain Column from a File Group into a Single File

【Question】

I have a large number of files with the same, tab-delimited format:

Column A Column B

Data_A1 Data_B1

Data_A2 Data_B2

Data_A3 Data_B3

These files all have the same number of lines.

I want to compile every file's Column B into a single tab-delimited file. Right now, my best plan is to write a Perl script along these lines:

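#!/usr/bin/perl
use strict;
use warnings;

my $file = shift @ARGV;
my $ref = shift @ARGV;

my @B;
open(my $in, '<', $file) or die "Cannot open $file: $!"; # FILE WITH FORMAT DESCRIBED ABOVE
while (<$in>) {
    chomp;
    my @a = split(/\t/, $_);
    push(@B, $a[1]);
}
close $in;

my $counter = 0;
# TAB-DELIMITED COMPILATION OF EVERY FILE'S COLUMN B
if (open(my $ref_fh, '<', $ref)) {
    while (<$ref_fh>) {
        chomp;
        print "$_\t$B[$counter]\n";
        $counter++; # move on to the next Column B value
    }
    close $ref_fh;
}
# If the compilation does not exist yet (first run), start it from this file's Column B
print "$_\n" for @B[$counter .. $#B];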

Then, write a Bash script that loops through all the files and saves the output of the Perl script as its input for the next iteration of the shell loop:

#!/bin/bash
for file in *.txt
do
    perl Script.pl "$file" Infile > Temp
    mv Temp Infile
done

But this feels like a huge amount of work for something so simple. Is there a simple Unix command that can do the same thing?

Expected Output:

File1_Column_B File2_Column_B File3_Column_B ...

Data_B1 Data_B1 Data_B1 ...

Data_B2 Data_B2 Data_B2 ...

Data_B3 Data_B3 Data_B3 ...

...

【Answer】

Your question involves order-based operations, especially when compiling the field names into the specified format. Here we handle it with SPL (Structured Process Language). Below is the SPL script:

	A
1	=directory@p("data")
2	=A1.(file(~).import(Column_B))
3	=join@p(${A2.("A2("+string(#)+")").concat(";")})
4	=file("/result.txt").export@t(A3,${A1.(filename@n(~)).("_"+string(#)+":"+~+"_Column_B").concat@c()})

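A1: List the files in the data directory, with their full paths.

A2: Import Column_B from each file.

A3: Join these Column_B columns by row position.

A4: Write A3's result to the target file, renaming the fields to the File1_Column_B, File2_Column_B, ... pattern.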
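
For readers who want the plain Unix route the question asks about, the same idea can be sketched with the standard cut and paste utilities. This is only a minimal sketch, not part of the SPL answer: the function name compile_column_b and the output file name are hypothetical, the files are assumed to be tab-delimited with the same number of lines, and the header row is copied as-is rather than renamed to File1_Column_B and so on.

```shell
#!/bin/bash
# Hypothetical helper (not from the original answer):
#   compile_column_b OUT FILE...
# Writes column 2 of every FILE side by side, tab-separated, into OUT.
compile_column_b() {
    local out=$1
    shift
    # Refuse to run without input files; paste would otherwise read stdin.
    [ "$#" -gt 0 ] || return 1

    local tmpdir
    tmpdir=$(mktemp -d) || return 1

    local cols=() i=0 f col
    for f in "$@"; do
        i=$((i + 1))
        # One temporary file per input, holding only its second column.
        col=$(printf '%s/%06d' "$tmpdir" "$i")
        cut -f2 -- "$f" > "$col" || { rm -rf "$tmpdir"; return 1; }
        cols+=("$col")
    done

    # paste joins the files line by line, using a tab as the separator.
    paste "${cols[@]}" > "$out"
    local status=$?
    rm -rf "$tmpdir"
    return $status
}
```

It could be called as compile_column_b compiled.tsv *.txt, where the output name is chosen so that it does not match the *.txt pattern. With a very large number of files, the system's limits on open files or command-line length may apply, so treat this as a starting point.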

SPL Official Website 👉 https://www.scudata.com

SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL

SPL Learning Material 👉 https://c.scudata.com

SPL Source Code and Package 👉 https://github.com/SPLWare/esProc

Discord 👉 https://discord.gg/2bkGwqTj

Youtube 👉 https://www.youtube.com/@esProc_SPL
