0
votes

Originally, I had only one file to read, counting the rows that satisfy a condition. Here is the code:

awk -vid="$name" -F',' 'BEGIN{counter=0;}{if($15=="true"){counter++}}END{print id,counter}' file1

This code worked properly.

But now I have to read two files. The only difference between them is that file2 has one extra column: file1 has 16 columns and file2 has 17. A row in file1 may also exist in file2 (plus the extra column).

Possible cases:

  1. a row in file1 also exists in file2 (disregarding the extra column)
  2. a row in file1 is different from every row in file2
  3. a row in file1 and a row in file2 are the same except for the 15th column (true or false, as tested by the condition in the code above).

Now, my problem is that if I add file2 to the code above, like this...

awk -vid="$name" -F',' 'BEGIN{counter=0;}{if($15=="true"){counter++}}END{print id,counter}' file1 file2

awk will count the same entry twice if that entry exists in both files.

Question: Is there any way to detect the duplicates other than merging file1 and file2?

1
Please post some samples from both files and the expected output. - James Brown

1 Answer

0
votes

You may want to say something like:

awk -vid="$name" -F',' 'BEGIN{counter=0;}{if($15=="true" && !done[$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14]++){counter++}}END{print id,counter}' file1 file2
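
As a rough sketch of what the `done[...]` check does, here is a run on made-up data. The sample rows, the `demo` name and the temporary directory are all assumptions for illustration, since no sample input was posted. The key is built in a loop with `SUBSEP`, which is what awk does internally for a comma-separated subscript such as `done[$1,$2]`. Because `!done[key]++` is only evaluated for rows whose 15th column is "true", a row that is false in file1 and true in file2 (case 3) is counted once; whether that is what you want depends on your requirements.

```shell
name=demo                          # hypothetical id, stands in for $name
dir=$(mktemp -d)
mid='b,c,d,e,f,g,h,i,j,k,l,m,n'    # columns 2..14, identical in every sample row

# file1: 16 columns per row
printf '%s\n' "r1,$mid,true,p" "r2,$mid,false,p" "r3,$mid,true,p" > "$dir/file1"
# file2: 17 columns per row (assumed here: the extra column comes last)
printf '%s\n' "r1,$mid,true,p,x" "r2,$mid,true,p,x" "r4,$mid,true,p,x" > "$dir/file2"

# Original approach: r1 is "true" in both files, so it is counted twice.
naive=$(awk -v id="$name" -F',' '
  $15 == "true" { counter++ }
  END { print id, counter + 0 }' "$dir/file1" "$dir/file2")

# Deduplicated approach: count a "true" row only the first time its
# first 14 columns are seen.
deduped=$(awk -v id="$name" -F',' '
  $15 == "true" {
    key = $1
    for (i = 2; i <= 14; i++) key = key SUBSEP $i
    if (!done[key]++) counter++
  }
  END { print id, counter + 0 }' "$dir/file1" "$dir/file2")

echo "$naive"      # demo 5
echo "$deduped"    # demo 4
rm -rf "$dir"
```

Note that columns 16 and 17 are not part of the key, so two rows that differ only there are treated as the same row.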

BTW, the fragment "END{print name,counter}" in the original code is presumably a typo for "END{print id,counter}", isn't it?