I have 45 directories on my drive, named Sub1, Sub2, ..., Sub45. Each contains more than 300 text files, and the text files in every directory follow the same naming format:

regional_vol_GM1.txt
regional_vol_GM2.txt
regional_vol_GM*.txt 

I would like to sort the directories, and the text files within each directory, in sequential (numeric) order and export the data from each file into a CSV file.

The following is the script that I have written:

    eval "dirs=($(ls -v --quoting-style=shell-always))"
    for dir in "${dirs[@]}"; do
      eval "files=($(
        ls -vd --quoting-style=shell-always -- "$dir"/t1/regional_vol*.txt))"
      tail -q -n 1 -- "${files[@]}" | paste -sd , -
    done > data.csv

Now I would like to remodel my output CSV file so that the text file names become the row values and the directory names become the columns. Since every directory has 300 text files with the same naming format, I just need one single row with the file names as the header and the directory names as the columns in the CSV file.

1 Answer

There's a / in x, and thus in your sed expression, where it collides with the / delimiter. Change the sed separator to something unlikely to occur in x, such as #:

sed -i "1s#^#${x}\n#" "${x}"

To edit the file in place, use the -i option (if your sed does not support it, write to a temp file and move it back over the original file).
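
As a minimal sketch of the delimiter change, the snippet below prepends a file's own path as its first line. The scratch directory, the file name, and the sample value 0.42 are hypothetical, and it assumes GNU sed (for -i and for \n in the replacement) and a path that contains no # character:

```shell
# Hypothetical setup: a scratch file whose path contains slashes.
tmpdir=$(mktemp -d)
x="$tmpdir/regional_vol_GM1.txt"
printf '0.42\n' > "$x"

# With '#' as the delimiter, the slashes inside $x are ordinary text.
# Assumes GNU sed: -i edits in place, \n in the replacement is a newline.
sed -i "1s#^#${x}\n#" "$x"

# The path is now the first line; the original data follows it.
first_line=$(head -n 1 "$x")
line_count=$(wc -l < "$x")
echo "$first_line"
```

With the default / delimiter the same command would fail, because sed would read the first slash of the path as the end of the replacement.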

Now for your file sort: the problem is that wildcard expansion, and plain ls as well, sorts file names alphabetically, so regional_vol_GM2.txt comes after regional_vol_GM100.txt.

So even if it's a bit of a hack you could replace this:

tail -q -n 1 "$dir"/t1/regional_vol*.txt

with this:

(cd "$dir"/t1 && tail -q -n 1 $(ls -1 regional_vol_GM*.txt | sort -k2 -tM -n))

Why it works:

  • sort runs in numeric mode (-n) on the second field (-k2), with M as the field delimiter (-tM), so the sort key starts at the digits that come right after _GM.
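
To see the difference, here is a small sketch that sorts four sample names both ways. The names are made up to match the pattern in the question, and LC_ALL=C is set only so the alphabetical result does not depend on the locale:

```shell
# Sample names following the question's pattern (hypothetical list).
names='regional_vol_GM1.txt
regional_vol_GM100.txt
regional_vol_GM2.txt
regional_vol_GM10.txt'

# Plain byte-wise sort: GM100 lands before GM2.
alpha_sorted=$(printf '%s\n' "$names" | LC_ALL=C sort)

# Numeric sort on the field after the M: 1, 2, 10, 100.
numeric_sorted=$(printf '%s\n' "$names" | sort -t M -k 2 -n)

printf 'alphabetical:\n%s\n\nnumeric:\n%s\n' "$alpha_sorted" "$numeric_sorted"
```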

Why it's a hack:

  • it relies on parsing the output of ls, which is generally frowned upon. Here it's a simple one-column ls and there are no spaces in your file names, so it should be OK
  • it has to perform a cd in case there's an M in the directory path, which would make sort pick the wrong field

What you should do to simply fix that:

  • generate your files with zero padding (or ask the people who generate them to do so): 1 becomes 001, 2 becomes 002, etc. Then plain alphanumerical sorting works and there is no need for the complex sort hack.
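
If the files already exist, a one-off rename can add the padding. The following is only a sketch under some assumptions: the scratch directory and sample files are hypothetical, the existing numbers are not already zero-padded (printf would read a leading 0 as octal), and three digits are enough:

```shell
# Hypothetical scratch directory with unpadded sample files.
tmpdir=$(mktemp -d)
touch "$tmpdir"/regional_vol_GM1.txt "$tmpdir"/regional_vol_GM2.txt \
      "$tmpdir"/regional_vol_GM10.txt "$tmpdir"/regional_vol_GM100.txt

for f in "$tmpdir"/regional_vol_GM*.txt; do
  base=${f##*/}              # file name without the directory
  n=${base#regional_vol_GM}  # strip the prefix
  n=${n%.txt}                # strip the extension, leaving the number
  new=$tmpdir/$(printf 'regional_vol_GM%03d.txt' "$n")
  # Skip names that are already three digits wide.
  [ "$f" = "$new" ] || mv -- "$f" "$new"
done

# A plain glob now expands in numeric order.
padded_listing=$(cd "$tmpdir" && printf '%s\n' regional_vol_GM*.txt)
echo "$padded_listing"
```

After the rename, the original tail -q -n 1 "$dir"/t1/regional_vol*.txt picks the files up in the intended order without any extra sorting.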