1
votes

Here is my command line:

find . -type f -exec file {} \; \
| sed 's/\(.*png\): .* \([0-9]* x [0-9]*\).*/\2 \1/' \
| sed 's/\(.*jpg\): .* \([0-9]*x[0-9]*\).*/\2 \1/' \
| awk 'int($1) < 1000' \
| sed 's/^.*[[:blank:]]//' \
| tar -czvf images.tar.gz --null -T -

And the error I got is:

tar: Unix\n./test.png\n./test2.jpg\n: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors

What I want is to find all images in the current directory whose width is less than 1000 px and tar them into an archive.

3
Why did you add --null to the tar command? - melpomene
The error message includes \n, which is almost always the newline character. Weird that it is in there; nothing in your code seems to be creating it. Are you sure the error message matches the code? Also, you can probably get by with find ... | awk ... | tar ... since you can do multiple substitutions in one instance of awk and print/test $2 instead of $1 (and other non-optimal stuff for a later time). Presumably you built this command up one addition at a time? If not, go back, add one pipe stage at a time, and study how the output changes from the previous stage. And why not find -name '*.jpg' -o -name '*png'? - shellter
@shellter: because of --null, tar expects filenames to be \0-separated. - gniourf_gniourf
@gniourf_gniourf : Are you replying to melpomene ? - shellter
@shellter: no, to you :). Good luck ;). - gniourf_gniourf

3 Answers

5
votes

to use --null, you need to convert newlines to nulls first:

...
| tr '\n' '\0' \
| tar -czvf images.tar.gz --null -T -

(tested, working.)
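
to see the conversion in isolation, here is a minimal sketch. it assumes a scratch directory containing two empty placeholder files named like the ones in the error message, and a tar that accepts --null -T - (GNU tar does):

```shell
#!/bin/sh
set -eu

# scratch directory with two placeholder files (names taken from the error message)
workdir=$(mktemp -d)
cd "$workdir"
touch test.png test2.jpg

# newline-separated names are converted to NUL-separated ones,
# which is what tar expects once --null is given
printf '%s\n' ./test.png ./test2.jpg |
  tr '\n' '\0' |
  tar -czf images.tar.gz --null -T -

# list what ended up in the archive
archived=$(tar -tzf images.tar.gz)
printf '%s\n' "$archived"
```

without the tr step, tar sees the whole newline-joined list as a single file name, which is why the error shows the names glued together with \n.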

also, here are a number of suggestions on speed and style in decreasing order of importance.

a. don't find and run file on more files than you need to:

find . -type f \( -iname "*.png" -or -iname "*.jpg" \)

b. for commands that can run on multiple files per command, such as file, use xargs to save a lot of time:

find . -type f \( -iname "*.png" -or -iname "*.jpg" \) -print0 | xargs -0 file

c. if you put | at the end of each line, you can continue on the next line without also using \.

find . -type f \( -iname "*.png" -or -iname "*.jpg" \) -print0 |
  xargs -0 file

d. since your max width is 999, you can save yourself a lot of trouble by just grepping for 1-, 2-, or 3-digit widths, though awk '$1<1000' is ultimately better in case you ever want to use a different threshold:

find . -type f \( -iname "*.png" -or -iname "*.jpg" \) -print0 |
  xargs -0 file |
  grep ', [0-9][0-9]\?[0-9]\? \?x \?[0-9]'

e. grep and awk are faster than sed, so use them where possible:

find . -type f \( -iname "*.png" -or -iname "*.jpg" \) -print0 |
  xargs -0 file |
  grep ', [0-9][0-9]\?[0-9]\? \?x \?[0-9]' |
  grep -o -i '.*\.\(png\|jpg\)'
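
one thing to watch when combining -or with an explicit action such as -print0: find's implicit "and" binds tighter than -or, so the name tests need to be grouped with \( \) for -type f and -print0 to apply to both patterns. a small sketch, assuming a scratch directory with three placeholder files and a find that supports -iname and -print0 (GNU and BSD find do):

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"
touch a.png b.jpg c.txt

# ungrouped: parsed as (-type f -iname "*.png") -or (-iname "*.jpg" -print0),
# so -print0 only fires for the .jpg branch
ungrouped=$(find . -type f -iname "*.png" -or -iname "*.jpg" -print0 |
  tr '\0' '\n' | LC_ALL=C sort)

# grouped: -type f and -print0 apply to both name patterns
grouped=$(find . -type f \( -iname "*.png" -or -iname "*.jpg" \) -print0 |
  tr '\0' '\n' | LC_ALL=C sort)

echo "ungrouped:"; printf '%s\n' "$ungrouped"
echo "grouped:";   printf '%s\n' "$grouped"
```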

final command:

find . -type f \( -iname "*.png" -or -iname "*.jpg" \) -print0 |
  xargs -0 file |
  grep ', [0-9][0-9]\?[0-9]\? \?x \?[0-9]' |
  grep -o -i '.*\.\(png\|jpg\)' |
  tr '\n' '\0' |
  tar -czvf images.tar.gz --null -T -
2
votes

You can also do it with awk alone (GNU awk is needed for the three-argument match()):

find . -type f \( -name "*.png" -or -name "*.jpg" \)  -exec file {} \; | awk -v width_limit=1000 '
    {
        match($0, /,\s+([0-9]+)\s*x\s*([0-9]+)/, items)

        if (items[1] < width_limit){
            match($0, /(.*):/, filename)
            print filename[1]
        }             
    }' | tar -czvf allfiles.tar.gz -T -

The width limit can be configured with the width_limit variable.
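
If GNU awk is not available, the same filter can be sketched with POSIX awk features only. This is a rough sketch rather than a drop-in replacement: narrow_images is a hypothetical helper name, the sample lines are assumed typical file output (PNG sizes printed as "W x H", JPEG sizes as "WxH"), and file names containing ": " or newlines are not handled.

```shell
#!/bin/sh
set -eu

# narrow_images LIMIT (hypothetical helper): reads `file`-style lines on stdin
# and prints the names of images whose width is below LIMIT
narrow_images() {
  awk -v limit="$1" '
    {
      # split "name: description" at the first ": "
      sep = index($0, ": ")
      if (sep == 0) next
      name = substr($0, 1, sep - 1)
      desc = substr($0, sep + 2)

      # find ", <width> x <height>" or ", <width>x<height>"
      if (match(desc, /, [0-9]+ ?x ?[0-9]+/)) {
        dims = substr(desc, RSTART + 2, RLENGTH - 2)
        width = dims + 0            # numeric prefix of dims is the width
        if (width < limit + 0) print name
      }
    }'
}

# assumed sample output of `file`, not captured from a real run
sample='./test.png: PNG image data, 640 x 480, 8-bit/color RGBA, non-interlaced
./wide.png: PNG image data, 1920 x 1080, 8-bit/color RGB, non-interlaced
./test2.jpg: JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment length 16, baseline, precision 8, 800x600, components 3'

result=$(printf '%s\n' "$sample" | narrow_images 1000)
printf '%s\n' "$result"
```

In a real pipeline the sample would be replaced by the output of find ... -exec file {} +, and the result piped into tar -T -.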

1
votes

Quick way using perl:

find . -type f -exec file {} + |
    perl -ne '
        print $1."\0" if /^(.*):\s*(JPEG|PNG).*,\s*(\d+)\s*x\s*\d+\s*,/ &&
             $3 < 1000;
        ' | tar -czvf images.tar.gz --null -T -

Terminating -exec with + instead of \; has the same effect as -print0 | xargs -0: file is run on many files at once.
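
A quick way to see the batching, sketched with echo standing in for file and three placeholder files in a scratch directory (both are assumptions for the demo):

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"
touch a.png b.png c.jpg

# \; runs the command once per file, so echo prints one line per file
per_file=$(find . -type f -exec echo {} \; | wc -l | tr -d ' ')

# + passes as many names as fit to a single invocation,
# so echo prints all three names on one line
batched=$(find . -type f -exec echo {} + | wc -l | tr -d ' ')

echo "invocations with \\;: $per_file, with +: $batched"
```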