9
votes

I have a portion of a bash script that is getting a filename without extension, but I'm trying to understand what is really going on here. What are the "%%"'s for? Can someone elaborate on what bash is doing behind the scenes? How can this technique be used on a general basis?

#!/bin/bash

for src in *.tif
do
    txt=${src%%.*}
    tesseract "${src}" "${txt}"
done
5
Take note that %% removes everything from the first '.' in the filename onward, not just the so-called extension. For instance, if you had hello.world.tif and hello.death.tif, tesseract would send both to the same destination, hello. If you want to remove just the extension, use % instead. - johnny
Because with multiple "."s in the file name you want to strip the "shortest matching pattern" from the right side of the string. Makes sense to me. So in my script above I should change it to "%", which strips only the filename extension. - jjclarkson
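
The difference described in these comments can be checked directly in a shell. A minimal sketch, using the hello.world.tif name from the comment above as sample input (the variable names short and long are just for illustration):

```shell
#!/bin/bash
src="hello.world.tif"

# % deletes the shortest trailing match of .* (only the last extension)
short=${src%.*}

# %% deletes the longest trailing match of .* (everything from the first dot)
long=${src%%.*}

echo "$short"   # hello.world
echo "$long"    # hello
```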

5 Answers

15
votes

It strips the filename extension (here: .tif) by deleting everything from the first '.' onward. For example:

$ for A in test.py test.sh test.xml test.xsl; do echo "$A: ${A%%.*}"; done
test.py: test
test.sh: test
test.xml: test
test.xsl: test

From the bash manual:

   ${parameter%%word}
          The word is expanded to produce a pattern just as in pathname expansion.  If the
          pattern matches a trailing portion of the expanded value of parameter, then  the
          result  of  the  expansion  is the expanded value of parameter with the shortest
          matching pattern (the ``%'' case) or the longest matching  pattern  (the  ``%%''
          case) deleted.  If parameter is @ or *, the pattern removal operation is applied
          to each positional parameter in turn, and the expansion is the  resultant  list.
          If  parameter  is an array variable subscripted with @ or *, the pattern removal
          operation is applied to each member of the array in turn, and the  expansion  is
          the resultant list.
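
The last sentence of the manual excerpt, about array variables, is easy to miss. As a rough illustration (the file names here are made up), the removal is applied to each element in turn when the array is subscripted with @:

```shell
#!/bin/bash
# Hypothetical file names, chosen only for illustration
files=(scan1.tif scan2.final.tif)

# Apply %.* to every element; each one loses its shortest trailing ".something"
stems=("${files[@]%.*}")

printf '%s\n' "${stems[@]}"
# scan1
# scan2.final
```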
4
votes

Here's the relevant entry from the bash man page:

 ${parameter%%word}
          The word is expanded to produce a pattern just  as  in  pathname
          expansion.   If  the  pattern  matches a trailing portion of the
          expanded value of parameter, then the result of the expansion is
          the expanded value of parameter with the shortest matching
          pattern (the ``%'' case) or the longest matching pattern (the
          ``%%''  case)  deleted.   If  parameter  is  @ or *, the pattern
          removal operation is applied to  each  positional  parameter  in
          turn,  and the expansion is the resultant list.  If parameter is
          an array variable subscripted with @ or *, the  pattern  removal
          operation  is  applied  to each member of the array in turn, and
          the expansion is the resultant list.
3
votes

Apparently bash has several "Parameter Expansion" tools which include:

Simply substituting the value...

${parameter}

Expanding to a sub-string...

${parameter:offset}
${parameter:offset:length}

Substituting the length of the parameter's value...

${#parameter}

Removing a matching pattern from the beginning of the parameter's value (shortest match with #, longest with ##)...

${parameter#word}
${parameter##word}

Removing a matching pattern from the end of the parameter's value (shortest match with %, longest with %%)...

${parameter%word}
${parameter%%word}

Replacing the first match of a pattern in the parameter's value with a string...

${parameter/pattern/string}

This is my interpretation of the parts I think I understand from this section of the man page. Let me know if I missed something important.
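
To make the list above concrete, here is a small sketch that runs each form against one sample value. The value photo.backup.tif and the variable name are made up for illustration; the substring and replacement forms are bash features rather than plain POSIX sh.

```shell
#!/bin/bash
# Sample value, made up for illustration
name="photo.backup.tif"

echo "${name}"            # photo.backup.tif  (plain substitution)
echo "${name:6}"          # backup.tif        (substring from offset 6)
echo "${name:0:5}"        # photo             (offset 0, length 5)
echo "${#name}"           # 16                (length of the value)
echo "${name#*.}"         # backup.tif        (shortest match removed from the front)
echo "${name##*.}"        # tif               (longest match removed from the front)
echo "${name%.*}"         # photo.backup      (shortest match removed from the end)
echo "${name%%.*}"        # photo             (longest match removed from the end)
echo "${name/backup/old}" # photo.old.tif     (first match of the pattern replaced)
```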

1
votes

Check out "Parameter Expansion" in the bash man pages. That syntax expands the $src variable, deleting the longest trailing portion that matches the .* pattern (a literal dot followed by anything).

1
votes

It's a string removal operation in the format: ${str%%substr}

Where str is the string you are operating on, and substr is the pattern to match. It looks for the longest match of substr anchored at the end of str and removes it. With the pattern .*, that means everything from the first '.' onward.
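
As for using this on a more general basis: the same removal operators cover the common job of taking a path apart. A sketch, assuming a made-up path and illustrative variable names:

```shell
#!/bin/bash
# Hypothetical path, used only as sample input
path="/scans/2009/page.01.tif"

file=${path##*/}   # longest leading match of */ removed   -> page.01.tif
dir=${path%/*}     # shortest trailing match of /* removed -> /scans/2009
ext=${file##*.}    # longest leading match of *. removed   -> tif
stem=${file%.*}    # shortest trailing match of .* removed -> page.01

echo "$dir | $file | $stem | $ext"
```

Note that stem uses % rather than %%, so a name with several dots keeps everything except its final extension.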