Here is an example of markdown text:

# "My title"

!Image caption.{ style="float:right; width: 20%; border: 1px"}

Some "quoted text", some *"emphasized text"*, etc.

In a bash script, I'm trying to replace all double quotes with French quotes (guillemets).

For example: "word" should become « word ».

In other words, every quote before a word should be replaced with an opening French quote followed by a non-breaking space, and every quote after a word should be replaced with a non-breaking space followed by a closing French quote, EXCEPT when the quote is inside curly brackets.

So the previous markdown text should be converted like this:

# « My title »

!Image caption.{ style="float:right; width: 20%; border: 1px"}

Some « quoted text », some *« emphasized text »*, etc.

What I've tried

I currently use the following sed commands in my script:

# Replace "word by « word
sed -i -Ee "/(^|\s|\(|\[)\"/ s//\1« /g" myfile.md
# Replace word" by word »
sed -i -Ee "/(\S)\"/ s//\1 »/g" myfile.md

Of course, the problem is that these commands replace all the quotes, even those inside curly brackets.

So my question is: which regex could replace double quotes with French quotes, except inside curly brackets?

Better use a programming language whose regex engine supports (*SKIP)(*FAIL), e.g. PHP or the PyPI regex module. See regex101.com/r/1C9lFg/1 and regex101.com/r/1C9lFg/2 - Jan
You could use (?:"(?=.*?{.*?})|"(?=[^}]*?$)) to select all quotes outside brackets, but since French quotes differ on the left and right, it seems you'd need something more sophisticated. - JvdV

1 Answer

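With awk (this relies on an empty field separator splitting each line into single characters, which GNU awk supports but POSIX leaves unspecified):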

awk -F "" '
{
  for (i=1;i<=NF;i++){                    # loop over all fields/characters in line
    if ($i=="{") brace++                  # increment counter if `{` found
    if ($i=="}") brace--                  # decrement counter if `}` found
    if (brace==0 && $i=="\""){            # if counter is zero and char is a quote
      printf "%s", (cquote ? " »" : "« ") # print closing or opening french quote
      cquote=!cquote                      # toggle flag
      continue                            # continue with next character
    }
    printf "%s", $i                       # print character
  }
  print ""                                # print newline
}' file

Output:

# « My title »

!Image caption.{ style="float:right; width: 20%; border: 1px"}

Some « quoted text », some *« emphasized text »*, etc.
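
The awk answer above can also be written without the empty field separator, and with the non-breaking spaces the question asks for spelled out explicitly. This is a sketch rather than a drop-in replacement: it assumes UTF-8 output, the function name french_quotes is made up for illustration, and unlike the original it ignores a stray `}` instead of letting the counter go negative.

```sh
# Hypothetical helper: reads the files given as arguments, or stdin.
french_quotes() {
  awk '
  BEGIN {
    open_q  = "\302\253\302\240"   # UTF-8 bytes: opening guillemet + U+00A0
    close_q = "\302\240\302\273"   # UTF-8 bytes: U+00A0 + closing guillemet
  }
  {
    out = ""
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)                       # walk the line one character at a time
      if (c == "{") depth++                      # entering a brace block
      else if (c == "}" && depth > 0) depth--    # leaving one (stray } is ignored)
      if (c == "\"" && depth == 0) {
        out = out (closing ? close_q : open_q)   # quote outside braces: convert it
        closing = !closing                       # next quote is the other kind
      } else {
        out = out c                              # everything else is copied as is
      }
    }
    print out
  }' "$@"
}
```

Usage would look like `french_quotes myfile.md > converted.md`. As in the original answer, the brace depth and the open/close flag carry over from one line to the next, so an unbalanced quote on one line shifts every quote after it.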