9
votes

Is it possible to change multiple patterns to different values in the same command? Let's say I have

A B C D ABC

and I want to change every A to 1, every B to 2, and every C to 3

so the output will be

1 2 3 D 123

Since I have 3 patterns to change, I would like to avoid substituting them separately. I thought there would be something like

sed -r s/'(A|B|C)'/(1|2|3)/ 

but of course this just replaces A or B or C with (1|2|3). I should mention that my real patterns are more complicated than that...

thank you!

3
Why not sed 's/A/1/g;s/B/2/g;s/C/3/g' file? - anubhava
If it's single letters you can just use tr: tr 'ABC' '123' - user4453924
What is the (unwritten) constraint that rules out several s/// commands, especially on complex patterns, as @anubhava asks? - NeronLeVelu
The question is not exactly the same as the linked duplicate. The linked one is a sub-case of this: only some specific simple patterns replaced by a unique new pattern, whereas this question is more generic in its search and replace patterns. - NeronLeVelu
If you need "words" you should post an example that uses "words", not just letters, as letters are MUCH simpler to handle (tr) and the right way to handle "words" really depends on what a "word" means to you and/or what the separators between the "words" can be. As written right now, your question is extremely likely to produce a solution that works for your posted input but will fail (possibly quietly and/or cryptically and/or disastrously) later when run against different input. - Ed Morton

3 Answers

16
votes

Easy in sed:

sed 's/WORD1/NEW_WORD1/g;s/WORD2/NEW_WORD2/g;s/WORD3/NEW_WORD3/g'

You can separate multiple commands on the same line by a ;
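
Applied to the sample input from the question, this looks roughly as follows (replace_abc is just a hypothetical wrapper name for this illustration, not something sed provides):

```shell
# replace_abc: hypothetical helper that chains the three substitutions
# from the question in a single sed invocation.
replace_abc() {
    sed 's/A/1/g;s/B/2/g;s/C/3/g'
}

echo 'A B C D ABC' | replace_abc    # prints: 1 2 3 D 123
```

This works cleanly here because none of the replacement values (1, 2, 3) can be matched by a later pattern.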


Update

Probably this was too easy. NeronLeVelu pointed out that the above command can lead to unwanted results because the second substitution might even touch results of the first substitution (and so on).

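If you care about this, you can avoid the side effect with the t command. The t command branches to the end of the script, but only if a substitution has happened on the current line. Note the trade-off: as soon as one substitution succeeds, the remaining ones are skipped for that line, so a line that contains both WORD1 and WORD2 only gets WORD1 replaced. This variant is therefore only suitable when each line contains at most one of the patterns: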

sed 's/WORD1/NEW_WORD1/g;t;s/WORD2/NEW_WORD2/g;t;s/WORD3/NEW_WORD3/g'  
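
A small sketch of both behaviours, using placeholder patterns of my own (A to B, then B to C) rather than anything from the question. The commands are split with -e, which as far as I know is the more portable spelling, since not every sed is guaranteed to accept a ; directly after t:

```shell
# chained: every substitution runs, so the second one also rewrites
# the B that the first one has just produced.
chained() {
    sed -e 's/A/B/g' -e 's/B/C/g'
}

# with_t: after a successful substitution, t jumps to the end of the
# script, so the remaining substitutions are skipped for that line.
with_t() {
    sed -e 's/A/B/g' -e 't' -e 's/B/C/g'
}

echo 'A B' | chained   # prints: C C
echo 'A B' | with_t    # prints: B B
echo 'B'   | with_t    # prints: C
```

Neither variant produces B C for the input A B. If you need every pattern replaced independently within the same line, a single-pass approach with a lookup table is the safer route.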
2
votes

Easy in Perl:

perl -pe '%h = (A => 1, B => 2, C => 3); s/(A|B|C)/$h{$1}/g'

If you use more complex patterns, put the more specific ones before the more general ones in the alternative list. Sorting by length might be enough:

perl -pe 'BEGIN { %h = (A => 1, AA => 2, AAA => 3);
              $re = join "|", sort { length $b <=> length $a } keys %h; }
          s/($re)/$h{$1}/g'

To add word boundaries or line anchors, respectively, change the pattern to

/\b($re)\b/
# or
/^($re)$/
2
votes

This will work if your "words" don't contain RE metachars (. * ? etc.):

$ cat file
there is the problem when the foo is closed

$ cat tst.awk
BEGIN {
    split("the a foo bar",tmp)
    for (i=1;i in tmp;i+=2) {
        old = (i>1 ? old "|" : "\\<(") tmp[i]
        map[tmp[i]] = tmp[i+1]
    }
    old = old ")\\>"
}
{
    head = ""
    tail = $0
    while ( match(tail,old) ) {
        head = head substr(tail,1,RSTART-1) map[substr(tail,RSTART,RLENGTH)]
        tail = substr(tail,RSTART+RLENGTH)
    }
    print head tail
}

$ awk -f tst.awk file
there is a problem when a bar is closed

The above obviously maps "the" to "a" and "foo" to "bar" and uses GNU awk for word boundaries.

If your "words" do contain RE metachars etc. then you need a string-based solution using index() instead of an RE based one using match() (note that sed ONLY supports REs, not strings).
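
A minimal sketch of what such an index()-based approach could look like. This is my own illustration under stated assumptions, not a tested general solution: the mapping ("a.b" to "X", "c*" to "Y") and the name literal_replace are hypothetical, the old strings contain no spaces, and no word boundaries are enforced. At each step it picks the leftmost occurrence of any old string, preferring the longer one on a tie:

```shell
# literal_replace: hypothetical helper; treats the old strings as plain
# text, so RE metachars such as . and * have no special meaning.
literal_replace() {
    awk '
    BEGIN {
        # old/new pairs, space separated (assumed mapping for this demo)
        n = split("a.b X c* Y", tmp, " ")
        for (i = 1; i < n; i += 2) {
            olds[(i + 1) / 2] = tmp[i]
            news[(i + 1) / 2] = tmp[i + 1]
        }
        cnt = n / 2
    }
    {
        head = ""
        tail = $0
        while (tail != "") {
            best = 0; bestk = 0
            # find the leftmost old string; on a tie prefer the longer one
            for (k = 1; k <= cnt; k++) {
                p = index(tail, olds[k])
                if (p && (!best || p < best || (p == best && length(olds[k]) > length(olds[bestk])))) {
                    best = p; bestk = k
                }
            }
            if (!best) break
            head = head substr(tail, 1, best - 1) news[bestk]
            tail = substr(tail, best + length(olds[bestk]))
        }
        print head tail
    }'
}

printf '%s\n' 'a.b axb c* cc' | literal_replace   # prints: X axb Y cc
```

Note that axb and cc are left alone, which an RE-based match() on the same strings would not guarantee. Because replaced text is moved into head and never rescanned, a replacement can never be touched by a later pattern.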