227
votes

How can I replace foobar with foo123bar?

This doesn't work:

>>> re.sub(r'(foo)', r'\1123', 'foobar')
'J3bar'

This works:

>>> re.sub(r'(foo)', r'\1hi', 'foobar')
'foohibar'

I think this is a common issue whenever a backreference like \number is followed directly by more digits. Can anyone give me a hint on how to handle this?

This question has been added to the Stack Overflow Regular Expression FAQ, under "Groups". - aliteralmind
This question took me quite a long time to find, because it doesn't feature the terms 'capture group' or 'numbered group reference', but I got here eventually and I'm glad you asked it. - Mark Ch
Your issue is that r'\112' is being interpreted as the octal escape 0112, i.e. ASCII 'J' (decimal 74). I can't see how to force the backreference '\1' to be evaluated before string concatenation or ''.join(). - smci
A small deviation from the question: is there any way to refer to all group matches, i.e. r'\<for all matches>hi'? - user11370656

1 Answer

379
votes

The answer is:

re.sub(r'(foo)', r'\g<1>123', 'foobar')

Relevant excerpt from the docs:

In addition to character escapes and backreferences as described above, \g<name> will use the substring matched by the group named name, as defined by the (?P<name>...) syntax. \g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'. The backreference \g<0> substitutes in the entire substring matched by the RE.
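
To make the excerpt concrete, here is a minimal sketch that runs the ambiguous replacement from the question next to the three unambiguous \g<...> forms. The group name prefix and the variable names are illustrative choices, not part of the original question.

```python
import re

# Ambiguous: the digits after the backslash are read together, so this is
# NOT "group 1 followed by 123" (the question reports 'J3bar' here).
ambiguous = re.sub(r'(foo)', r'\1123', 'foobar')

# Unambiguous numbered reference: \g<1> ends at the closing '>'.
by_number = re.sub(r'(foo)', r'\g<1>123', 'foobar')

# Same idea with a named group ("prefix" is a hypothetical name).
by_name = re.sub(r'(?P<prefix>foo)', r'\g<prefix>123', 'foobar')

# \g<0> refers to the whole match, so no capturing group is needed.
whole_match = re.sub(r'foo', r'\g<0>123', 'foobar')

print(ambiguous)
print(by_number, by_name, whole_match)
```

The last three calls should all produce foo123bar. If the replacement logic grows beyond simple reordering, passing a function as the replacement argument to re.sub is another option worth considering.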