3
votes

Let's say I want to replace any number of repeats of string 1 with an equal number of repeats of string 2, using regular expressions. For example, string 1 = "apple", string 2 = "orange".

I imagine something like this:

s/(apple){2,}/(orange){N}/

but I don't know how to specify the N to match the number of repeats of apple. Is that even possible?

Note: as pointed out by xhienne, I am looking for repeats, i.e. at least two consecutive occurrences of string 1.

Sample input:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. apple Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. appleappleapple Excepteur sint occaecat cupidatat non proident, appleappleappleapple sunt in culpa qui officia deserunt mollit anim id est laborum.

Sample output:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. apple Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. orangeorangeorange Excepteur sint occaecat cupidatat non proident, orangeorangeorangeorange sunt in culpa qui officia deserunt mollit anim id est laborum.

2
Can you provide some test cases? - vrintle
Why not just s/apple/orange/g? - melpomene
Judging from your example, there seems to be a constraint you had not stated explicitly: is this really "any number of repeats" or "any number of two or more repeats"? The former can be simplified, like melpomene said. The latter is much more complex. - xhienne
@plusik Regex implementations differ from engine to engine, you should always add your tool/language. If you plan to use it in Vim, add this tag. Now, what is the real use case? Provide a sample text and expected output. - Wiktor Stribiżew
@WiktorStribiżew I am aware that the regex syntax differs from engine to engine, but that is irrelevant. I am asking whether this is possible in principle. If you can provide a solution in any regex engine, I'll be happy. Added a sample input and output as requested. - plusik

2 Answers

3
votes

A possible solution is to use a regex engine that supports the \G operator (PCRE, Perl, .NET, Java and Ruby do) and replace each match with orange:

(?:\G(?!\A)|(?=(?:apple){2}))apple


Details

  • (?:\G(?!\A)|(?=(?:apple){2})) - a non-capturing group that matches either of the two alternatives:
    • \G(?!\A) - the end of the previous successful match (\G also matches at the start of the string, so the (?!\A) lookahead excludes that position)
    • | - or
    • (?=(?:apple){2}) - a location in the string that is immediately followed by two occurrences of the apple substring
  • apple - an apple substring.

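Note that the regex does not need to count anything: it finds a place where the string occurs at least twice in a row, then matches each consecutive, adjoining occurrence. Replacing every single match with orange therefore yields the same number of repeats.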
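For engines without \G, the same "no counting" idea can be sketched differently. Python's standard re module does not support \G, so the pattern below is a swapped-in technique, not the regex above: it matches any apple that is directly followed or directly preceded by another apple, which is exactly the set of occurrences belonging to a run of two or more. The function name replace_repeats and the sample string are made up for illustration.

```python
import re

# Assumption: Python's standard `re` module, which has no \G support.
# Swapped-in technique: an "apple" belongs to a run of 2+ exactly when it
# is directly followed by, or directly preceded by, another "apple".
RUN_MEMBER = re.compile(r"apple(?=apple)|(?<=apple)apple")


def replace_repeats(text: str) -> str:
    """Replace every 'apple' that is part of a run of two or more."""
    # Each match is a single 'apple', so the number of repeats is preserved.
    return RUN_MEMBER.sub("orange", text)


sample = "one apple, three appleappleapple, four appleappleappleapple"
print(replace_repeats(sample))
# prints: one apple, three orangeorangeorange, four orangeorangeorangeorange
```

The lookbehind inspects the original input rather than the already-replaced text, which is why the last apple of each run still matches.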

1
votes

Since this problem initially arose while you were using vim (which doesn't support the \G operator used by Wiktor Stribiżew in his answer), here is an answer specifically for vim:

:s/\(apple\)\{2,\}/\= substitute(submatch(0), "apple", "orange", "g")/g

(Of course, this is not a pure regex solution, since it relies on a Vim function to perform a second substitution within the matched text.)
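The same two-step approach (match the whole run, then build the replacement from it) carries over to any language whose substitution function accepts a callback. Here is a minimal sketch in Python, assuming the standard re module; the helper name replace_runs and its parameters are hypothetical, not part of Vim or of the answer above.

```python
import re


def replace_runs(text: str, old: str = "apple", new: str = "orange") -> str:
    """Replace each run of 2+ consecutive `old` with as many copies of `new`."""
    # Match a whole run of two or more repeats, like \(apple\)\{2,\} in Vim.
    run = re.compile(r"(?:%s){2,}" % re.escape(old))

    def repl(match: re.Match) -> str:
        # The run is made only of copies of `old`, so its length divided by
        # len(old) is the number of repeats.
        count = len(match.group(0)) // len(old)
        return new * count

    return run.sub(repl, text)


print(replace_runs("apple appleappleapple"))
# prints: apple orangeorangeorange
```

As with the Vim command, the counting happens in the host language rather than in the regex itself.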