Can you use backreferences in a lookbehind?
Let's say I want to split a string at every position that is immediately preceded by the same character twice in a row.
String REGEX1 = "(?<=(.)\\1)"; // DOESN'T WORK!
String REGEX2 = "(?<=(?=(.)\\1)..)"; // WORKS!
System.out.println(java.util.Arrays.toString(
"Bazooka killed the poor aardvark (yummy!)"
.split(REGEX2)
)); // prints "[Bazoo, ka kill, ed the poo, r aa, rdvark (yumm, y!)]"
Using REGEX2 (where the backreference is in a lookahead nested inside a lookbehind) works, but REGEX1 gives this error at run-time:
Look-behind group does not have an obvious maximum length near index 8
(?<=(.)\1)
        ^
This sort of makes sense, I suppose, because in general a backreference can capture a string of any length. (If the regex compiler were a bit smarter, though, it could determine that \1 refers to (.) in this case, and therefore has a finite length.)
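For anyone who wants to reproduce this, here is a minimal sketch that compiles both patterns and reports which one is rejected. The class name LookbehindBackrefDemo and the helpers compiles and splitAfterDoubledChar are made up for illustration; only Pattern.compile, Pattern.split and PatternSyntaxException come from the JDK.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; the two patterns are copied from the question.
class LookbehindBackrefDemo {
    static final String REGEX1 = "(?<=(.)\\1)";        // backreference directly in the lookbehind
    static final String REGEX2 = "(?<=(?=(.)\\1)..)";  // backreference in a nested lookahead

    // Returns true if Pattern.compile accepts the regex, false if it throws.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // getDescription() is the error text without the pattern and caret lines.
            System.out.println("Rejected: " + e.getDescription());
            return false;
        }
    }

    // Splits after every doubled character using the nested-lookahead pattern.
    static String[] splitAfterDoubledChar(String input) {
        return Pattern.compile(REGEX2).split(input);
    }

    public static void main(String[] args) {
        System.out.println("REGEX1 compiles: " + compiles(REGEX1));
        System.out.println("REGEX2 compiles: " + compiles(REGEX2));
        System.out.println(Arrays.toString(
            splitAfterDoubledChar("Bazooka killed the poor aardvark (yummy!)")));
    }
}
```

The exception is raised by Pattern.compile itself, before any input is matched, so the check does not depend on the string being split. On a JDK that shows the error above, compiles(REGEX1) should return false while compiles(REGEX2) returns true.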
So is there a way to use a backreference in a lookbehind?
And if there isn't, can you always work around it using this nested lookahead? Are there other commonly-used techniques?
(?<=\\1)(.) ? – Tim Pietzcker
PatternSyntaxException. By the way, if anybody wants to play around with a variant of this problem, I just authored one on codingBat: codingbat.com/prob/p266235 – polygenelubricants