Lookaheads do not consume any characters. A lookahead only checks whether its pattern can be matched at the current position:
a(?!b)c
Here, after matching a, the engine checks that a is not followed by b, but it does not consume the character it looked at (which is c). That c is then matched by the c in the pattern.
How a(?!b)c matches ac

ac
|
a

ac
 |
 (?!b) #checks but does not consume. Pointer remains at c

ac
 |
 c
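
The trace above can be checked with a small script. This is a minimal sketch using Python's re module; the variable names (negative, full_match, blocked, lookahead_only) are illustrative only and not from the original answer.

```python
import re

# Negative lookahead: "a", then a position NOT followed by "b", then "c".
negative = re.compile(r"a(?!b)c")

# On "ac" the lookahead succeeds (the next character is "c", not "b"),
# and because it consumed nothing, "c" is still available to be matched.
full_match = negative.search("ac")
print(full_match.group(0))  # ac
print(full_match.span())    # (0, 2)

# On "abc" the character after "a" is "b", so (?!b) fails and there is no match.
blocked = negative.search("abc")
print(blocked)  # None

# The lookahead is zero-width: without the trailing "c" only "a" is matched.
lookahead_only = re.search(r"a(?!b)", "ac")
print(lookahead_only.span())  # (0, 1)
```

The last call shows the "checks but does not consume" step in isolation: the match ends at index 1, so the c that the lookahead inspected is not part of the match.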
Positive lookahead
The positive lookahead works the same way: it tries to match the pattern inside the lookahead without consuming anything. If that pattern can be matched, the regex engine proceeds with matching the rest of the pattern. If it cannot, the match attempt at that position fails.
E.g. abc(?=123)\d+ matching abc123

abc123
|
a

abc123
 |
 b

abc123
  |
  c

abc123 # Tries to match 123; since it succeeds, the pointer stays just after c
   |
   (?=123)

abc123 # The lookahead succeeded. Matching of the rest of the pattern proceeds from this position
   |

abc123
   |
   \d

abc123
    |
    \d

abc123 # Reaches the end of input. The pattern is matched completely, so the regex engine returns a successful match
     |
     \d
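
The positive lookahead trace can be reproduced the same way. Again a minimal sketch with Python's re module; the names positive, digits_match, no_match and prefix_only are illustrative only.

```python
import re

# Positive lookahead: "abc", then a position that IS followed by "123",
# then one or more digits.
positive = re.compile(r"abc(?=123)\d+")

# The lookahead confirms "123" follows "abc" but consumes nothing,
# so \d+ starts at the "1" and matches all three digits.
digits_match = positive.search("abc123")
print(digits_match.group(0))  # abc123

# On "abc456" the lookahead fails, so the whole match attempt fails,
# even though \d+ on its own could have matched "456".
no_match = positive.search("abc456")
print(no_match)  # None

# Without \d+ the match stops right after "c": the digits were only inspected.
prefix_only = re.search(r"abc(?=123)", "abc123")
print(prefix_only.group(0), prefix_only.end())  # abc 3
```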
[A-Za-z]+(?![A-Za-z])
,[^sword]fish
,(?!sword)fish
– Unihedron