3
votes

How do you exclude pairs of characters from a regular expression?

I am trying to write a regular expression that matches 5 alphanumeric characters, followed by any two characters except "XX" and "AD", followed by XX.

So

D22D0ACXX 

will match, but the following two will not match:

D22D0ADXX
D22D0XXXX

My first attempt was:

([A-Z0-9]{5}[^(?AD)|(?XX)]XX)

But the character class [^(?AD)|(?XX)] matches only a single character, so I end up with the last 8 characters, not all 9.
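The behaviour described above can be reproduced with a minimal sketch. It assumes Python's `re` module, since the question does not name a regex flavor. Inside square brackets, `(`, `?`, `|` and `)` are ordinary literals, so the class simply means "any one character other than `(`, `?`, `A`, `D`, `)`, `|` or `X`".

```python
import re

# The pattern from the first attempt, unchanged.
attempt = re.compile(r"([A-Z0-9]{5}[^(?AD)|(?XX)]XX)")

# The bracket expression is a single negated character class: it consumes
# exactly one character that is not one of ( ? A D ) | X.
one_char = re.compile(r"[^(?AD)|(?XX)]")
assert one_char.fullmatch("C") is not None  # one allowed character
assert one_char.fullmatch("A") is None      # 'A' is in the excluded set
assert one_char.fullmatch("AC") is None     # never two characters

# 5 + 1 + 2 = 8 characters, so the search only succeeds from index 1.
m = attempt.search("D22D0ACXX")
print(m.group(1))  # 22D0ACXX
```

At index 0 the sixth character is `A`, which the class rejects, so the engine retries one position later and returns the last 8 characters.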

Can I exclude pairs of characters without getting into back references?

I need to capture the whole group, hence the outer parentheses. The negative lookahead suggestions don't seem to do this.

2 Answers

2
votes

Use a negative lookahead:

([A-Z0-9]{5}(?!(AD|XX)XX).{4})
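As a rough check, here is a sketch that runs this pattern against the strings from the question, assuming Python's `re` module (the question does not say which flavor is in use). The lookahead consumes nothing, so the `.{4}` after it still matches the tail and group 1 holds all 9 characters. Note that `.{4}` does not itself require the final `XX`. The `strict` pattern and the `capture` helper below are hypothetical additions for comparison, not part of the answer.

```python
import re

# The pattern from this answer, unchanged. Group 1 is the outer group.
answer = re.compile(r"([A-Z0-9]{5}(?!(AD|XX)XX).{4})")

# Hypothetical tighter variant (not from the answer): the lookahead only
# inspects the excluded pair, and the trailing XX is matched literally.
strict = re.compile(r"([A-Z0-9]{5}(?!AD|XX)[A-Z0-9]{2}XX)")

def capture(pattern, text):
    """Return group 1 of the first match, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None

# The last sample is an extra string that does not end in XX.
for sample in ["D22D0ACXX", "D22D0ADXX", "D22D0XXXX", "D22D0ACYY"]:
    print(sample, capture(answer, sample), capture(strict, sample))
```

Both patterns capture `D22D0ACXX` in full and reject the two strings from the question. They differ only on input such as `D22D0ACYY`, which the answer's pattern accepts and the stricter variant rejects.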

1
votes

Don't treat it as a character class; instead, think of it as an alternation inside a negative lookahead, e.g.:

([A-Z0-9]{5}(?!(AD|XX)XX))

Then, if you need the tail, include it after the lookahead, e.g.:

([A-Z0-9]{5}(?!(AD|XX)XX)[A-Z0-9]{4})
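The difference between the two forms can be seen in a small sketch, again assuming Python's `re` module as the flavor. Because a lookahead is zero-width, the first pattern captures only the 5 leading characters, while the second also consumes the 4 that follow. The names `head_only` and `with_tail` are just labels chosen for this illustration.

```python
import re

# Both patterns are copied from this answer.
head_only = re.compile(r"([A-Z0-9]{5}(?!(AD|XX)XX))")
with_tail = re.compile(r"([A-Z0-9]{5}(?!(AD|XX)XX)[A-Z0-9]{4})")

text = "D22D0ACXX"
print(head_only.search(text).group(1))  # D22D0
print(with_tail.search(text).group(1))  # D22D0ACXX

# Without the tail, an unanchored search can slide one position to the
# right and still succeed on a string that should be rejected.
slid = head_only.search("D22D0ADXX")
print(slid.group(1))                    # 22D0A
print(with_tail.search("D22D0ADXX"))    # None
```

Including the tail therefore does two jobs: it puts the full 9 characters in the capture group, and it stops the shorter pattern from matching a shifted 5-character window.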