I'm using a fairly simple regex in Python and seeing some odd behavior when I use the "or" operator (|).
I am trying to parse the following:
>> str = "blah [in brackets] stuff"
so that it returns:
>> ['blah', 'in brackets', 'stuff']
To match the text between brackets, I am using a lookbehind and a lookahead, i.e.:
>> '(?<=\[).*?(?=\])'
If used alone this does indeed capture the text in brackets:
>> re.findall( '(?<=\[).*?(?=\])' , str )
>> ['in brackets']
But when I use the or operator to add a second alternative that parses the strings between spaces, the bracket match somehow breaks down:
>> [x for x in re.findall( '(?<=\[).*?(?=\])|.*?[, ]' , str ) if x!=' ' ]
>> ['blah ', '[in ', 'brackets] ']
For the life of me I can't understand this behavior. Any help would be appreciated.
Thanks!
After the first match consumes "blah ", the remaining text is "[in brackets] stuff". The first half of the regex doesn't match here because the lookbehind doesn't find an opening bracket before the current position. So the second half of the regex matches again and finds the text "[in ". – Aran-Fey
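To make the comment above concrete, here is a minimal sketch that prints where each match of the original pattern starts, followed by one possible workaround. The workaround is an assumption about what is wanted, not the only fix: it consumes the brackets and captures their contents instead of relying on lookarounds, and it treats whitespace and commas as separators. The names text, original, fixed, spans and tokens are made up for this illustration.

```python
import re

text = "blah [in brackets] stuff"

# The pattern from the question, written as a raw string.
original = re.compile(r"(?<=\[).*?(?=\])|.*?[, ]")

# Record where each match starts. The scanner tries index 5 (the "[") before
# index 6, and at index 5 only the second alternative can match, so it
# consumes "[in " and the lookbehind never gets a chance at index 6.
spans = [(m.start(), m.group()) for m in original.finditer(text)]
print(spans)  # [(0, 'blah '), (5, '[in '), (9, 'brackets] ')]

# Hypothetical alternative: alternative 1 consumes the brackets and captures
# what is inside them; alternative 2 captures a run of characters that are
# not whitespace, commas or brackets.
fixed = re.compile(r"\[([^\]]*)\]|([^\s,\[\]]+)")

# findall returns one (bracketed, word) tuple per match; exactly one of the
# two groups is non-empty, except for an empty "[]", which yields ''.
tokens = [bracketed or word for bracketed, word in fixed.findall(text)]
print(tokens)  # ['blah', 'in brackets', 'stuff']
```

Because the bracket alternative is listed first and matches the "[" itself, it wins at index 5, which the lookbehind version cannot do.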