python - Regex matching between two strings?

Question

I can't seem to find a way to extract all of the comments in the following example.

>>> import re
>>> string = '''
... <!-- one 
... -->
... <!-- two -- -- -->
... <!-- three -->
... '''
>>> m = re.findall ( '<!--([^\(-->)]+)-->', string, re.MULTILINE)
>>> m
[' one \n', ' three ']

The block with two -- -- is not matched, most likely because of a bad regex. Can someone please point me in the right direction on how to extract everything between two strings?

Hi, I've tested what you suggested in the comments. Here is a working solution with a small tweak.

>>> m = re.findall ( '<!--(.*?)-->', string, re.MULTILINE)
>>> m
[' two -- -- ', ' three ']
>>> m = re.findall ( '<!--(.*\n?)-->', string, re.MULTILINE)
>>> m
[' one \n', ' two -- -- ', ' three ']

thanks!

Anything between the [] matches a single character, so (-->) is not treated as a group there. That is part of the problem... — Joran Beasley
re.findall('<!--(.*?)-->', string, re.DOTALL) should do. You don't need ^\(-->) here, because the question mark makes it non-greedy. — BrtH
It looks like you just want the words? If so, what's wrong with m = re.findall('[\w]+', string, re.MULTILINE)? Also, string is a really bad name for a, um, string. — Ben
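
To make the character-class point from the comments concrete, here is a small sketch. The names text, char_class, original, rejected and matches are made up for illustration. It checks which single characters the class from the question actually excludes: inside [...] the parentheses are literal, and \(-- is read as the range from ( to -.

```python
import re
import warnings

# The sample from the question, written with explicit newlines.
text = "\n<!-- one \n-->\n<!-- two -- -- -->\n<!-- three -->\n"

with warnings.catch_warnings():
    # Some Python versions may warn about `--` inside a character set;
    # silence that so the demonstration stays quiet.
    warnings.simplefilter("ignore")
    char_class = re.compile(r'[^\(-->)]')
    original = re.compile(r'<!--([^\(-->)]+)-->')

# The class rejects these single characters; it never looks for the sequence "-->".
rejected = [c for c in "()*+,->" if char_class.match(c) is None]
print(rejected)  # ['(', ')', '*', '+', ',', '-', '>']

# So any '-' inside a comment body stops the match, which is why "two" is skipped.
matches = original.findall(text)
print(matches)  # [' one \n', ' three ']
```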

iruvar · Accepted Answer · 38 votes · 2012-10-04T21:24:10

This should do the trick. re.DOTALL makes . match newlines as well, and the non-greedy .*? stops at the first -->:

 m = re.findall ( '<!--(.*?)-->', string, re.DOTALL)
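
For anyone who wants to check this, here is a self-contained sketch run against the sample from the question. The variable names (text, with_dotall, with_multiline, spanning and so on) are my own, and the three-line comment at the end is an extra test case that is not in the original post.

```python
import re

# The sample from the question, written with explicit newlines.
text = "\n<!-- one \n-->\n<!-- two -- -- -->\n<!-- three -->\n"

# DOTALL: '.' also matches '\n', so comments that span lines are captured.
with_dotall = re.findall(r'<!--(.*?)-->', text, re.DOTALL)
print(with_dotall)  # [' one \n', ' two -- -- ', ' three ']

# MULTILINE only changes how '^' and '$' behave; the pattern has neither,
# so the comment containing a newline is still missed.
with_multiline = re.findall(r'<!--(.*?)-->', text, re.MULTILINE)
print(with_multiline)  # [' two -- -- ', ' three ']

# Extra case (not in the question): a comment spanning three lines.
spanning = "<!-- a\nb\nc -->"
newline_tweak = re.findall(r'<!--(.*\n?)-->', spanning)       # the '\n?' variant
dotall_spanning = re.findall(r'<!--(.*?)-->', spanning, re.DOTALL)
print(newline_tweak)    # []
print(dotall_spanning)  # [' a\nb\nc ']
```

The '\n?' variant from the question's update only tolerates a single newline right before -->, so DOTALL is the more general choice here.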
