python - Splitting a string with repeated characters into a list

Question

I am not very experienced with regex, but I have been reading a lot about it. Assume there's a string s = '111234'. I want to split it into runs of the same character, giving the list L = ['111', '2', '3', '4']. My approach was to make a group that checks whether a character is a digit, and then check for repetitions of that group. Something like this:

L = re.findall('\d[\1+]', s)

I think that \d[\1+] will match either a single digit or a digit followed by repetitions of that same digit. I think this might do what I want.

@thefourtheye: No, assume that it will contain non-digits as well. — Mathews_M_J
I have the impression that you were looking for r_e = "(1*)(2*)(3*)(4*)", which gives re.findall(r_e, s)[0] => ('111', '2', '3', '4'). — Grijesh Chauhan
Though a list is an ordered collection: if you don't need the order, then you can use r_e = "((?P<o>1+)|(?P<to>2+)|(?P<th>3+)|(?P<f>4+))*" and then re.search(r_e, s).group('o', 'to', 'th', 'f'). — Grijesh Chauhan

devnull · Accepted Answer · 2014-04-05T16:03:08

Use re.finditer():

>>> import re
>>> s = '111234'
>>> [m.group(0) for m in re.finditer(r"(\d)\1*", s)]
['111', '2', '3', '4']
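
Since the comments mention that the input may contain non-digits as well, here is a minimal sketch of how the same idea might be generalized. It assumes that "split" means grouping every run of an identical character, whatever that character is; the helper names split_runs and split_runs_groupby are made up for illustration. Note that re.findall() with this pattern would return only the captured group (one character per run), which is presumably why finditer() and m.group(0) are used above. Also, as far as I can tell, a backreference cannot be used inside a character class, so [\1+] in the question does not mean "repeat group 1".

```python
import re
from itertools import groupby


def split_runs(s):
    """Split s into runs of one repeated character (hypothetical helper)."""
    # (.) captures any single character; \1* matches further copies of it.
    # re.DOTALL lets "." match a newline as well.
    return [m.group(0) for m in re.finditer(r"(.)\1*", s, flags=re.DOTALL)]


def split_runs_groupby(s):
    """Same result without regex: groupby yields consecutive equal characters."""
    return ["".join(group) for _, group in groupby(s)]


print(split_runs("111234"))          # ['111', '2', '3', '4']
print(split_runs("aab11c"))          # ['aa', 'b', '11', 'c']
print(split_runs_groupby("aab11c"))  # ['aa', 'b', '11', 'c']
```

If only digit runs matter and other characters should be skipped, keeping (\d)\1* from the answer is probably the simpler choice.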
