python - Regex matching 5-digit substrings not enclosed with digits

Question

I want to extract runs of exactly 5 consecutive digits from a string.

This is the code I have written:

re.findall(r"((\D|^)*)\d\d\d\d\d((\D|$)*)", s)

but it does not return the expected result for the string

"Helpdesk-Agenten (m/w) Kennziffer: 12966"

The expected result is 12966.

Example 2:

#input
"Helpdesk-Agenten (m/w) Kennziffer: 12966abc"
# expected
12966

Example 3:

#input
"Helpdesk-Agenten (m/w) Kennziffer: 12966345"
# expected
"" (because the run of consecutive digits is longer than 5)

maybe you could provide more examples of matches? should it match 12345abc? — Jean-François Fabre♦
@Jean-FrançoisFabre thanks for the comment, added two examples — Hello lad
Here is another similar question stackoverflow.com/questions/16348538/… — kasravnd

Wiktor Stribiżew · Accepted Answer · 2017-01-23T13:51:11

Your current regex, ((\D|^)*)\d\d\d\d\d((\D|$)*), used with re.findall won't return the digit chunks because they are not captured: when a pattern contains capturing groups, re.findall returns only the contents of those groups. Moreover, the (\D|^)* and (\D|$)* parts are optional, so they do not enforce the boundaries they are supposed to, and the regex will still find 5-digit chunks inside longer digit chunks.
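Both problems can be reproduced with a short script. This is only an illustrative sketch: the names QUESTION_PATTERN, s1, s3, groups and whole are made up here, and the strings are the first and third examples from the question.

```python
import re

# The pattern from the question; it contains four capturing groups.
QUESTION_PATTERN = r"((\D|^)*)\d\d\d\d\d((\D|$)*)"

s1 = "Helpdesk-Agenten (m/w) Kennziffer: 12966"
s3 = "Helpdesk-Agenten (m/w) Kennziffer: 12966345"

# With capturing groups present, findall yields one tuple of group
# contents per match; the uncaptured digits never appear in it.
groups = re.findall(QUESTION_PATTERN, s1)
print(groups)

# On the third example the overall match still ends with 5 digits taken
# from the 8-digit run, because the trailing group may match nothing.
whole = re.search(QUESTION_PATTERN, s3).group(0)
print(whole)
```

The first print shows a tuple of non-digit fragments rather than 12966, and the second shows that the pattern happily stops in the middle of 12966345.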

To find a 5-digit chunk that is not enclosed by other digits, use

re.findall(r"(?<!\d)\d{5}(?!\d)", s)

Details:

(?<!\d) - no digit is allowed before the current location
\d{5} - 5 digits
(?!\d) - no digit allowed after the current location.
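As a quick check, the lookaround pattern can be run against the three examples from the question. This is a minimal sketch; the helper name find_five_digit_runs and the constant FIVE_DIGITS are chosen here for illustration and are not part of the original answer.

```python
import re

# Exactly five digits, with no digit immediately before or after.
FIVE_DIGITS = re.compile(r"(?<!\d)\d{5}(?!\d)")


def find_five_digit_runs(text):
    """Return every standalone 5-digit run found in text."""
    return FIVE_DIGITS.findall(text)


samples = [
    "Helpdesk-Agenten (m/w) Kennziffer: 12966",
    "Helpdesk-Agenten (m/w) Kennziffer: 12966abc",
    "Helpdesk-Agenten (m/w) Kennziffer: 12966345",
]

for sample in samples:
    print(find_five_digit_runs(sample))
# ['12966']
# ['12966']
# []
```

Because the pattern has no capturing groups, re.findall returns the matched text itself, and the 8-digit run in the last sample is rejected by the lookarounds.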
