170
votes

I found that in 123, \d matches 1 and 3 but not 2. What requirement does a digit have to satisfy for \d to match it? I am asking about Python-style regex.

The regular expression plugin in gedit uses Python-style regex. I created a text file whose only content is

123

Only 1 and 3 are matched by the regex \d; 2 is not.

In general, for a run of digits with no other characters in between, only the digits in odd positions are matched, and the digits in even positions are not. For example, in 12345 the matches are 1, 3 and 5.

6
\d will match 1, 2 and 3. If it doesn't, there must be something else in your expression. Can you show your full expression? - Alex Aza
\d is shorthand for [0-9], so it ought to match 2. Please post a complete test case (a script that can be run, which demonstrates your problem) and maybe we can figure out what's wrong. - zwol
@delnan: "I found that in 123, \d matches 1 and 3 but not 2" sounds pretty concrete to me. - Amber
@Amber: Damn me, I missed the not! - user395760
Okay, I'm not posting this as an answer because I don't know, but I think what's going on is that gedit refuses to start a new match immediately after the end of the previous match: it skips one character, whatever it is, before trying to match again. Please try matching 11111 and 22222. - zwol
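
The guess in the last comment can be modelled in a few lines of Python. This is only a sketch of the suspected behavior, not gedit's actual code: `matches_with_skip` is a hypothetical helper that resumes searching one character past the end of each match, and it reproduces the pattern described in the question.

```python
import re

def matches_with_skip(pattern, text):
    """Hypothetical model: after each match, skip one character before searching again."""
    regex = re.compile(pattern)
    found = []
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        found.append(m.group())
        # Resume one character past the end of the match (the suspected behavior).
        pos = m.end() + 1
    return found

print(matches_with_skip(r"\d", "123"))    # ['1', '3']
print(matches_with_skip(r"\d", "12345"))  # ['1', '3', '5']
print(re.findall(r"\d", "12345"))         # ['1', '2', '3', '4', '5']
```

If the plugin behaved this way, 11111 and 22222 would each show three matches instead of five.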

6 Answers

503
votes

[0-9] is not always equivalent to \d. In Python 3, [0-9] matches only the characters 0123456789, while \d (on a str pattern, without the re.ASCII flag) matches those and other Unicode decimal digits as well, for example the Eastern Arabic numerals ٠١٢٣٤٥٦٧٨٩.
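
A minimal sketch of the difference, using only the standard `re` module. The variable names are illustrative, and the Eastern Arabic digits are built from their code points (U+0660 to U+0669) to avoid copy-paste problems.

```python
import re

ascii_digits = "0123456789"
# Eastern Arabic (Arabic-Indic) digits, U+0660 through U+0669.
arabic_indic = "".join(chr(c) for c in range(0x0660, 0x066A))

text = ascii_digits + arabic_indic

# [0-9] only ever matches the ten ASCII digits.
ascii_only = re.findall(r"[0-9]", text)

# On a str pattern, \d matches any Unicode decimal digit.
unicode_digits = re.findall(r"\d", text)

# With re.ASCII, \d is restricted to [0-9] again.
ascii_flag = re.findall(r"\d", text, flags=re.ASCII)

print(len(ascii_only), len(unicode_digits), len(ascii_flag))  # 10 20 10
```

Passing `re.ASCII` (or using a bytes pattern) is the usual way to get the narrower [0-9] meaning when that is what you want.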

16
votes

\d matches any single digit in most regex flavors, including Python. (Regex Reference)

12
votes

In Python-style regex, \d matches any individual digit. If you're seeing something that doesn't seem to do that, please provide the full regex you're using, as opposed to just describing that one particular symbol.

>>> import re
>>> re.match(r'\d', '3')
<_sre.SRE_Match object at 0x02155B80>
>>> re.match(r'\d', '2')
<_sre.SRE_Match object at 0x02155BB8>
>>> re.match(r'\d', '1')
<_sre.SRE_Match object at 0x02155B80>
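
To check the exact case from the question, a short sketch along the same lines: `re.findall` and `re.finditer` scan the whole string, so they show every match of \d in 123 rather than testing one digit at a time.

```python
import re

# Collect every non-overlapping match of \d in the question's sample text.
matches = re.findall(r"\d", "123")
print(matches)  # ['1', '2', '3']

# finditer also reports where each match starts and ends.
spans = [m.span() for m in re.finditer(r"\d", "123")]
print(spans)  # [(0, 1), (1, 2), (2, 3)]
```

All three digits match in plain Python, which suggests the behavior in the question comes from the editor plugin rather than from \d itself.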

9
votes

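In Java, "\\d{3}" matches any sequence of three digits. The backslash is doubled because it has to be escaped inside a Java string literal; the regex itself is \d{3}.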

7
votes

This is just a guess, but I think your editor actually matches every single digit (1, 2 and 3), but only the odd-numbered matches are highlighted, to distinguish this from the case where the whole string 123 is matched.

Most regex consoles highlight contiguous matches with different colors, but due to the plugin settings, terminal limitations or for some other reason, only every other group might be highlighted in your case.

1
votes

Info regarding .NET / C#:

Decimal digit character: \d

\d matches any decimal digit. It is equivalent to the \p{Nd} regular expression pattern, which includes the standard decimal digits 0-9 as well as the decimal digits of a number of other character sets.

If ECMAScript-compliant behavior is specified, \d is equivalent to [0-9]. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in Regular Expression Options.

Info: https://docs.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#decimal-digit-character-d