8
votes

How can I extract a number with a decimal part and thousands separators (dot and comma) from a string, e.g. 1,120.01? I have a regex, but it doesn't seem to play well with commas:

preg_match('/([0-9]+\.[0-9]+)/', $s, $matches);

6 Answers

28
votes

The following regexes match numbers with commas and decimals. The first two also validate that the number is correctly formatted:


decimal optional (two decimal places)

^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$

Explained:

number (decimal optional)

^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$

Options: case insensitive

Assert position at the beginning of the string «^»
Match a single character present in the list below «[+-]?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   The character “+” «+»
   The character “-” «-»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
   Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the regular expression below «(?:,?[0-9]{3})*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   Match the character “,” literally «,?»
      Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   Match a single character in the range between “0” and “9” «[0-9]{3}»
      Exactly 3 times «{3}»
Match the regular expression below «(?:\.[0-9]{2})?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   Match the character “.” literally «\.»
   Match a single character in the range between “0” and “9” «[0-9]{2}»
      Exactly 2 times «{2}»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»

Will Match:

1,432.01
456.56
654,246.43
432
321,543

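Will Not Match:

324,123.432
,,,312,.32
123,.23

Note that 454325234.31 does match, because the comma in «,?» is optional. Use «,» instead of «,?» if thousands separators must be present.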
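
As a rough illustration of how the decimal-optional pattern could be used from PHP, here is a minimal sketch. The function name isFormattedNumber is hypothetical and not part of the answer; only preg_match and the pattern above are taken as given.

```php
<?php
// Hypothetical helper: returns true when $s is a well-formed number
// with an optional two-digit fraction, according to the pattern above.
function isFormattedNumber(string $s): bool
{
    $pattern = '/^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$/';
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $s) === 1;
}

// Try a few of the samples listed above.
foreach (['1,432.01', '432', '324,123.432', '123,.23'] as $sample) {
    echo $sample, ' => ', isFormattedNumber($sample) ? 'match' : 'no match', PHP_EOL;
}
```

Comparing against 1 rather than relying on truthiness keeps a regex error (false) from being mistaken for "no match".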

decimal mandatory (two decimal places)

^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$

Explained:

number (decimal required)

^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$

Options: case insensitive

Assert position at the beginning of the string «^»
Match a single character present in the list below «[+-]?»
   Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   The character “+” «+»
   The character “-” «-»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
   Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the regular expression below «(?:,?[0-9]{3})*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   Match the character “,” literally «,?»
      Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
   Match a single character in the range between “0” and “9” «[0-9]{3}»
      Exactly 3 times «{3}»
Match the character “.” literally «\.»
Match a single character in the range between “0” and “9” «[0-9]{2}»
   Exactly 2 times «{2}»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»

Will Match:

1,432.01
456.56
654,246.43
324.75

Will Not Match:

1,43,2.01
456,
654,246
324.7523
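
Once a string passes the decimal-mandatory pattern, one possible way to turn it into a float is to strip the commas and cast. This is only a sketch under that assumption; parseAmount is a hypothetical name, and floats are not a good fit for money that needs exact arithmetic.

```php
<?php
// Hypothetical helper: validates $s against the decimal-mandatory pattern
// and returns it as a float, or null when the format is wrong.
function parseAmount(string $s): ?float
{
    $pattern = '/^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$/';
    if (preg_match($pattern, $s) !== 1) {
        return null;
    }
    // Remove the thousands separators before casting to float.
    return (float) str_replace(',', '', $s);
}

var_dump(parseAmount('654,246.43')); // a float
var_dump(parseAmount('654,246'));    // null: the fraction is mandatory
```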

Matches numbers separated by commas or dots indiscriminately:

^(\d+(\.|,))+(\d)+$

Explained:

    Matches numbers separated by , or .

^(\d+(\.|,))+(\d)+$

Options: case insensitive

Assert position at the beginning of the string «^»
Match the regular expression below and capture its match into backreference number 1 «(\d+(\.|,))+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
   Note: You repeated the capturing group itself.  The group will capture only the last iteration.  Put a capturing group around the repeated group to capture all iterations. «+»
   Match a single digit 0..9 «\d+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
   Match the regular expression below and capture its match into backreference number 2 «(\.|,)»
      Match either the regular expression below (attempting the next alternative only if this one fails) «\.»
         Match the character “.” literally «\.»
      Or match regular expression number 2 below (the entire group fails if this one fails to match) «,»
         Match the character “,” literally «,»
Match the regular expression below and capture its match into backreference number 3 «(\d)+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
   Note: You repeated the capturing group itself.  The group will capture only the last iteration.  Put a capturing group around the repeated group to capture all iterations. «+»
   Match a single digit 0..9 «\d»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»

Will Match:

1,32.543,2
5456.35,3.2,6.1
2,7
1.6

Will Not Match:

1,.2 // two ., side by side
1234,12345.5467. // ends in a .
,125 // begins in a ,
,.234 // begins in a , and two symbols side by side
123,.1245. // ends in a . and two symbols side by side

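Note: to extract the number rather than validate the whole string, drop the ^ and $ anchors, wrap the pattern in a capturing group, and read that group from the match. Let me know if you need more specifics.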
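
To make the note above concrete, here is a minimal sketch of extraction with a capturing group. It reuses the decimal-optional pattern with the anchors removed; the function name extractNumber and the sample sentence are made up for illustration.

```php
<?php
// Hypothetical helper: returns the first number found in $text, or null.
function extractNumber(string $text): ?string
{
    // Decimal-optional pattern without ^ and $, wrapped in a capturing group.
    $pattern = '/([+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?)/';
    if (preg_match($pattern, $text, $matches) === 1) {
        return $matches[1]; // contents of the capturing group
    }
    return null;
}

echo extractNumber('Total due: 1,120.01 USD'), PHP_EOL;
```

Without the anchors the pattern no longer validates anything: it returns whatever prefix fits, so a malformed input such as 12345 yields 123. If both extraction and validation matter, it may be safer to isolate the candidate token first and then check it against the anchored pattern.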

Description: These regular expressions should work in most regex flavors (PHP, Python, C#, JavaScript, etc.). They are mainly suited to currency amounts.

4
votes

You can use this regex: -

/((?:[0-9]+,)*[0-9]+(?:\.[0-9]+)?)/

Explanation: -

/(
    (?:[0-9]+,)*   # One or more digits followed by a `comma`,
                   # repeated zero or more times.
    [0-9]+         # Match one or more digits before `.`
    (?:            # A non-capturing group
        \.         # A dot
        [0-9]+     # Digits after `.`
    )?             # Make the fractional part optional.
 )/
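
A short sketch of how this pattern might be applied to pull every number out of a string, assuming preg_match_all is acceptable for the use case. The function name findNumbers and the sample text are hypothetical.

```php
<?php
// Hypothetical helper: returns every number-like token found in $text.
function findNumbers(string $text): array
{
    preg_match_all('/((?:[0-9]+,)*[0-9]+(?:\.[0-9]+)?)/', $text, $matches);
    // $matches[1] holds the contents of the capturing group for each match.
    return $matches[1];
}

print_r(findNumbers('Items: 3, subtotal 1,120.01, tax 89.6'));
```

Because the pattern does not enforce groups of three digits, a comma-separated list such as 1,2,3 is returned as a single token, so it fits best where the input is known to contain properly formatted numbers.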
3
votes

Add the comma to the range that can be in front of the dot:

/([0-9,]+\.[0-9]+)/
#     ^ Comma

And this regex:

/((?:\d,?)+\d\.[0-9]*)/

Will match:

1,067120.01
121,34,120.01

But not

,,,.01
,,1,.01
12,,,.01

# /(
#   (?:\d,?) Matches a digit followed by an optional comma
#   +        One or more repetitions of the previous group
#   \d       Followed by a digit (to prevent it from matching `1234,.123`);
#            together with the group above, this requires at least two digits before the dot
#   \.       Followed by a dot
#            if the dot should be optional, use `\.?` instead
#   [0-9]*   Followed by any number of digits; if a fraction is mandatory, replace the `*` with a `+`
# )/
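
The behaviour of the second regex can be checked with a small sketch like the one below. The function name matchLoose is hypothetical; the pattern is the one from this answer, unchanged.

```php
<?php
// Hypothetical helper: returns the text captured by the pattern, or null.
function matchLoose(string $s): ?string
{
    return preg_match('/((?:\d,?)+\d\.[0-9]*)/', $s, $m) === 1 ? $m[1] : null;
}

// Samples from the lists above, plus a single-digit integer part.
foreach (['1,067120.01', '121,34,120.01', ',,1,.01', '12,,,.01', '1.5'] as $sample) {
    var_dump(matchLoose($sample));
}
```

The last sample returns null because the pattern needs at least two digits before the dot, which is worth keeping in mind for values below 10.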
0
votes

The locale-aware float (%f) might be used with sscanf.

$result = sscanf($s, '%f');

That doesn't split the parts into an array though. It simply parses a float.

See also: http://php.net/manual/en/function.sprintf.php

A regex approach:

/([0-9]{1,3}(?:,[0-9]{3})*\.[0-9]+)/
0
votes

This should work:

preg_match('/\d{1,3}(,\d{3})*(\.\d+)?/', $s, $matches);
0
votes

Here is a working regex that accepts numbers with commas and decimals:

/^-?(?:\d+|\d{1,3}(?:,\d{3})+)?(?:\.\d+)?$/
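
One thing to watch with this pattern is that every part of it is optional, so it also accepts an empty string and a lone minus sign. The sketch below shows that and one possible guard; both function names are hypothetical, and the extra digit check is an assumption about what "a number" should mean here.

```php
<?php
// The pattern from this answer, used as-is.
function matchesPattern(string $s): bool
{
    return preg_match('/^-?(?:\d+|\d{1,3}(?:,\d{3})+)?(?:\.\d+)?$/', $s) === 1;
}

// Same check, but additionally requires at least one digit in the input.
function isNumber(string $s): bool
{
    return preg_match('/\d/', $s) === 1 && matchesPattern($s);
}

var_dump(matchesPattern(''));    // true: nothing in the pattern is required
var_dump(isNumber(''));          // false
var_dump(isNumber('1,120.01'));  // true
```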