100
votes

What is the difference between:

(.+?)

and

(.*?)

when I use it in my php preg_match regex?

9 Answers

162
votes

They are called quantifiers.

* 0 or more of the preceding expression

+ 1 or more of the preceding expression

By default a quantifier is greedy, which means it matches as many characters as possible.

The ? after a quantifier makes that quantifier "ungreedy" (lazy), meaning it will match as few characters as possible.

Example greedy/ungreedy

For example on the string "abab"

a.*b will match "abab" (preg_match_all will return one match, the "abab")

while a.*?b will match only the starting "ab" (preg_match_all will return two matches, each of them "ab")
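
The "abab" example can be reproduced with a short script. This is a minimal sketch that assumes Python's re module as a stand-in for PHP's preg_match_all; greedy and lazy quantifiers behave the same way for these two patterns, and the variable names are only illustrative.

```python
import re

text = "abab"

# Greedy: .* grabs as much as it can, so one match spans the whole string.
greedy = re.findall(r"a.*b", text)

# Lazy: .*? stops at the first "b" it can reach, so two short matches are found.
lazy = re.findall(r"a.*?b", text)

print(greedy)  # ['abab']
print(lazy)    # ['ab', 'ab']
```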

You can test your regexes online, e.g. on Regexr.

22
votes

The first (+) is one or more characters. The second (*) is zero or more characters. Both are non-greedy (?) and match anything (.).

9
votes

A + matches one or more instances of the preceding pattern. A * matches zero or more instances of the preceding pattern.

So basically, if you use a + there must be at least one instance of the pattern; if you use * it will still match even if there are no instances of it.
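
To see the "at least one instance" rule in action, here is a small sketch in Python (used as a stand-in for PHP; the patterns ab+c and ab*c and the sample strings are made up for illustration). Each result is a pair: whether the + pattern matched, then whether the * pattern matched.

```python
import re

samples = ["ac", "abc", "abbbc"]

# For each sample: (does ab+c match the whole string?, does ab*c match it?)
results = {
    s: (bool(re.fullmatch(r"ab+c", s)), bool(re.fullmatch(r"ab*c", s)))
    for s in samples
}

print(results["ac"])     # (False, True): no "b" present, so only * matches
print(results["abc"])    # (True, True)
print(results["abbbc"])  # (True, True)
```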

9
votes

+ matches at least one character

* matches any number (including 0) of characters

The ? indicates a lazy expression, so it will match as few characters as possible.
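
The difference between the lazy forms (.+?) and (.*?) shows up when there is nothing to capture. The sketch below assumes Python's re module in place of preg_match, and the bracketed sample strings are hypothetical.

```python
import re

empty = "[]"
filled = "[x][yz]"

star = re.compile(r"\[(.*?)\]")  # lazy, zero or more characters
plus = re.compile(r"\[(.+?)\]")  # lazy, one or more characters

print(star.findall(empty))   # ['']  (matches, capturing an empty string)
print(plus.findall(empty))   # []    (no match: nothing between the brackets)
print(star.findall(filled))  # ['x', 'yz']
print(plus.findall(filled))  # ['x', 'yz']
```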

8
votes

Consider the following string to match:

ab

The pattern (ab.*) will match, with the capture group containing ab

The pattern (ab.+), however, will not match, so nothing is returned.

But if you change the string to the following, the pattern (ab.+) will return aba:

aba
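
The three cases above can be checked with a few lines of Python (a sketch only; first_group is a hypothetical helper, and re.search stands in for preg_match):

```python
import re

def first_group(pattern, subject):
    """Return the text of capture group 1, or None if the pattern does not match."""
    m = re.search(pattern, subject)
    return m.group(1) if m else None

print(first_group(r"(ab.*)", "ab"))   # ab
print(first_group(r"(ab.+)", "ab"))   # None
print(first_group(r"(ab.+)", "aba"))  # aba
```
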
6
votes

+ requires at least one match; * allows zero as well.

6
votes

In RegEx, {i,f} means "between i and f matches". Let's take a look at the following examples:

  • {3,7} means between 3 and 7 matches
  • {,10} means up to 10 matches with no lower limit (i.e. the low limit is 0); some engines read {,10} as literal text, so {0,10} is the portable spelling
  • {3,} means at least 3 matches with no upper limit (i.e. the high limit is infinity)
  • {,} means no upper limit or lower limit for the number of matches (i.e. the lower limit is 0 and the upper limit is infinity)
  • {5} means exactly 5

Most good languages contain abbreviations, so does RegEx:

  • + is the shorthand for {1,}
  • * is the shorthand for {,}
  • ? is the shorthand for {,1}

This means + requires at least 1 match, * accepts any number of matches or no matches at all, and ? accepts either 1 match or none.

Credit: Codecademy.com
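
The shorthand equivalences listed in this answer can be checked mechanically. The following is a sketch in Python rather than PHP, with a hypothetical helper matched_text; the lower bounds are written out as 0 because some engines do not accept an omitted lower bound.

```python
import re

# Each shorthand quantifier paired with its explicit brace form.
pairs = [("a+", "a{1,}"), ("a*", "a{0,}"), ("a?", "a{0,1}")]
samples = ["", "a", "aa", "aaaa", "b"]

def matched_text(pattern, subject):
    """Return what the pattern matches at the start of subject, or None."""
    m = re.match(pattern, subject)
    return m.group(0) if m else None

# Every shorthand matches exactly the same text as its brace form.
equivalent = all(
    matched_text(short, s) == matched_text(explicit, s)
    for short, explicit in pairs
    for s in samples
)

print(equivalent)                        # True
print(matched_text("a{3,7}", "a" * 10))  # aaaaaaa
print(matched_text("a{5}", "a" * 10))    # aaaaa
```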

4
votes

A star is very similar to a plus. The only difference is that while the plus matches 1 or more of the preceding character/group, the star matches 0 or more.

1
votes

I think the previous answers fail to highlight a simple example:

For example, say we have an array:

numbers = [5, 15]

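Both ^[0-9]+ and ^[0-9]* match 5 and 15, because each starts with at least one digit. The difference shows up when there are no digits at the start: ^[0-9]* still matches (with an empty match), while ^[0-9]+ does not. The + operator requires at least one occurrence of the preceding expression, not a duplicate of it.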
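
A quick way to check how the two anchored patterns treat different inputs is to run them side by side. This sketch uses Python's re module as a stand-in for PHP, and adds an empty string and a non-digit string (both hypothetical additions) to the two numbers from the array.

```python
import re

subjects = ["5", "15", "", "x"]

# True where the pattern finds a match at the start of the subject.
plus = {s: re.match(r"^[0-9]+", s) is not None for s in subjects}
star = {s: re.match(r"^[0-9]*", s) is not None for s in subjects}

print(plus)  # {'5': True, '15': True, '': False, 'x': False}
print(star)  # {'5': True, '15': True, '': True, 'x': True}
```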