27
votes

Say I have a string whose format I need to verify, e.g. RR1234566-001 (2 letters, 7 digits, a dash, then 1 or more digits). I use something like:

        Regex regex = new Regex(patternString);
        if (regex.IsMatch(stringToMatch))
        {
            return true;
        }
        else
        {
            return false;
        }

This tells me whether stringToMatch follows the pattern defined by patternString. What I actually need, though (and I end up extracting these later), are 1234566 and 001, i.e. portions of stringToMatch.

Please note that this is NOT a question about how to construct regular expressions. What I am asking is: "Is there a way to match and extract values simultaneously without having to use a split function later?"

4
Note that you can just write "return regex.IsMatch(...);" (code from the question) or "return match.Success;" (code from the accepted solution). The separate returns in the if/else are not needed :) – Frank Sebastià

4 Answers

72
votes

You can use regex groups to accomplish that. For example, this regex:

(\d\d\d)-(\d\d\d\d\d\d\d)

Let's match a telephone number with this regex:

var regex = new Regex(@"(\d\d\d)-(\d\d\d\d\d\d\d)");
var match = regex.Match("123-4567890");
if (match.Success)
    ....

If it matches, you will find the first three digits in:

match.Groups[1].Value

And the following seven digits in:

match.Groups[2].Value

P.S. In C#, you can use a @"" style string to avoid escaping backslashes. For example, @"\hi\" equals "\\hi\\". Useful for regular expressions and paths.

P.S.2. The first group is stored in Groups[1], not Groups[0] as you might expect. That's because Groups[0] contains the entire matched string.
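To make the group numbering concrete, here is a minimal sketch that returns the whole match followed by each captured group. The class name GroupIndexDemo and the method Capture are made up for illustration, and the pattern is the telephone example above written with {n} quantifiers.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupIndexDemo
{
    // Hypothetical helper: returns the whole match followed by each
    // captured group, or an empty array when the input does not match.
    public static string[] Capture(string input)
    {
        Match match = Regex.Match(input, @"(\d{3})-(\d{7})");
        if (!match.Success)
        {
            return new string[0];
        }

        return new[]
        {
            match.Groups[0].Value, // the entire matched text
            match.Groups[1].Value, // first parenthesized group
            match.Groups[2].Value  // second parenthesized group
        };
    }
}
```

Calling GroupIndexDemo.Capture("123-4567890") should give three entries: the full match 123-4567890, then 123, then 4567890.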

18
votes

Use grouping and Match instead.

I.e.:

// Requires: using System.Text.RegularExpressions;
Regex re = new Regex("(\\d+)-(\\d+)");
Match m = re.Match(stringToMatch);

if (m.Success) {
  String part1 = m.Groups[1].Value;
  String part2 = m.Groups[2].Value;
  return true;
} 
else {
  return false;
}

You can also name the groups, like this:

Regex re = new Regex("(?<Part1>\\d+)-(?<Part2>\\d+)");

and access them by name like this:

  String part1 = m.Groups["Part1"].Value;
  String part2 = m.Groups["Part2"].Value;
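Putting named groups together with the format from the question, one way to validate and extract in a single call might look like the sketch below. The names NamedGroupDemo, TryParse, Number and Suffix are hypothetical, and the pattern is only an assumption based on the "2 letters, 7 digits, dash, 1 or more digits" description.

```csharp
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Assumed pattern for the question's format:
    // 2 letters, 7 digits, a dash, then 1 or more digits.
    private static readonly Regex Pattern =
        new Regex(@"^[A-Za-z]{2}(?<Number>[0-9]{7})-(?<Suffix>[0-9]+)$");

    // Validates the input and hands back both parts from the same Match,
    // so no separate split step is needed afterwards.
    public static bool TryParse(string input, out string number, out string suffix)
    {
        Match m = Pattern.Match(input);
        number = m.Success ? m.Groups["Number"].Value : string.Empty;
        suffix = m.Success ? m.Groups["Suffix"].Value : string.Empty;
        return m.Success;
    }
}
```

With this shape, NamedGroupDemo.TryParse("RR1234566-001", out var number, out var suffix) returns true and sets number to 1234566 and suffix to 001. An input that does not fit the format returns false and leaves both outputs empty.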
13
votes

You can use parentheses to capture groups of characters:

string test = "RR1234566-001";

// capture 2 letters, then 7 digits, then a hyphen, then 1 or more digits
string rx = @"^([A-Za-z]{2})(\d{7})(\-)(\d+)$";

Match m = Regex.Match(test, rx, RegexOptions.IgnoreCase);

if (m.Success)
{
    Console.WriteLine(m.Groups[1].Value);    // RR
    Console.WriteLine(m.Groups[2].Value);    // 1234566
    Console.WriteLine(m.Groups[3].Value);    // -
    Console.WriteLine(m.Groups[4].Value);    // 001
    return true;
}
else
{
    return false;
}
-1
votes
string text = "RR1234566-001";
string regex = @"^([A-Za-z]{2})(\d{7})(\-)(\d+)";
// Regex.Match returns a single Match; Regex.Matches would return a MatchCollection.
Match mtch = Regex.Match(text, regex);
if (mtch.Success)
{
    Console.WriteLine(mtch.Groups[1].Value);    // RR
    Console.WriteLine(mtch.Groups[2].Value);    // 1234566
    Console.WriteLine(mtch.Groups[3].Value);    // -
    Console.WriteLine(mtch.Groups[4].Value);    // 001
    return true;
}
else
{
    return false;
}