0
votes

I have a task to write a C# application to parse an xml file. One of the attribute values in the file is a Replace statement and I have to parse it to create a PowerShell Replace statement. I'm using regex to do this. The string looks like this:

Replace(FileName, ".txt", ".doc")

I want to capture "FileName", ".txt" and ".doc"

My question is, how do I match the opening (left) parenthesis AND the double quotes?

My issue is, I can't use

@"\"pattern\""

because in a verbatim string (the '@' prefix) the backslash doesn't escape the double quotes (in VS 2015). And if I remove the '@', how do I escape the opening (left) parenthesis? I can't use

"\("

as an escape sequence because the compiler says "unrecognized escape sequence".

Anyway, all help is appreciated.

3
Use two back-to-back double quotes inside verbatim string literals: @" double quotes "" " - Jonesopolis
Thank you, that did it. I will remember the back-to-back solution for verbatim literals. - Chewdoggie

3 Answers

2
votes

The regular expression formatted to be readable:

  var pattern =@"
    Replace
    \(
    (?<filename>\w+)
    \,\s*
    \u0022                # double quote
    \.
    (?<txt>\w+)
    \u0022
    ,\s*
    \u0022
    \.
    (?<doc>\w+)

";

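\u0022 is the Unicode escape for the double quote. The regex engine interprets it, so no quote character has to appear inside the verbatim string.

The following class parses the text and extracts the file name, Txt and Doc: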

using System;
using System.Text.RegularExpressions;

class RegParser
{
    public string FileName { get; set; }
    public string Doc { get; set; }
    public string Txt { get; set; }

    // Verbatim string: the regex engine, not the C# compiler, reads \u0022 as a double quote.
    private const string Pattern = @"
    Replace
    \(
    (?<filename>\w+)
    ,\s*
    \u0022                # double quote
    \.
    (?<txt>\w+)
    \u0022
    ,\s*
    \u0022
    \.
    (?<doc>\w+)
    ";

    // IgnorePatternWhitespace lets the pattern span several lines and carry # comments.
    private static readonly Regex regex = new Regex(Pattern,
           RegexOptions.Singleline
           | RegexOptions.ExplicitCapture
           | RegexOptions.CultureInvariant
           | RegexOptions.IgnorePatternWhitespace
           | RegexOptions.Compiled
           );

    public void Parse(string text)
    {
        Console.WriteLine("text: {0}", text);
        Match m = regex.Match(text);
        if (!m.Success) return;   // no match: leave the properties unchanged
        FileName = m.Groups["filename"].Value;
        Doc = m.Groups["doc"].Value;
        Txt = m.Groups["txt"].Value;
    }
}

Try it

sample output:

 text: Replace(FileName, ".txt", ".doc")
 FileName: FileName
 Doc: doc
 Txt: txt
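
Since the stated goal is to emit a PowerShell replace statement, here is a minimal sketch of that last step. It rests on assumptions that are not in the question: the desired output is taken to be a call to the .NET string method, `$FileName.Replace('.txt', '.doc')`, and the names `PowerShellReplaceSketch` and `ToPowerShell` are hypothetical. Unlike the pattern above, this one keeps the leading dot inside the captured value and also matches the closing quote and parenthesis.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PowerShellReplaceSketch
{
    // Verbatim literal: each "" stands for one literal double quote in the pattern.
    private static readonly Regex ReplaceCall = new Regex(
        @"Replace\(\s*(?<name>\w+)\s*,\s*""(?<from>[^""]*)""\s*,\s*""(?<to>[^""]*)""\s*\)");

    // Hypothetical helper: returns the PowerShell statement, or null when the text does not match.
    public static string ToPowerShell(string text)
    {
        Match m = ReplaceCall.Match(text);
        if (!m.Success) return null;

        // Assumption: emit single-quoted PowerShell strings; an embedded ' is escaped by doubling it.
        string from = m.Groups["from"].Value.Replace("'", "''");
        string to = m.Groups["to"].Value.Replace("'", "''");
        return "$" + m.Groups["name"].Value + ".Replace('" + from + "', '" + to + "')";
    }

    public static void Main()
    {
        // prints: $FileName.Replace('.txt', '.doc')
        Console.WriteLine(ToPowerShell("Replace(FileName, \".txt\", \".doc\")"));
    }
}
```

If the target should be PowerShell's `-replace` operator instead, note that it treats its first argument as a regular expression, so the dot would need escaping there.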
1
votes

A sample regex could look like

^Replace\((\w+)\s*,\s*("[^"]*")\s*,\s*("[^"]*")\)$

See the regex demo

To define it in C#, you can use a regular string literal (which supports escape sequences such as \n for a newline). There you must escape each double quote as \" and double every backslash, because the regex engine needs to receive a literal backslash in \d or \. to match a digit or a dot:

var pattern = "^Replace\\((\\w+)\\s*,\\s*(\"[^\"]*\")\\s*,\\s*(\"[^\"]*\")\\)$";

Or use a verbatim string literal, which does not process escape sequences (@"\d" is a two-character string, \ and d, which the regex engine reads as a digit). That avoids doubling the backslashes, but a literal double quote must be written as two double quotes:

var pattern = @"^Replace\((\w+)\s*,\s*(""[^""]*"")\s*,\s*(""[^""]*"")\)$";
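
To check that the two literal forms really describe the same pattern, a small sketch like the following can help. The class and method names (`ReplacePatternDemo`, `Capture`) are made up for illustration; only the pattern itself comes from this answer. Note that groups 2 and 3 include the surrounding double quotes, because the quotes sit inside the capturing parentheses.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplacePatternDemo
{
    // Regular literal: backslashes and double quotes are escaped for the C# compiler.
    public const string Regular = "^Replace\\((\\w+)\\s*,\\s*(\"[^\"]*\")\\s*,\\s*(\"[^\"]*\")\\)$";

    // Verbatim literal: backslashes are taken as-is, double quotes are doubled.
    public const string Verbatim = @"^Replace\((\w+)\s*,\s*(""[^""]*"")\s*,\s*(""[^""]*"")\)$";

    // Returns the three captured arguments, or null when the input does not match.
    public static string[] Capture(string input)
    {
        Match m = Regex.Match(input, Verbatim);
        if (!m.Success) return null;
        return new[] { m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value };
    }

    public static void Main()
    {
        // Both literals compile to the same string, so this prints True.
        Console.WriteLine(Regular == Verbatim);

        // prints: FileName | ".txt" | ".doc"
        string[] parts = Capture("Replace(FileName, \".txt\", \".doc\")");
        Console.WriteLine(string.Join(" | ", parts));
    }
}
```

To capture the values without the quotes, the parentheses can be moved inside them, for example `""([^""]*)""` in the verbatim form.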
0
votes

Try this

 string t = @" name ""test""";
 Console.WriteLine(t);

The output is ==> name "test"