1
votes

Good day,

I'm attempting to parse a text file containing several Windows paths; I'd like to use regular expressions if possible, and I'm using VB.NET.

The file is formatted somewhat like so:

M - Network Mode  
C:\Client\System\ - System Path
C:\Client\Products\ - Product Path
C:\Client\Util\ - Utility Path
C:\PROG\ - Program Path

Et cetera. The first line contains a single letter followed by the "description": a space, a hyphen, a space, and then a description of the field. Each successive line in the file contains a Windows path (always with a trailing backslash), likewise followed by the hyphen and a description. The entire file is usually no more than 30 lines.

At first, I thought to read the text of the file line-by-line and use VB's Split() method to separate the path and the description, storing the paths in one array and the descriptions in another. Ideally, though, I'd like to use a regular expression to parse the file for the paths and for the text after the hyphen. I'm relatively inexperienced with regex; what would be the best way to go about it? Is there perhaps a way to have the Regex return a collection of all matches, e.g., one for all the file paths and another for all the text after the hyphen?

Thanks.

2 Answers

0
votes

This seems to work on your test data:

(?<Path>.+\\)\s\-\s(?<Description>.+)

Usage:

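Imports System.Text.RegularExpressions

' Path: everything up to and including the last backslash that is followed by " - ".
' Description: the text after that separator.
Private oRegEx As Regex = New Regex("(?<Path>.+\\)\s\-\s(?<Description>.+)", RegexOptions.Compiled)
Public Sub DoTheMatching()
   Dim tInputContent As String = String.Empty   ' Fill this with your file contents
   Dim tPath As String = String.Empty
   Dim tDescription As String = String.Empty
   ' Each match corresponds to one "path - description" line
   For Each tMatch As Match In oRegEx.Matches(tInputContent)
     tPath = tMatch.Groups("Path").Value
     tDescription = tMatch.Groups("Description").Value
   Next
End Sub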

I did not compile this, so there may be typos.
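
To get the two collections the question asks for, the loop above can append each match to a list instead of overwriting tPath and tDescription. Below is a minimal sketch along those lines. The module name PathFileParser, the ParsePaths routine, and the inline sample text are made up for illustration. The leading ^, the [^\r\n]+ in the description group, and RegexOptions.Multiline are additions of mine to keep a trailing carriage return out of the description when the file uses Windows line endings. Treat it as a starting point rather than tested production code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module PathFileParser
    ' With Multiline, ^ anchors each match at the start of a line.
    ' [^\r\n]+ stops the description before any line terminator.
    Private ReadOnly LineRegex As New Regex("^(?<Path>.+\\)\s-\s(?<Description>[^\r\n]+)", RegexOptions.Multiline)

    ' Fills two parallel lists: paths(i) belongs to descriptions(i).
    Public Sub ParsePaths(content As String, paths As List(Of String), descriptions As List(Of String))
        For Each m As Match In LineRegex.Matches(content)
            paths.Add(m.Groups("Path").Value)
            descriptions.Add(m.Groups("Description").Value)
        Next
    End Sub

    Sub Main()
        ' Sample text copied from the question; in practice this would
        ' come from something like System.IO.File.ReadAllText(...).
        Dim sampleLines As String() = New String() {
            "M - Network Mode  ",
            "C:\Client\System\ - System Path",
            "C:\Client\Products\ - Product Path",
            "C:\Client\Util\ - Utility Path",
            "C:\PROG\ - Program Path"}
        Dim content As String = String.Join(Environment.NewLine, sampleLines)

        Dim paths As New List(Of String)()
        Dim descriptions As New List(Of String)()
        ParsePaths(content, paths, descriptions)

        For i As Integer = 0 To paths.Count - 1
            Console.WriteLine(paths(i) & " => " & descriptions(i))
        Next
    End Sub
End Module
```

With the sample from the question, the "M - Network Mode" line is skipped because it contains no backslash, so both lists end up with four entries each.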

0
votes

This (very simple, and subject to enhancement) regex does what you want:

^(.+\\) - (.+)$

You can apply it to each line in your file, and then read the two capture groups (\1 and \2, i.e. Groups(1) and Groups(2) on the Match object in .NET) to get the path (including the trailing backslash) and the description.

This should work correctly even with "strange" filename entries like this:

C:\Some - \ - WeirdPath\ - Description

(\1 returns "C:\Some - \ - WeirdPath\" and \2 returns "Description").
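
Applied line by line in VB.NET, this could look roughly like the sketch below. The module name PerLineDemo and the hard-coded sample lines are placeholders of mine; in a real program the lines would more likely come from System.IO.File.ReadAllLines, which strips the line terminators so that $ matches at the end of each entry.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PerLineDemo
    Sub Main()
        Dim lineRegex As New Regex("^(.+\\) - (.+)$")

        ' Placeholder input: one header line, one normal entry, one "strange" entry.
        Dim lines As String() = New String() {
            "M - Network Mode",
            "C:\PROG\ - Program Path",
            "C:\Some - \ - WeirdPath\ - Description"}

        For Each line As String In lines
            Dim m As Match = lineRegex.Match(line)
            If m.Success Then
                ' Groups(1) is the path, Groups(2) is the description.
                Console.WriteLine("Path: " & m.Groups(1).Value)
                Console.WriteLine("Description: " & m.Groups(2).Value)
            Else
                ' Lines without a backslash, such as the header, do not match.
                Console.WriteLine("Skipped: " & line)
            End If
        Next
    End Sub
End Module
```

Because .+ is greedy, the first group extends to the last backslash that is followed by " - ", which is why the "strange" entry still splits at the final separator.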