I am trying to write a function for Excel 2010 that takes a cell of unstructured text, looks for something called an SDI value and, if found, returns that number. The SDI value will appear as sdi ####. I want to return sdi and the specific digits that follow it, so if the cell contains "some text sdi 1234 some more text", the function will return sdi 1234.

This is my function:

Function SdiTest(LookIn As String) As String
  Dim temp As String
  Dim STA As Object
  temp = ""

  Set SDI = CreateObject("VBScript.RegExp")
  SDI.IgnoreCase = True
  SDI.Pattern = "sdi [1-9]*"
  SDI.Global = True

  If SDI.Test(LookIn) Then
    temp = SDI.Execute(LookIn)
  End If

  SdiTest = temp
End Function

If there is no sdi number, it never enters the If statement and dutifully returns the empty string. If there is an sdi number, I get #VALUE!

What am I missing?

Yes, VBScript is enabled. Additionally, I am finding it frustrating to use regex in VBA, and hard to find useful info online. Links to good online resources would be appreciated.

Thank you


1 Answer


You need to access the matches in order to get at the SDI number. Here is a function that will do it (assuming there is only 1 SDI number per cell).

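For the regex, I used "sdi followed by a space and one or more digits" (\d+). You had "sdi followed by a space and zero or more digits" ([1-9]*), which also matches a bare "sdi " with nothing after it and stops at the first 0, because [1-9] excludes zero. You can change the + to * in my pattern if you want to allow zero digits again.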

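Function ExtractSDI(ByVal text As String) As String

    Dim result As String
    Dim allMatches As Object
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")

    ' "sdi", a space, then one or more digits; the parentheses capture the match
    RE.Pattern = "(sdi \d+)"
    RE.Global = True
    RE.IgnoreCase = True
    ' Execute returns a collection of matches, so it is assigned with Set
    Set allMatches = RE.Execute(text)

    ' Return the first captured group of the first match, if there is one
    If allMatches.Count <> 0 Then
        result = allMatches.Item(0).SubMatches.Item(0)
    End If

    ExtractSDI = result

End Function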
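
The likely cause of the #VALUE! in the original function is that Execute returns a MatchCollection object rather than a string, so assigning its result directly to a String variable fails at run time, and Excel reports that failure as #VALUE!. As a minimal sketch (the function name SdiValue is made up for illustration), the matched text can also be read from the Value property of the first match, in which case the parentheses in the pattern are not needed:

```vba
' Hypothetical variant of ExtractSDI that reads Match.Value
' instead of a captured sub-match.
Function SdiValue(ByVal text As String) As String
    Dim RE As Object
    Dim allMatches As Object

    Set RE = CreateObject("VBScript.RegExp")
    RE.Pattern = "sdi \d+"      ' no capture group needed here
    RE.IgnoreCase = True
    RE.Global = False           ' stop after the first match

    ' Execute returns a MatchCollection, so it must be assigned with Set
    Set allMatches = RE.Execute(text)

    ' Leave the result as an empty string when nothing matched
    If allMatches.Count > 0 Then
        SdiValue = allMatches.Item(0).Value
    End If
End Function
```

With "some text sdi 1234 some more text" in A1, =SdiValue(A1) should return sdi 1234.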

If a cell may contain more than one SDI number that you want to extract, here is my RegexExtract function. You can pass in a third parameter to separate the matches (for example, to comma-separate them), and you enter the pattern in the function call itself:

Ex) =RegexExtract(A1, "(sdi \d+)", ", ")

Here it is:

Function RegexExtract(ByVal text As String, _
                      ByVal extract_what As String, _
                      Optional seperator As String = "") As String

    Dim i As Long, j As Long
    Dim result As String
    Dim allMatches As Object
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")

    ' The pattern comes from the caller; Global = True finds every match
    RE.Pattern = extract_what
    RE.Global = True
    Set allMatches = RE.Execute(text)

    ' Append every captured group of every match, each preceded by the separator
    For i = 0 To allMatches.Count - 1
        For j = 0 To allMatches.Item(i).SubMatches.Count - 1
            result = result & seperator & allMatches.Item(i).SubMatches.Item(j)
        Next j
    Next i

    ' Strip the leading separator added by the loop
    If Len(result) <> 0 Then
        result = Right(result, Len(result) - Len(seperator))
    End If

    RegexExtract = result

End Function

*Please note that I have taken "RE.IgnoreCase = True" out of my RegexExtract, but you could add it back in, or even add it as an optional 4th parameter if you like.
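
As a rough sketch of the optional 4th parameter mentioned in the note above, the function could look like this. The name RegexExtractCI and the parameter name ignore_case are invented for illustration; the rest follows RegexExtract unchanged, and the default of False keeps matching case-sensitive unless the caller asks otherwise.

```vba
' Hypothetical variant of RegexExtract with an optional case-insensitivity flag.
Function RegexExtractCI(ByVal text As String, _
                        ByVal extract_what As String, _
                        Optional seperator As String = "", _
                        Optional ignore_case As Boolean = False) As String
    Dim i As Long, j As Long
    Dim result As String
    Dim allMatches As Object
    Dim RE As Object

    Set RE = CreateObject("VBScript.RegExp")
    RE.Pattern = extract_what
    RE.Global = True
    RE.IgnoreCase = ignore_case     ' the only change from RegexExtract
    Set allMatches = RE.Execute(text)

    ' Append every captured group of every match, each preceded by the separator
    For i = 0 To allMatches.Count - 1
        For j = 0 To allMatches.Item(i).SubMatches.Count - 1
            result = result & seperator & allMatches.Item(i).SubMatches.Item(j)
        Next j
    Next i

    ' Strip the leading separator added by the loop
    If Len(result) <> 0 Then
        result = Right(result, Len(result) - Len(seperator))
    End If

    RegexExtractCI = result
End Function
```

Ex) =RegexExtractCI(A1, "(sdi \d+)", ", ", TRUE)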