2
votes

I am trying to extract a portion of text that is enclosed in parentheses within a string:

"Dominion Diamond Corporation (DDC) "

(I want to extract DDC).

Perusing the interwebs suggests that the regular expression

"\([^)]*\)"

will be useful.

I try the following:

ret = Regex(regExp)
match(ret, "Dominion Diamond Corporation (DDC) ")

Output:

RegexMatch("Dominion Diamond Corporation (DDC", 1="Dominion Diamond Corporation (DDC")

However, when I enter the regular expression into the match function directly:

match(r"\([^)]*\)"t, "Dominion Diamond Corporation (DDC) ")

The output is:

RegexMatch("(DDC)")

Why do these two calls behave differently? How do I pass an arbitrary regular expression, held in a string variable, as the first argument to match?

1
r"string" usually means a raw string (e.g. in Python). I suspect it was stripping the backslashes before. I have no clue why there's a t at the end, though. - Laurel
Looking back at my notebook, I think the "t" was a copy/paste error. - pyrex

1 Answer

5
votes

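As @Laurel suggests in a comment, the single backslashes weren't making it through to the match function. In an ordinary string literal the backslash is an escape character, so it is consumed before Regex ever sees the pattern. Doubling each backslash puts a literal backslash into the string: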

julia> rstring = "\\([^)]*\\)"
"\\([^)]*\\)"

julia> match(Regex(rstring), "Dominion Diamond Corporation (DDC) ")
RegexMatch("(DDC)")
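If doubling the backslashes gets hard to read, a raw string literal or a regex literal may be easier. The following is a minimal sketch rather than the only way to do it; it assumes a current Julia version, and the variable names (text, escaped, rawpat, literal, stripped, inner) are made up for illustration. It also reproduces what appears to have happened in the question: without the backslashes, the parentheses become a capture group instead of literal characters.

```julia
# The string from the question.
text = "Dominion Diamond Corporation (DDC) "

# Three ways to write the same pattern, i.e. the characters \([^)]*\)
escaped = "\\([^)]*\\)"     # ordinary string: every backslash doubled
rawpat  = raw"\([^)]*\)"    # raw string literal: backslashes kept as typed
literal = r"\([^)]*\)"      # regex literal: no Regex(...) call needed

m1 = match(Regex(escaped), text)
m2 = match(Regex(rawpat), text)
m3 = match(literal, text)
println(m1.match)           # (DDC)

# With the backslashes lost, ( and ) form a capture group, so the
# pattern matches everything up to the first closing parenthesis.
stripped = match(r"([^)]*)", text)
println(stripped.match)     # Dominion Diamond Corporation (DDC

# Adding a capture group inside the escaped parentheses returns just
# the text between them.
inner = match(Regex(raw"\(([^)]*)\)"), text)
println(inner.captures[1])  # DDC
```

So if the pattern arrives in a variable, Regex(pattern) works as long as the string itself really contains the backslashes; how the string was written in source code is what matters.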