13
votes

I need a regex that gives me the word before and the word after a specific word, including the search word itself.

For example, "This is some dummy text to find a word" should give me the string "dummy text to" when "text" is my search word.

Also, the string provided may contain the search word more than once, so I must be able to retrieve all matches in that string with C#.

For example, "This is some dummy text to find a word in a string full with text and words" should return:

  • "dummy text to"
  • "with text and"

EDIT: Actually, I need all the matches that contain the search word returned. A few examples:

  • "Text is too read." -> "Text is"
  • "Read my text." -> "my text"
  • "This is a text-field example" -> "a text-field example"

5
And what if the string is "I need to text text to dummy"? Should it return "to text text" and "text text to"? - Jim Mischel
indeed, just the word before and after my search word, whatever it may be :) - PitAttack76
What about "one text two text three"? i.e. Do you need to handle overlapping matches? - ridgerunner

5 Answers

17
votes

EDIT:

If you want to grab all the content from the space before the first word to the space after the search word, use:

(?:\S+\s)?\S*text\S*(?:\s\S+)?

A simple test:

string input = @"
    This is some dummy text to find a word in a string full with text and words
    Text is too read
    Read my text.
    This is a text-field example
    this is some dummy [email protected] to read";

var matches = Regex.Matches(
    input,
    @"(?:\S+\s)?\S*text\S*(?:\s\S+)?",
    RegexOptions.IgnoreCase
);

The matches are:

dummy text to
with text and
Text is
my text.
a text-field example
dummy [email protected] to
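The comments on the question raise overlapping cases such as "I need to text text to dummy". A single pattern that consumes the word after each hit cannot reuse that word as the "before" word of the next hit. One possible way around this, sketched below, is to swap the single pattern for a two-step approach: tokenize with `\S+`, then look at each token's neighbors. The class name `NeighborWords` and the method `Find` are hypothetical names chosen for illustration, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NeighborWords
{
    // Hypothetical helper: returns "before word after" for every
    // whitespace-separated token that contains searchWord.
    public static List<string> Find(string input, string searchWord)
    {
        // Tokenize on runs of non-whitespace characters.
        MatchCollection tokens = Regex.Matches(input, @"\S+");
        var results = new List<string>();

        for (int i = 0; i < tokens.Count; i++)
        {
            // Case-insensitive containment test, similar in spirit to RegexOptions.IgnoreCase.
            if (tokens[i].Value.IndexOf(searchWord, StringComparison.OrdinalIgnoreCase) < 0)
                continue;

            // Collect the previous token (if any), the hit, and the next token (if any).
            var parts = new List<string>();
            if (i > 0) parts.Add(tokens[i - 1].Value);
            parts.Add(tokens[i].Value);
            if (i < tokens.Count - 1) parts.Add(tokens[i + 1].Value);

            results.Add(string.Join(" ", parts));
        }

        return results;
    }
}
```

Calling `NeighborWords.Find("I need to text text to dummy", "text")` yields "to text text" and "text text to", because each token is inspected independently and nothing is consumed between hits. Note that the neighbors are rejoined with a single space, so the original whitespace is not preserved.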
7
votes
//I prefer this style for readability

string pattern = @"(?<before>\w+) text (?<after>\w+)";
string input = "larry text bob fred text ginger fred text barney";
MatchCollection matches = Regex.Matches(input, pattern);

for (int i = 0; i < matches.Count; i++)
{
    Console.WriteLine("before:" + matches[i].Groups["before"].ToString());
    Console.WriteLine("after:" + matches[i].Groups["after"].ToString());
} 

/* Output:
before:larry
after:bob
before:fred
after:ginger
before:fred
after:barney
*/
2
votes
/[A-Za-z'-]+ text [A-Za-z'-]+/

Should work in most cases, including hyphenated and compound words.
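As a rough sketch of how this pattern might be used from C#: the surrounding slashes are regex delimiters from other languages and are not part of a .NET pattern, and the search word can be passed through `Regex.Escape` so that characters such as `+` or `.` are treated literally. The class `WordContext` and its `Find` method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordContext
{
    // Hypothetical helper: builds the pattern around an arbitrary search word.
    public static List<string> Find(string input, string searchWord)
    {
        // Escape the search word so regex metacharacters in it match literally.
        string word = Regex.Escape(searchWord);

        // Same shape as the pattern above, without the /.../ delimiters.
        string pattern = @"[A-Za-z'-]+ " + word + @" [A-Za-z'-]+";

        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
        {
            results.Add(m.Value);
        }
        return results;
    }
}
```

With the question's sample sentence and "text" as the search word, this returns "dummy text to" and "with text and". Because the pattern requires a word and a single space on both sides, it returns nothing for inputs like "Read my text." or "Text is too read", so it does not cover the cases listed in the question's EDIT.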

1
votes
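([A-Za-z]+) text ([A-Za-z]+)

would do nicely. Note that the range [A-z] is not equivalent: it also matches the characters [, \, ], ^, _ and ` that sit between Z and a in ASCII.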

0
votes

[a-zA-Z]+\stext\s[a-zA-Z]+

I believe this will work nicely