0
votes

How to match a pattern with a regular expression only after SomeText

Suppose I want to find email addresses. Then I should get only:

[email protected]
[email protected]

But I should not get the email addresses that appear above SomeText. I want to do this with a regex in JavaScript.

I have a text file that looks something like this:

In theoretical computer science and formal language theory, a regular expression (sometimes called a rational expression)[1][2] is a sequence of characters that define a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations. The concept arose in the 1950s, when the American [email protected] mathematician Stephen Kleene formalized the description of a regular language, and came into common use with the Unix text processing utilities ed, an editor, and grep, a filter.

[email protected]

SomeText

name1/occupation1/state1

[email protected]

Regexps are so useful in computing that the various systems to specify regexps have evolved to provide both a basic and extended standard for the grammar and syntax; modern regexps heavily augment the standard. Regexp processors are found in several search engines, search and replace dialogs of several word processors and text editors, and in the command lines of text processing utilities, such as sed and AWK.

name2/occupation2/state2

[email protected]

4
Hint: capturing group - anubhava
But how do I get all the results after SomeText? Please explain. - Precision
Use indexOf and substring (or split) to get the text after SomeText, and then match what you need. - Wiktor Stribiżew
@Gopalkrishnasudhanshu you need to be more specific both about your problem (what makes "SomeText" special, while the following text "name1/occupation1..." is ignored) and about what you have already tried. - Michael Lorton
@Malvolio SomeText is a specific piece of text, like a sub-header. I have many matches both below and above it, but I am interested only in the matching emails below SomeText. - Precision

4 Answers

1
votes

I haven't found a way to get both email addresses after the "SomeText", so this is my suggestion.

Strip off all of the text up to and including the keyword. Then just use a simpler regex for email addresses. The regex below is the 'official' one from emailregex, but something like "([\w\d]+@\w+\.\w+)" would work fairly well and is a little easier to understand :)

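// Skip past the whole marker, not just its first character.
str = str.substring(str.indexOf("SomeText") + "SomeText".length);
// With the m flag, ^ and $ match per line, so only lines that consist solely of an address are returned.
results = str.match(/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9_][-a-z0-9_]*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z][a-z])|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(:[0-9]{1,5})?$/mg);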
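For what it's worth, a single-pattern version does seem possible in engines that support ES2018 lookbehind (recent Node.js and current browsers; older engines will throw a syntax error). The sketch below is only an illustration: the sample text and addresses are made up, and the address pattern is deliberately simple rather than the long one above.

```javascript
// Hypothetical sample text; the addresses are placeholders, not the ones from the question.
const sample = [
  "intro text with early@example.com inside",
  "SomeText",
  "name1/occupation1/state1",
  "first@example.com",
  "more prose",
  "second@example.com",
].join("\n");

// (?<=SomeText[\s\S]*) asserts that "SomeText" occurs somewhere before the match.
// [\s\S] is used instead of "." so that the gap may span line breaks.
const afterMarker = /(?<=SomeText[\s\S]*)[^\s@]+@[^\s@]+\.[^\s@]+/g;

const found = sample.match(afterMarker) || [];
console.log(found);
// ["first@example.com", "second@example.com"]
```

The lookbehind rescans the preceding text for every candidate position, so on large inputs stripping the prefix first is likely to be faster.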
1
votes

You can use replace with a callback:

var emails=[];

s.replace(/\bSomeText([\s\S]+)$/, function($0, $1) {
   // match() returns null when nothing is found, so fall back to an empty array
   ($1.match(/[^\s@]+@\S+/g) || []).forEach(function(e){ emails.push(e); });
   return $0;
});

console.log(emails);
// ["[email protected]", "[email protected]"]

PS: The regex used here to find email addresses, [^\s@]+@\S+, is a pretty basic one; real email addresses can be much more complicated.
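A variation on the same idea, sketched under the assumption that only the first occurrence of the marker matters: rather than routing the work through a replace callback, set lastIndex on a global regex so that scanning starts right after the marker. The function name emailsAfterMarker and the sample addresses are made up for illustration.

```javascript
// Hypothetical helper: scan for addresses starting right after the first marker.
function emailsAfterMarker(text, marker) {
  const start = text.indexOf(marker);
  if (start === -1) return []; // marker missing: nothing counts as "after" it

  const pattern = /[^\s@]+@\S+/g; // same basic pattern as in the answer above
  pattern.lastIndex = start + marker.length; // a global regex resumes from lastIndex

  const emails = [];
  let m;
  while ((m = pattern.exec(text)) !== null) {
    emails.push(m[0]);
  }
  return emails;
}

console.log(emailsAfterMarker("a@example.com SomeText b@example.com c@example.org", "SomeText"));
// ["b@example.com", "c@example.org"]
```

This avoids copying the tail of the string, which may matter for very large inputs.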

1
votes

Your solution:

var string   = '\nIn theoretical computer science and formal language theory, a regular expression (sometimes called a rational expression)[1][2] is a sequence of characters that define a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations. The concept arose in the 1950s, when the American [email protected] mathematician Stephen Kleene formalized the description of a regular language, and came into common use with the Unix text processing utilities ed, an editor, and grep, a filter.\n\[email protected]\n\nSomeText\n\nname1/occupation1/state1\n\[email protected]\n\nRegexps are so useful in computing that the various systems to specify regexps have evolved to provide both a basic and extended standard for the grammar and syntax; modern regexps heavily augment the standard. Regexp processors are found in several search engines, search and replace dialogs of several word processors and text editors, and in the command lines of text processing utilities, such as sed and AWK.\n\nname2/occupation2/state2\n\[email protected]';
var someText = 'SomeText';
var regExp   = new RegExp('\\S+@\\S+\\.\\S+','g');
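// Keep everything after the first marker (even if the marker repeats); fall back to [] when nothing matches.
var emails   = string.split(someText).slice(1).join(someText).match(regExp) || [];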
console.log(emails);
// ["[email protected]", "[email protected]"]

Don't forget to substitute your own RegExp for matching emails. I've provided only the simplest example.
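To illustrate why the choice of pattern matters, here is a minimal sketch in which the caller passes the address pattern in. The helper name matchAfter, the sample text, and the second, stricter pattern are assumptions made for this example, not part of the answer.

```javascript
// Hypothetical helper: the caller supplies the address pattern.
function matchAfter(text, marker, pattern) {
  const start = text.indexOf(marker);
  if (start === -1) return [];
  // Rebuild the pattern with the g flag so match() returns every hit.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const globalPattern = new RegExp(pattern.source, flags);
  return text.slice(start + marker.length).match(globalPattern) || [];
}

const text = "x@example.com\nSomeText\nfirst@example.com, SECOND@EXAMPLE.ORG.";

// The simplest pattern also swallows trailing punctuation.
console.log(matchAfter(text, "SomeText", /\S+@\S+\.\S+/));
// ["first@example.com,", "SECOND@EXAMPLE.ORG."]

// A somewhat stricter (still not exhaustive) pattern avoids that.
console.log(matchAfter(text, "SomeText", /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/i));
// ["first@example.com", "SECOND@EXAMPLE.ORG"]
```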

0
votes

You can do something like this:

    var str='your text from which you need to find the email ids';

    str=str.replace(/\r?\n/g,'##'); // join all lines into one, because "." does not match line breaks

    str=str.replace(/.*sometext(.*)/i,"$1"); // remove everything up to and including the last "sometext"

    var emails=str.match(/[A-Za-z0-9]+@[A-Za-z]+\.[A-Za-z]+/g); // null if no address is found
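The line-joining step can probably be skipped by using [\s\S], which matches line breaks as well. The following is a sketch with made-up input. Note one deliberate difference: the lazy quantifier cuts at the first marker, whereas the greedy .* above cuts at the last one.

```javascript
// Hypothetical input with mixed line endings; the addresses are placeholders.
const input = "early@example.com\r\nSomeText\nname1/occupation1/state1\nfirst@example.com\r\nsecond@example.org";

// ^[\s\S]*?sometext removes everything up to and including the FIRST marker
// (lazy quantifier); [\s\S] matches line breaks, which "." does not.
const tail = input.replace(/^[\s\S]*?sometext/i, "");

// match() returns null when nothing is found, hence the fallback to [].
const ids = tail.match(/[A-Za-z0-9]+@[A-Za-z]+\.[A-Za-z]+/g) || [];
console.log(ids);
// ["first@example.com", "second@example.org"]
```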