1
votes

I have a basic string and would like to match only specific characters between the brackets.

Base string: This is a test string [more or less]

Regex: capturing all r's and e's in the whole string works just fine:

(r|e) 

=> This is a test string [more or less]

Now I want to wrap my regex in bracket delimiters so that it matches only the r's and e's between the brackets, but unfortunately this doesn't work:

\[(r|e)\]

Expected result should be : more or less

Can someone explain?

Edit: the problem is very similar to this one: Regular Expression to find a string included between two characters while EXCLUDING the delimiters

The difference is that I don't want to get the whole string between the brackets.

Follow-up problem

base string = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed'

I need a regex that finds the non-ASCII characters äöü so that I can replace them, but only in the link:...] substring, which starts with the word link: and ends with a ] char.

The result string will look like this:

result string = 'this is a link:/en/test/apfel/ohr[MyLink_with_aou] BREAK äöü is now allowed'

The regex /[äöü]+(?=[^\]\[]*])/g from the solution in the comments only delivers the äöü chars between the two brackets.

I know that the regex uses a lookahead with a character class, but I wonder why this one does not work:

/link:([äöü]+(?=[^\]\[]*])/

thanks

Define "doesn't work." - T.J. Crowder
"Expected result should be : more or less" It's not clear what you mean by that. Do you mean the result of a single invocation of exec? A loop? - T.J. Crowder
It simply does not match. I'm using this regex to substitute all r's and e's in the given string. The first one works fine for the whole string, but the second one, which should only substitute between the brackets, does not work. Checking with regex101.com gives: 'Your regular expression does not match the subject string.' - mg.vmuc
Since there is no PCRE-like (*SKIP)(?!) nor infinite width lookbehind support in JS regex, you can use a usual hack - /[re]+(?=[^\][]*])/g. If you need more precision, match with /\[[^\][]*]/g and then do what you need to the es and rs only inside the matches. - Wiktor Stribiżew
You expect too much from a regex in JS. Don't. Without any details on what you need to do, further chatting makes no sense. - Wiktor Stribiżew

1 Answer

0
votes

You can use the following solution: match everything from link: up to the next ], and replace your characters only inside the matched substrings, using a replace callback:

var hashmap = {"ä":"a", "ö":"o", "ü":"u"};
var s = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed';
var res = s.replace(/\blink:[^\]]*/g, function(m) {  // m = link:/en/test/äpfel/öhr[MyLink_with_äöü (the closing ] is not part of the match)
  return m.replace(/[äöü]/g, function(n) { // n = ä, then ö, then ü
    return hashmap[n];                     // each time replaced with the hashmap value
  });
});
console.log(res);

Pattern details:

  • \b - a leading word boundary
  • link: - whole word link with a : after it
  • [^\]]* - zero or more chars other than ] (a [^...] is a negated character class that matches any char/char range(s) but the ones defined inside it).
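
As for why the attempt in the question finds nothing, here is a small sketch (variable names are illustrative and not from the question). As written, /link:([äöü]+(?=[^\]\[]*])/ has an unclosed group, so it should not even compile. Assuming the intended pattern closes the group right after [äöü]+, it still cannot match, because [äöü]+ has to begin immediately after link:, and the next character there is a /.

```javascript
var s = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed';

// 1) The pattern exactly as posted: the first "(" is never closed.
var asWritten;
try {
  new RegExp('link:([äöü]+(?=[^\\]\\[]*])');
  asWritten = 'compiled';
} catch (e) {
  asWritten = e.name; // SyntaxError
}

// 2) Assumed intent (group closed after the character class).
//    "link:" must be directly followed by an umlaut, but it is followed by "/".
var attempt = /link:([äöü]+)(?=[^\]\[]*])/;
var attemptMatches = attempt.test(s); // false

// 3) The pattern from this answer consumes everything from "link:" up to the first "]".
var matched = s.match(/\blink:[^\]]*/g);

console.log(asWritten, attemptMatches, matched);
```

In other words, a prefix such as link: in front of a character class anchors the class to that position; it does not mean "somewhere after link:". That is why matching the whole substring first and replacing inside the callback is the simpler route.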

Also, see Efficiently replace all accented characters in a string?
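
The same two-step idea also covers the first part of the question. \[(r|e)\] only matches a bracket pair that contains exactly one r or e, such as [r], so it finds nothing in [more or less]. Below is a minimal sketch of the two options mentioned in the comments. The upper-casing is only a placeholder for whatever substitution is actually needed, and the second test string is made up to show where the lookahead hack is less precise.

```javascript
var base = 'This is a test string [more or less]';

// The original attempt: "[" + one r or e + "]", which never occurs here.
var originalMatches = /\[(r|e)\]/.test(base); // false

// Option 1: match each bracketed substring, then replace only inside that match.
function insideBrackets(str) {
  return str.replace(/\[[^\][]*]/g, function (m) {
    return m.replace(/[re]/g, function (c) {
      return c.toUpperCase(); // placeholder substitution
    });
  });
}

// Option 2: the lookahead hack. It only checks that a "]" follows with no
// bracket in between; it does not check for an opening "[" before the match.
function viaLookahead(str) {
  return str.replace(/[re]+(?=[^\][]*])/g, function (c) {
    return c.toUpperCase();
  });
}

console.log(insideBrackets(base)); // This is a test string [moRE oR lEss]
console.log(viaLookahead(base));   // This is a test string [moRE oR lEss]

// A made-up input with a stray "]" shows the difference between the two.
var stray = 'tree] and [here]';
console.log(insideBrackets(stray)); // tree] and [hERE]
console.log(viaLookahead(stray));   // tREE] and [hERE]
```

For well-formed input both give the same result; option 1 is the safer choice when unbalanced brackets can occur.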