3
votes

I don't have experience with Regex and I'm asking for your help.

I need a regex to capture the JWT inside the following string:

"contextJwt": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJIZWxsbyB5b3UiLCJuYW1lIjoiV2h5IGFyZSB5b3UgY2hlY2tpbmcgbXkgdG9rZW4_ICggzaHCsCDNnMqWIM2hwrApIiwiaWF0IjoxNTE2MjM5MDIyfQ.yAP0xiTwp6vqIYbLKLVBRv-gTyMvU17rT3H8uErLjHA"


Thanks for your time

3
Is the JWT (and the string you shared) arriving as part of a JSON payload? I.e. could you just decode the JSON and read the value out of the contextJwt key? - Everett
It arrives in HTML, so it won't work. - Grampet
What language are you using to parse the HTML? - Everett
No language, I make the request to the URL and it comes in a string. I need to capture the JWT after "contextJwt". - Grampet
What language are you going to use to process the regular expression? Saying "No language" to me means that you (a human) can look at the text and simply copy and paste the JWT out of it, no coding required. - Everett

3 Answers

8
votes

I created a regex which might not be the most elegant, but it appears to work.

(^[A-Za-z0-9-_]*\.[A-Za-z0-9-_]*\.[A-Za-z0-9-_]*$)

A more concise version could be:

(^[\w-]*\.[\w-]*\.[\w-]*$)

However, I believe \w would also match non-Latin characters in regex engines where it is Unicode-aware, and those characters are not valid in a JWT.
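
Both patterns are anchored with ^ and $, so they check whether a whole string has the three-part JWT shape rather than pulling a token out of surrounding text. A minimal JavaScript sketch of that distinction follows; the helper name `looksLikeJwt` and the sample strings are made up for illustration, and the check covers only the character set and dot layout, not the signature or the decoded contents.

```javascript
// Anchored pattern from the answer above: the whole input must be
// three dot-separated runs of base64url characters (A-Z, a-z, 0-9, "-", "_").
const jwtShape = /^[A-Za-z0-9_-]*\.[A-Za-z0-9_-]*\.[A-Za-z0-9_-]*$/;

// Hypothetical helper: true when the entire string has the JWT shape.
function looksLikeJwt(candidate) {
  return jwtShape.test(candidate);
}

const token = "aaa.bbb.ccc";                       // made-up stand-in for a token
const embedded = '"contextJwt": "' + token + '"';  // token inside a larger string

console.log(looksLikeJwt(token));    // true
console.log(looksLikeJwt(embedded)); // false: quotes, colon and space are outside the class
```

Because every segment uses *, a degenerate string such as `..` also passes; switching the first two segments to + would reject it.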

3
votes

If you are working with an HTML document as a string and you are using JavaScript to run your regular expression, you could do something like the following:

const html = '<div>stuff</div>something "contextJwt": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJIZWxsbyB5b3UiLCJuYW1lIjoiV2h5IGFyZSB5b3UgY2hlY2tpbmcgbXkgdG9rZW4_ICggzaHCsCDNnMqWIM2hwrApIiwiaWF0IjoxNTE2MjM5MDIyfQ.yAP0xiTwp6vqIYbLKLVBRv-gTyMvU17rT3H8uErLjHA" <div> other stuff</div>';
var regex = /"contextJwt":\s*"(.*)"/;
console.log(html.match(regex)[1]);

/* yields the encoded JWT string:
 eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJIZWxsbyB5b3UiLCJuYW1lIjoiV2h5IGFyZSB5b3UgY2hlY2tpbmcgbXkgdG9rZW4_ICggzaHCsCDNnMqWIM2hwrApIiwiaWF0IjoxNTE2MjM5MDIyfQ.yAP0xiTwp6vqIYbLKLVBRv-gTyMvU17rT3H8uErLjHA

*/

You can tighten up your match from the simple (.*) to the specific characters that are allowed in a valid encoded JWT (per Helio Santo's answer), but since regexes are finicky, I usually start with the simplest solution and only tighten it down when necessary.
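
As a sketch of that tightening, the capture group below accepts only base64url characters in the three-part dotted layout, so it stops at the token's closing quote even if more double quotes appear later on the same line (a greedy (.*) would run on to the last one). The function name `extractContextJwt` and the sample HTML are hypothetical, and the helper returns null instead of throwing when nothing matches.

```javascript
// Capture only the three dot-separated base64url segments after "contextJwt":
// header and payload must be non-empty; the signature may be empty (unsecured tokens).
const contextJwtPattern =
  /"contextJwt":\s*"([A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*)"/;

// Hypothetical helper: returns the encoded token, or null if none is found.
function extractContextJwt(text) {
  const match = text.match(contextJwtPattern);
  return match ? match[1] : null;
}

// Made-up sample with another quoted attribute after the token.
const sample = '<p>"contextJwt": "aaa.bbb.ccc"</p><a href="/next">next</a>';

console.log(extractContextJwt(sample));        // aaa.bbb.ccc
console.log(extractContextJwt("<p>none</p>")); // null
```

If the page HTML-escapes the quotes (for example as `&quot;`), this pattern will not match as written and would need adjusting.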

What you do with the string that represents an encoded JWT is perhaps another question entirely.

0
votes

For the record (error checking omitted for brevity), here is an alternative that doesn't use a regex:

const html = `<section>
  <p>Your token is <code>"contextJwt": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJIZWxsbyB5b3UiLCJuYW1lIjoiV2h5IGFyZSB5b3UgY2hlY2tpbmcgbXkgdG9rZW4_ICggzaHCsCDNnMqWIM2hwrApIiwiaWF0IjoxNTE2MjM5MDIyfQ.yAP0xiTwp6vqIYbLKLVBRv-gTyMvU17rT3H8uErLjHA"</code>. Have a nice day.</p>
</section>`;
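// Parse the HTML string into a detached element (requires a browser DOM).
const fragment = document.createElement("div");
fragment.innerHTML = html;
// textContent returns the element's text without HTML entity escaping.
const input = fragment.querySelector("section p code").textContent;
// Wrap the "key": "value" pair in braces so it parses as a JSON object.
const output = JSON.parse("{" + input + "}");
console.log(output.contextJwt);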