5
votes

I'm trying to use CodeMirror simple mode to create my own editor and highlight some custom keywords. However, it's highlighting occurrences of these words inside other words. Here's my code to define the mode of the editor:

    CodeMirror.defineSimpleMode("simple", {
  // The start state contains the rules that are intially used
  start: [
    // The regex matches the token, the token property contains the type
    {regex: /["'](?:[^\\]|\\.)*?(?:["']|$)/, token: "string"},
    {regex: /;.*/, token: "comment"},
    {regex: /\/\*/, token: "comment", next: "comment"},

    {regex: /[-+\/*=<>!]+/, token: "operator"},
    {regex: /[\{\[\(]/, indent: true},
    {regex: /[\}\]\)]/, dedent: true},

    //Trying to define keywords here
    {regex: /\b(?:timer|counter|version)\b/gi, token: "keyword"} // gi for case insensitive
  ],
  // The multi-line comment state.
  comment: [
    {regex: /.*?\*\//, token: "comment", next: "start"},
    {regex: /.*/, token: "comment"}
  ],
  meta: {
    dontIndentStates: ["comment"],
    lineComment: ";"
  }
});

When I type in the editor, this is what gets highlighted. I would expect the first two occurrences to be styled, but not the second two.

enter image description here

It's obviously something incorrect with this regular expression:

/\b(?:timer|counter|version)\b/gi

But I've tried it several different ways and the same pattern works correctly in other regex testers. Example: https://regex101.com/r/lQ0lL8/33 . Any advice?

Edit #1:

Tried this pattern in codemirror definition, dropping the /g but it still yields the same incorrect highlighting.

{regex: /\b(?:timer|counter|version)\b/i, token: "keyword"}
1
You should drop the /g modifier: /\b(?:timer|counter|version)\b/i. I don't know if it's the cause of your problem, but it definitely isn't needed. Otherwise, the regex looks fine.Alan Moore
@AlanMoore Thanks, I did try that but still got the same result. Removing the /gmodifier limited my matches here though.colinwurtz
What does it do with the word timerNO? That is, does the \b at the end work?Alan Moore
@AlanMoore this pattern {regex: /\b(?:timer|counter|version)\b/i, token: "keyword"} does not highlight timerNO. Does it seem like it's not respecting the /b at the beginning?colinwurtz
I suspect it's treating the beginning of the match as the beginning of the string. If that's the case, then a regex like /\b!bar/ won't match anywhere, even in foo!bar.Alan Moore

1 Answers

6
votes

I ended up just defining my own mode from scratch and the additional customization seems to have worked. I parse the stream by word, convert to lowercase, then check if it's in my list of keywords. Using this approach it seems very straightforward to add additional styles and keywords.

var keywords = ["timer", "counter", "version"];

CodeMirror.defineMode("mymode", function() {

  return {
    token: function(stream, state) {
      stream.eatWhile(/\w/);

      if (arrayContains(stream.current(), keywords)) {
        return "style1";
      }
      stream.next();
    }
  };

});


var editor = CodeMirror.fromTextArea(document.getElementById('cm'), {
  mode: "mymode",
  lineNumbers: true
});

function arrayContains(needle, arrhaystack) {
  var lower = needle.toLowerCase();
  return (arrhaystack.indexOf(lower) > -1);
}

Working Fiddle