5
votes

I've been spending too much time lately trying to debug some auto-complete-mode functionality in Emacs. The following call appears to be non-deterministic and has left me utterly confused.

 (re-search-backward "\\(\\sw\\|\\s_\\|\\s\\.\\|\\s\\\\|[#@|]\\)\\=")

The command is called in a while loop, searching backwards from the current point to find the full "word" that should be autocompleted. For reference, the actual code.

A bit of background and my investigations

I have been trying to set up autocompletion for JavaScript, using Slime to connect to a Node.js backend.

Autocomplete inside a Slime REPL connected to a Node.js backend works perfectly.


Autocomplete inside a js2-mode buffer, connected to Slime, fails to look up completions from Slime and falls back to the words already in the buffer.


I've tracked this down to Slime's slime-beginning-of-symbol function.

Assume that I'm trying to complete fs.ch, where fs has already been required and is in scope, and point is located just after the h character.

In the Slime REPL buffer the beginning function moves point all the way back until it hits whitespace, and matches fs.ch.

In the js2-mode buffer the beginning function moves the point only to the dot character and matches only ch.

Reproducing the problem

I've been testing this by evaling (re-search-backward "\\(\\sw\\|\\s_\\|\\s\\.\\|\\s\\\\|[#@|]\\)\\=") repeatedly in various buffers. For all examples, the point starts at the end of the line and moves backwards until the search fails.

  • In the scratch buffer fs.ch the point ends on the c.
  • In the slime repl fs.ch the point ends on the f.
  • In the js2-mode buffer fs.ch the point ends on the c.
  • In an emacs-lisp-mode buffer fs.ch the point ends on the f.

I have no idea why this is happening

I'm going to assume that there's something in these modes that either sets or unsets a global regex var that then has this effect, but so far I've been unable to find or implicate anything.

I even tracked this down to the emacs c code, but at that point realised that I was in completely over my head and decided to ask for help.

Help?

2
\sCODE in an Emacs regexp matches any character whose syntax is CODE. Syntax tables can and will vary between buffers (the syntax table is typically established by the major mode). See C-h i g (elisp) Regexp Backslash RET - phils
I think your problem is that . has a different syntax in these buffers. You could check with M-x describe-syntax in the respective buffer. Maybe you need modify-syntax-entry to correct this, or maybe you need a temporary syntax table, created with make-syntax-table, which inherits from the standard one for the major mode. - Tobias
That's exactly it. I raised the question in #emacs and they said the same. I didn't know about syntax tables before now. - Dan Midwood

2 Answers

1
votes

You should replace \\s\\. with \\s. in your regexp.
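
To see why this matters, it may help to compare the two patterns directly. The following is only a sketch: it relies on my reading that the regexp \s\. means "a character with escape syntax, followed by any character", whereas \s. means "a character with punctuation syntax". It uses a temporary buffer, which I assume has the standard syntax table where the dot is punctuation.

```elisp
;; Sketch: compare the original and the corrected pattern on "fs.ch".
;; Assumption: the temp buffer's syntax table classifies "." as punctuation.
(let ((old "\\s\\.")   ; regexp \s\.  = escape-syntax char, then any char
      (new "\\s."))    ; regexp \s.   = one punctuation-syntax char
  (with-temp-buffer
    (insert "fs.ch")
    (goto-char 4)                          ; point is just after the dot
    (list (looking-back old (point-min))   ; does the old pattern end here?
          (looking-back new (point-min))))) ; does the new pattern end here?
```

Under those assumptions the first test should fail and the second should succeed, which would explain why the backward search stops at the dot with the original pattern.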

0
votes

I "fixed" the problem by redefining the source that gets added to auto-complete's ac-sources.

I'm still learning my way around elisp so this is likely the most hack-like way of achieving what I need, but it works.

I changed the regex from:

\\(\\sw\\|\\s_\\|\\s\\.\\|\\s\\\\|[#@|]\\)\\=

to

\\(\\sw\\|\\s_\\|\\s.\\|\\s\\\\|[#@|]\\)\\=

(note the change from \\s\\. to \\s.).

And then overrode the auto-complete setup in my init.el. (I'll probably find a hundred ways to refine this when I actually know elisp).

(defun js-slime-beginning-of-symbol ()
  "Move to the beginning of the CL-style symbol at point."
  (while (re-search-backward "\\(\\sw\\|\\s_\\|\\s.\\|\\s\\\\|[#@|]\\)\\="
                             (when (> (point) 2000) (- (point) 2000))
                             t))
  (re-search-forward "\\=#[-+.<|]" nil t)
  (when (and (looking-at "@") (eq (char-before) ?\,))
    (forward-char)))

(defun js-slime-symbol-start-pos ()
  "Return the starting position of the symbol under point.
The result is unspecified if there isn't a symbol under the point."
  (save-excursion (js-slime-beginning-of-symbol) (point)))

(defvar ac-js-source-slime-simple
  '((init . ac-slime-init)
    (candidates . ac-source-slime-simple-candidates)
    (candidate-face . ac-slime-menu-face)
    (selection-face . ac-slime-selection-face)
    (prefix . js-slime-symbol-start-pos)
    (symbol . "l")
    (document . ac-slime-documentation)
    (match . ac-source-slime-case-correcting-completions))
  "Source for slime completion.")

(defun set-up-slime-js-ac (&optional fuzzy)
  "Add an optionally-fuzzy slime completion source to `ac-sources'."
  (interactive)
  (add-to-list 'ac-sources ac-js-source-slime-simple))

In response to my own question about regex global state: there is a lot of it.

Emacs regexes use the syntax table defined by the major mode to decide which characters a syntax class such as \\s_ or \\s. matches. The dot was matched in the Lisp modes but not in js2-mode because their syntax tables classify it differently: in the Lisp modes '.' is a symbol constituent, while in js2-mode '.' is punctuation.
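
A quick way to check this in any buffer is to ask for the syntax class of the dot. This is a minimal sketch; my/describe-dot-syntax is a hypothetical helper name, not something provided by Slime or auto-complete.

```elisp
;; Hypothetical helper: report how the current buffer classifies ".".
;; `char-syntax' returns the class designator character, e.g. ?_ for a
;; symbol constituent, ?. for punctuation, ?w for a word constituent.
(defun my/describe-dot-syntax ()
  "Show the syntax class of the dot character in the current buffer."
  (interactive)
  (message "Syntax class of dot in %s: %c"
           major-mode
           (char-syntax ?.)))
```

Run it with M-x my/describe-dot-syntax in the js2-mode buffer and again in the Slime REPL. If the explanation above is right, the two buffers should report different classes.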

As a consequence, an alternative way to fix the problem is to redefine the syntax of . in js2-mode. I tried this out and redefined . as a word constituent with (modify-syntax-entry ?. "w"). However, I decided not to keep that change because it will probably break something down the line.
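
A less invasive variant, along the lines of what Tobias suggests in the comments, might be to change the dot's syntax only for the duration of the search. The following is a sketch, not something I have run against Slime: it uses a simplified regexp and a temporary buffer so that it is self-contained, and it assumes the temporary buffer starts with the dot as punctuation.

```elisp
;; Sketch: search with a temporary syntax table in which "." is a symbol
;; constituent, leaving the buffer's real syntax table untouched.
(with-temp-buffer
  (insert "fs.ch")                         ; point is now after the "h"
  (let ((table (make-syntax-table (syntax-table)))) ; inherits from current
    (modify-syntax-entry ?. "_" table)     ; "." becomes a symbol constituent
    (with-syntax-table table
      ;; Step back one word/symbol character at a time until no match.
      (while (re-search-backward "\\(\\sw\\|\\s_\\)\\=" nil t))
      (point))))                           ; position where the search stopped
```

With the dot treated as a symbol constituent, the loop should walk back over the whole of fs.ch rather than stopping at the dot. The same with-syntax-table wrapper could presumably be put around a call to slime-beginning-of-symbol instead of copying its regexp.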

Also, I have to thank the people in #emacs. They really helped me out on this, teaching me about syntax tables and the horrors of elisp regex globals.