16
votes

I would like to be able to navigate by sentence in Emacs (M-a, M-e). Here's the problem: by default, Emacs expects that each sentence is separated by two spaces, and I'm used to just putting a single space. Of course, that setting can be turned off, to allow for sentences separated by only a single space, like so:

(setq sentence-end-double-space nil)   

But then Emacs thinks that a sentence has ended after abbreviations with a full stop ("."), e.g. after something like "...a weird command, e.g. foo...".

So rather than using the above code, is there a way to define the sentence-end variable so that it counts [.!?] as marking the end of the sentence, iff what follows is one or more spaces followed by a capital letter [A-Z]?

And...to also allow [.!?] to mark the end of a sentence, if followed by zero or more spaces followed by a "\"? [The reason for this latter condition is for writing LaTeX code: where a sentence is followed by a LaTeX command like \footnote{}, e.g. "...and so we can see that the point is proved.\footnote{In some alternate world, at least.}"]

I tried playing around with the definition of sentence-end, and came up with:

(setq sentence-end "[.!?][]'\")}]*\\(\\$\\|[ ]+[A-Z]\\|[ ]+[A-Z]\\| \\)[
 ;]*")

But this doesn't seem to work at all.

Any suggestions?

Dr. Shivago. E.g. Canada. St. George. - Svante
Ah, right. Hadn't thought of that. There's no way round that one, I suppose, except actually putting two spaces between sentences to differentiate. But in the majority of what I write, I don't use titles like "Dr." or "St.", and I'm not usually listing proper names after "e.g.". So it would still work 99% of the time for me, if I could figure out how to define it. [Plus, I'm still curious how to allow for backslashed LaTeX commands not to disrupt the sentence-end.] - emacsomancer
@Svante! Not everyone spells their names the way YOU do. It's Zhivago. Jeez. - Cheeso
BTW, does the existing sentence-end value in Emacs also suit modern Indian languages or Sanskrit the way it is usually typed in Devanagari? If not, can we suggest an improvement? - imz -- Ivan Zakharyaschev
@imz: Devanagari traditionally uses straight-vertical-line-like marks for punctuation: a single | is often vaguely like a comma or semicolon, and a double || is a full stop (sometimes only | is used, and then like a full stop). However, modern Hindi etc., even when written in Devanagari, would often use ',' and '.' for punctuation. For Emacs, the modern way would work with the current definition (as long as sentences are separated with two spaces); the traditional symbols would require some additions to the current Emacs definition. - emacsomancer

1 Answer

4
votes

I don't think sentence-end will do what you need it to do. You really need look-ahead regexps for this, and Emacs doesn't support them.

You can roll your own function to do what you need though. I don't understand all of your requirements, but the following is a start:

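(defun my-next-sentence ()
  "Move point forward to the next sentence.
Start by moving to the next period, question mark or exclamation mark.
If this punctuation is followed by one or more whitespace
characters followed by a capital letter, or by a '\\', stop there.
If not, assume we're at an abbreviation of some sort and move to
the next potential sentence end."
  (interactive)
  ;; With `case-fold-search' non-nil (the default), [A-Z] would also
  ;; match lowercase letters, so bind it to nil for this search.
  (let ((case-fold-search nil))
    ;; NOERROR t: return nil quietly when no punctuation is left.
    (when (re-search-forward "[.?!]" nil t)
      (unless (looking-at "[ \t\n]+[A-Z]\\|\\\\")
        (my-next-sentence)))))

(defun my-last-sentence ()
  "Move point backward to just after the previous sentence-ending punctuation."
  (interactive)
  (let ((case-fold-search nil))
    ;; Only step over the punctuation if a match was actually found.
    (when (re-search-backward "[.?!][ \t\n]+[A-Z]\\|\\.\\\\" nil t)
      (forward-char))))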

Most of your tweaking will need to focus on the looking-at regexp, to make sure it hits all the potential end-of-sentence conditions you need. It would be relatively easy to modify it to move the cursor to particular locations based on what it finds: leave it be if it's a normal sentence, move past the next { if you're at a LaTeX command, or whatever suits you.
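As a rough illustration of that kind of modification, here is a sketch of a variant that stops after normal sentence ends but jumps past the next { when the punctuation is followed by a backslash. The function name my-next-sentence-skip-latex is made up for this example, and treating "zero or more whitespace characters, then a backslash" as the LaTeX case is an assumption based on the question, so adjust the regexps to taste:

```elisp
(defun my-next-sentence-skip-latex ()
  "Move forward to the next sentence end, skipping into a LaTeX command.
Hypothetical variant for illustration only.  Returns non-nil if a
sentence end was found."
  (interactive)
  (let ((case-fold-search nil)   ; make [A-Z] match capitals only
        (found nil))
    (while (and (not found)
                (re-search-forward "[.?!]" nil t))
      (cond
       ;; Normal sentence end: whitespace, then a capital letter.
       ((looking-at "[ \t\n]+[A-Z]")
        (setq found t))
       ;; LaTeX command such as \footnote{...}: move just past the
       ;; next opening brace, if there is one.
       ((looking-at "[ \t\n]*\\\\")
        (search-forward "{" nil t)
        (setq found t))))
    found))
```

Anything that matches neither branch is treated as an abbreviation and the loop simply carries on to the next punctuation mark.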

Once you've got that working, you can bind the functions to M-a and M-e, probably using mode hooks unless you want to use them in every mode.
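A minimal sketch of the mode-hook approach, assuming you only want the bindings in text-mode and modes derived from it (the helper name my-sentence-keys is invented for this example; swap in whichever hook fits the modes you write in):

```elisp
(defun my-sentence-keys ()
  "Bind M-a and M-e to the custom sentence commands.
`local-set-key' changes the current major mode's local keymap."
  (local-set-key (kbd "M-a") #'my-last-sentence)
  (local-set-key (kbd "M-e") #'my-next-sentence))

;; Run the helper whenever text-mode (or a mode derived from it) starts.
(add-hook 'text-mode-hook #'my-sentence-keys)
```

If you would rather have the commands everywhere, two global-set-key calls with the same keys and functions would do that instead of the hook.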