
I'm trying to construct a regular expression that matches symbols (function names and the like) according to the current major mode's settings (the syntax table, if I remember correctly). After some investigation, I found these useful constructs:

  • \_< matches beginning of a symbol

  • \_> matches end of a symbol

Now I need to find out how to represent a symbol-constituent character. We have \w for word-constituent characters, but I cannot find anything equivalent for symbols. A regexp that matches symbols should look something like this (assuming that \s matches symbol-constituent characters):

\_<\s+\_>

Am I missing something? How do I match symbols?


Note that matching on words does not work for me. A trivial example is foo-bar, which is a symbol in Emacs Lisp mode but not a word (because - is not a word-constituent character).
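To make the foo-bar problem concrete, here is a minimal sketch that runs a word-based regexp under the Emacs Lisp syntax table. The helper name my/first-match is made up for illustration; it is not a built-in.

```elisp
;; Hypothetical helper (not part of Emacs): return the first match of
;; REGEXP in TEXT, using the Emacs Lisp syntax table for \w, \_< etc.
(defun my/first-match (regexp text)
  "Return the first match of REGEXP in TEXT under Emacs Lisp syntax, or nil."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (with-syntax-table emacs-lisp-mode-syntax-table
      (when (re-search-forward regexp nil t)
        (match-string 0)))))

;; The word-based regexp stops at the hyphen, because `-' has symbol
;; syntax rather than word syntax in Emacs Lisp.
(my/first-match "\\w+" "foo-bar")   ; => "foo"
```

So \w+ only ever sees foo and bar separately, never the whole symbol.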

Mark Karpov
  • I typically use (re-search-forward "\\_<\\(?:\\sw\\|\\s_\\)+\\_>" nil t). It works, but I wonder myself if there's a shorter way. – abo-abo Mar 26 '16 at 13:02
  • abo-abo: Ugly as it is, AFAIK that is what you need to do, given symbols can contain both symbol-constituent characters and word-constituent characters. You should make it an answer. Mark, you'll want to check the manual to see what \s actually means, as it's entirely different to your assumption. – phils Mar 26 '16 at 13:53
  • @phils, I didn't even know that \s has any meaning in Emacs regexps, I just picked that symbol for example. – Mark Karpov Mar 26 '16 at 13:56
  • @Mark: See the Elisp manual, node Regexp Backslash. – Drew Mar 26 '16 at 17:09

1 Answer


What is wrong with \_<.*?\_> ?
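As a rough sketch of how that regexp behaves, the snippet below collects every match in a small piece of Lisp text under the Emacs Lisp syntax table. The helper my/collect-matches is a made-up name for illustration, and the sample text is an assumption.

```elisp
;; Hypothetical helper (not part of Emacs): collect all matches of
;; REGEXP in TEXT, using the Emacs Lisp syntax table.
(defun my/collect-matches (regexp text)
  "Return a list of all matches of REGEXP in TEXT under Emacs Lisp syntax."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (with-syntax-table emacs-lisp-mode-syntax-table
      (let (matches)
        (while (re-search-forward regexp nil t)
          (push (match-string 0) matches))
        (nreverse matches)))))

;; \_< anchors at a symbol start, and the non-greedy .*? grows only
;; until the first position where \_> (symbol end) can match.
(my/collect-matches "\\_<.*?\\_>" "(foo-bar baz)")   ; => ("foo-bar" "baz")
```

Note that . does not match a newline in Emacs regexps, which should not matter here as long as symbols do not span lines.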