Syntax propertizing an exception to Python comment rule

Question

I want to write a mode for the Cadabra 2 computer algebra system. Its syntax is Python-like, with one exception: the hash character # serves as a wildcard in expressions, in addition to being a comment starter as in regular Python. Consider the following code:

\dalembert{ A?? }::LaTeXForm(\Box A??).
{ \mu, \nu }::Indices(vector).
\dalembert{#}::Derivative.
\partial{#}::PartialDerivative.
{ A_{\mu} }::Depends(\partial{#}).
# QED Lagrangian
lagrangian:= (a/2) A_{\mu} \dalembert{ A_{\mu} }
    + (b/2) A_{\mu} \partial_{\nu}{\partial_{\mu}{ A_{\nu} }}
    + (1/2) m*2 A_{\mu}  A^{\mu};

Here, the # QED Lagrangian line is a comment, but #}::Derivative. is not. I want Emacs to interpret # as a comment delimiter only if it is not enclosed in curly or regular braces, and I am struggling to implement that. I believe it should be done with the syntax propertization function:

;;;###autoload
(define-derived-mode my-cadabra2-mode python-mode "cadabra2 mode"
  "Major mode for Cadabra 2 computer algebra system"
  (setq-local font-lock-defaults '((my-cadabra2-font-lock-keywords)))
  (setq-local syntax-propertize-function
              (syntax-propertize-rules ("\\(?<!{\s*\\)#\\(?!\s*}\\)" (1 "< ")))))

I believe my regular expression is correct: it should match # if it is not preceded by { and not followed by }, with any whitespace in between. However, Emacs still treats every hash as a comment starter.

How do I implement this properly?

Well, just trying to eval your function in my *scratch* buffer gives an (invalid-regexp) in your argument to syntax-propertize-rules. — nega, May 14 '22 at 18:42
Looks like you've got a negative-lookbehind and a negative-lookahead, both of which aren't supported. Also it looks like you're trying to use \s to match whitespace. You want \s- instead. See https://www.emacswiki.org/emacs/RegularExpression for a good primer on emacs regexes. — nega, May 14 '22 at 19:23
@nega I'm trying to make this work with the simplest example. I tried (rx (and "#" (not "}"))), which evaluates just fine and should catch the non-comments in the provided example, but it still does not work when I load a file with the mode. — Andrii Kozytskyi, May 17 '22 at 09:57

score 1 · Accepted Answer · answered May 17 '22 at 16:11

Based on your recent comment, it is working, except that you're not overriding the comment character; you're adding properties to it. Effectively, your (define-derived-mode ...) is saying "add this additional property to the comment character" and not "set the comment character to this property". The syntax table inherited from python-mode still marks every # as a comment starter, so any # your rule does not touch remains one.

Additionally, you want to use 0 in your rule. The integer there selects which sub-group of the rule's regex receives the property, with 0 being "the whole match".
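As a minimal sketch of how that group number behaves (the variable name my-cadabra2--hash-rules is hypothetical, and the regex is an assumption for illustration, not a tested Cadabra grammar): wrapping only the hash in an explicit group and using 1 restricts the property to that single character, whereas 0 would also cover whatever character follows it.

```elisp
;; Sketch only; `my-cadabra2--hash-rules' is a hypothetical name.
(defconst my-cadabra2--hash-rules
  (syntax-propertize-rules
   ;; \\(#\\) is group 1: the hash itself.
   ;; [^}] must also match, but it lies outside the group, so the
   ;; character after the hash keeps its normal syntax.
   ;; A hash directly followed by } does not match at all and stays
   ;; punctuation, assuming # has punctuation syntax in the mode's
   ;; syntax table.
   ;; Limitation: a hash that is the very last character of the
   ;; buffer is not matched either.
   ("\\(#\\)[^}]" (1 "<")))
  "Function suitable as a value for `syntax-propertize-function'.")
```

It could then be installed in the mode body with (setq-local syntax-propertize-function my-cadabra2--hash-rules).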

With your (slightly modified) sample text

# -*- mode: my-cadabra2 -*-
# bol comment
\dalembert{ A?? }::LaTeXForm(\Box A??). # inline comment
{ \mu, \nu }::Indices(vector).
\dalembert{#}::Derivative.
\partial{#}::PartialDerivative.
{ A_{\mu} }::Depends(\partial{#}).
# QED Lagrangian
lagrangian:= (a/2) A_{\mu} \dalembert{ A_{\mu} }
    + (b/2) A_{\mu} \partial_{\nu}{\partial_{\mu}{ A_{\nu} }}
    + (1/2) m*2 A_{\mu}  A^{\mu};

And this version of your define-derived-mode (note the modify-syntax-entry line)

;;;###autoload
(define-derived-mode my-cadabra2-mode python-mode "cadabra2 mode"
  "Major mode for Cadabra 2 computer algebra system"
  (setq-local font-lock-defaults '(python-font-lock-keywords))
  (modify-syntax-entry ?# ".") ;; set # to be punctuation
  (setq-local syntax-propertize-function
              (syntax-propertize-rules ((rx (and "#" (not "}"))) (0 "<")))))

I get:

Is there a function to force drawing the propertized syntax? It takes a while for Emacs to start displaying normal comments properly after startup now — Andrii Kozytskyi, May 17 '22 at 21:36
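
One possible way to force this by hand, sketched under the assumption of Emacs 25 or later. The command name my-cadabra2-refresh-syntax is hypothetical, and whether it removes the startup delay described above is untested:

```elisp
(defun my-cadabra2-refresh-syntax ()
  "Reapply syntax properties and refontify the current buffer.
Hypothetical helper, not part of any package."
  (interactive)
  ;; Forget cached parse state and mark the buffer as not yet propertized.
  (syntax-ppss-flush-cache (point-min))
  ;; Run `syntax-propertize-function' over the whole buffer now.
  (syntax-propertize (point-max))
  ;; Ask font-lock to redo highlighting using the new properties.
  (font-lock-flush))
```

Calling it with M-x my-cadabra2-refresh-syntax after the file loads should show whether the delay comes from lazy propertization or from something else.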
