Algorithms to match regular expressions containing backreferences

Asked Aug 05 '17 at 00:45

Active Aug 08 '17 at 17:46

Viewed 137 times

I'm trying to come up with an implementation of a matcher for regular expressions containing backreferences like:

([a-c])x\1 which would match axa, bxb and cxc but nothing else.
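
To make the expected behaviour concrete, here is a minimal check using Python's built-in re module. It is only an illustration of what the pattern should accept, not the matcher I'm trying to write; the helper name `matches` and the list of test strings are my own assumptions.

```python
import re

# The pattern from the question: capture one character from a-c, then a
# literal x, then a backreference that must repeat the captured character.
PATTERN = re.compile(r"([a-c])x\1")


def matches(s: str) -> bool:
    """Return True if the whole string s matches the pattern."""
    return PATTERN.fullmatch(s) is not None


# Candidates chosen for illustration only.
candidates = ["axa", "bxb", "cxc", "axb", "dxd", "axaa"]
accepted = [s for s in candidates if matches(s)]
print(accepted)  # ['axa', 'bxb', 'cxc']
```

Strings such as axb are rejected because the backreference has to equal the text captured by the group, not merely match the group's character class again.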

While I've seen a number of posts on the theory of which class of languages this type of regex describes, I haven't managed to find concrete implementation details beyond the fact that context-sensitive languages are recognized by linear bounded automata.

Could you point out some resources about the implementation of such matchers?

UPDATE: Currently this is my best reference: "Extending Finite Automata to Efficiently Match Perl-Compatible Regular Expressions"

PS: These regex features are supported by the standard libraries of most modern programming languages (Perl, Java, C#, etc.), but I wouldn't start there, since I believe those implementations are quite terse.

edited Aug 08 '17 at 17:46

asked Aug 05 '17 at 00:45

Radu Stoenescu


Implementation is off-topic here. Algorithms are on-topic. – Yuval Filmus Aug 05 '17 at 08:00
Closely related question. – Raphael Aug 08 '17 at 18:00
