This question asks which programming languages have a syntax that cannot be described by deterministic context-free grammars; the answer given there is "Many [...] including Algol 60, C, and C++".

Until recently, I thought Lua was an example of a language that could be described by a deterministic context-free grammar. Looking at The Complete Syntax of Lua in the language reference reinforced this belief. Unfortunately, it turns out that this grammar has a well-known ambiguity ("function call x new statement"), and that Lua must be parsed using a "greedy" algorithm, not a deterministic one.
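To make the ambiguity concrete, here is a minimal Python sketch. It assumes a toy fragment of the grammar (names, `=`, parentheses, and call suffixes only), and `parse_block_greedy` is a hypothetical helper written for illustration, not Lua's actual parser. The token sequence `a = f ( g ) ( h )` can be read as one statement or as two, and the greedy rule always picks the first reading.

```python
def parse_block_greedy(tokens):
    """Split a token list into statements for a toy grammar (an assumption,
    not the full Lua grammar):
        stat   ::= Name '=' exp | exp
        exp    ::= prefix { '(' exp ')' }
        prefix ::= Name | '(' exp ')'
    """
    pos = 0
    stats = []

    def parse_exp():
        nonlocal pos
        # prefix ::= Name | '(' exp ')'
        if tokens[pos] == "(":
            pos += 1
            inner = parse_exp()
            assert tokens[pos] == ")", "expected ')'"
            pos += 1
            node = f"({inner})"
        else:
            node = tokens[pos]
            pos += 1
        # Greedy step: whenever '(' follows, treat it as a call suffix
        # instead of the start of a new statement.
        while pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            arg = parse_exp()
            assert tokens[pos] == ")", "expected ')'"
            pos += 1
            node = f"{node}({arg})"
        return node

    while pos < len(tokens):
        if pos + 1 < len(tokens) and tokens[pos + 1] == "=":
            name = tokens[pos]
            pos += 2
            stats.append(f"{name} = {parse_exp()}")
        else:
            # The toy parser does not check that a bare expression is a call.
            stats.append(parse_exp())
    return stats


tokens = ["a", "=", "f", "(", "g", ")", "(", "h", ")"]

# Greedy reading: a single statement.
one_statement = parse_block_greedy(tokens)

# The other grammatical reading, obtained by forcing a split after `f`.
two_statements = parse_block_greedy(tokens[:3]) + parse_block_greedy(tokens[3:])
```

Both results are derivable from the same toy grammar, which is what makes it ambiguous; the `while` loop is the only thing that settles the choice.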

Which programming languages have a syntax that can be described by deterministic context-free grammars?

Andrej Bauer
user200783
  • I don't know, but I strongly suspect that most of the Wirth languages (e.g. Pascal and the Modula family) have unambiguous grammars, because they are operator grammars. Algol is a special case because it predates a lot of parsing theory; it doesn't even have a finite number of grammar productions and it was eventually formalised with a grammar which generates the grammar (known as a van Wijngaarden grammar). Note that any language with user-defined operators can't be described by a fixed grammar, let alone a deterministic one. – Pseudonym Jul 18 '16 at 06:52
  • The question makes sense either way, because the languages mentioned in the question are statically ambiguous, and can only be disambiguated using semantic information. A famous example from C (and C++) is (A)*B, which can be interpreted as either multiplying A by B or casting *B to type A, depending on whether A is the name of a variable or a type. Well, I don't know about Lua... – Pseudonym Jul 18 '16 at 08:32
  • @Pseudonym - When you say the Wirth languages are "operator grammars", is this the same thing as operator-precedence grammars? – user200783 Jul 18 '16 at 09:26
  • No. A programming language has an operator grammar if any two identifiers are separated by at least one operator or keyword. C and C++ do not have operator grammars because you can say, for example, "my_type foo();", where "my_type" and "foo" are both identifiers. In fact, this very example is an ambiguity in C++: it could be declaring a function or could be calling a constructor. – Pseudonym Jul 18 '16 at 12:51
  • Surely Lisp's S-expressions are deterministic context-free? – Joey Eremondi Jul 18 '16 at 18:58
  • The Van Wijngaarden grammar for Algol 68 also specifies that variables must be declared before use and the like. Do you consider this part of syntax? – reinierpost Jul 18 '16 at 19:02
  • @Pseudonym, you confuse Algol 60 (more or less contemporaneous with the development of context-free grammars and parsing theory, and probably deterministic with its large alphabet, which includes all sorts of strange symbols and what we today call reserved words) with the go-for-broke Algol 68, where you have to custom-build a grammar for each program. Wirth languages were carefully designed for recursive descent parsing, thus at least almost deterministic. – vonbrand May 02 '20 at 19:45
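
The S-expression suggestion in the comments can be illustrated with a short Python sketch. It assumes a stripped-down notation with only atoms and parenthesised lists (no strings, quote forms, or reader macros), and `tokenize` and `parse_sexpr` are hypothetical names chosen for this example. The point is that the parser picks its next action from the current token alone and never backtracks, which is the behaviour expected of a deterministic grammar.

```python
def tokenize(text):
    """Split text into '(' , ')' and atom tokens (whitespace-separated)."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse_sexpr(tokens, pos=0):
    """Parse one S-expression starting at tokens[pos].

    Returns (tree, next_pos). Every decision depends only on the
    current token, so there is no backtracking.
    """
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        items = []
        pos += 1
        # Keep reading elements until the matching ')'.
        while pos < len(tokens) and tokens[pos] != ")":
            item, pos = parse_sexpr(tokens, pos)
            items.append(item)
        if pos >= len(tokens):
            raise SyntaxError("missing ')'")
        return items, pos + 1
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    # Anything else is an atom.
    return tok, pos + 1


tree, end = parse_sexpr(tokenize("(define (sq x) (* x x))"))
# tree is a nested Python list: ['define', ['sq', 'x'], ['*', 'x', 'x']]
```

A full Lisp reader does more than this, so the sketch supports the claim only for the bare list notation.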

0 Answers