Correct name for a recursive descent parser that uses loops to handle left recursion?

Question

This grammar is left recursive:

Expression  ::= AdditionExpression

AdditionExpression  ::=
    MultiplicationExpression
        | AdditionExpression '+' MultiplicationExpression
        | AdditionExpression '-' MultiplicationExpression

MultiplicationExpression    ::=
    Term
        | MultiplicationExpression '*' Term
        | MultiplicationExpression '/' Term

Term    ::=
    Number
        | '(' AdditionExpression ')'

Number  ::=
    [+-]?[0-9]+(\.[0-9]+)?

So in theory, a naive recursive descent parser won't work: the function for a left-recursive rule would call itself before consuming any input and never terminate. But each left-recursive rule here corresponds to a specific precedence level, and a single token of lookahead is enough to choose the correct production, so each left-recursive rule can be parsed with a while loop.

For example, to parse the AdditionExpression non-terminal, this pseudocode suffices:

function parse_addition_expression() {
    num = parse_multiplication_expression()
    while (has_token()) {
        get_token()
        if (current_token == PLUS)
            num += parse_multiplication_expression()
        else if (current_token == MINUS)
            num -= parse_multiplication_expression()
        else {
            unget_token()
            return num
        }
    }
    return num
}
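A runnable version of this idea for the whole grammar might look like the following minimal Python sketch. The names `Parser`, `peek`, `advance` and `evaluate` are illustrative, not taken from any library. Two assumptions differ from the pseudocode: the parser peeks at the next token instead of calling get_token()/unget_token(), and the optional sign of Number is handled in `parse_term` rather than in the tokenizer, so that "1-2" is read as a subtraction.

```python
import re


class Parser:
    """Loop-based recursive descent evaluator for the expression grammar."""

    def __init__(self, text):
        # Unsigned numbers, or any single non-whitespace character.
        self.tokens = re.findall(r"\d+(?:\.\d+)?|\S", text)
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it (None at end of input)."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        """Consume and return the next token (None at end of input)."""
        token = self.peek()
        self.pos += 1
        return token

    def parse_addition_expression(self):
        value = self.parse_multiplication_expression()
        # The loop replaces the left-recursive alternatives of the rule.
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.parse_multiplication_expression()
            else:
                value -= self.parse_multiplication_expression()
        return value

    def parse_multiplication_expression(self):
        value = self.parse_term()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.parse_term()
            else:
                value /= self.parse_term()
        return value

    def parse_term(self):
        token = self.advance()
        if token == "(":
            value = self.parse_addition_expression()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        sign = 1.0
        if token in ("+", "-"):  # optional sign in front of a Number
            sign = -1.0 if token == "-" else 1.0
            token = self.advance()
        if token is None or token[0] not in "0123456789":
            raise SyntaxError(f"expected a number, got {token!r}")
        return sign * float(token)


def evaluate(text):
    """Parse and evaluate a complete expression, rejecting trailing tokens."""
    parser = Parser(text)
    value = parser.parse_addition_expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

With this sketch, `evaluate("1 - 2 - 3")` yields -4.0, which shows that the loop keeps subtraction left-associative, as the left-recursive grammar intends.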

What is the correct name for this type of parser? This informative article only refers to it as the "Classic Solution": https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm

There must be a proper name for this type of parser.

For me it is not a kind of parser, it is just the application of left recursion removal combined with a recursive descent parser. See this question for a technique to remove left recursion. — AProgrammer, Apr 24 '17 at 18:16
I think you might be correct. It does resemble a run-time equivalent of the left-recursion removal algorithm. — user71015, Apr 24 '17 at 19:30
Please don't use the 'answer' box to post comments or other remarks. If you create an account, you'll retain access and be able to accept the answer that helped you most. If you entered an email and lost access, you can recover access. If you didn't enter an email address and don't have access to the browser/cookies you used to post the question, you're probably out of luck. No one else can accept the answer for you -- not even moderators. — D.W., Apr 24 '17 at 23:56

score 11 · Answer 1 · answered Apr 24 '17 at 20:17

It is just an LL(1) parser implemented with recursive descent.

Start with:

AdditionExpression  ::=
    MultiplicationExpression
        | AdditionExpression '+' MultiplicationExpression
        | AdditionExpression '-' MultiplicationExpression

apply left-recursion removal to get an LL(1) grammar:

AdditionExpression  ::= 
    MultiplicationExpression AdditionExpressionTail

AdditionExpressionTail ::=
    ε
        | '+' MultiplicationExpression AdditionExpressionTail
        | '-' MultiplicationExpression AdditionExpressionTail

write the corresponding functions:

function parse_AdditionExpression() {
    parse_MultiplicationExpression()
    parse_AdditionExpressionTail()
}

function parse_AdditionExpressionTail() {
    if (has_token()) {
        get_token()
        if (current_token == PLUS) {
            parse_MultiplicationExpression()
            parse_AdditionExpressionTail()
        } else if (current_token == MINUS) {
            parse_MultiplicationExpression()
            parse_AdditionExpressionTail()
        } else {
            unget_token()
        }
    }
}

remove tail recursion:

function parse_AdditionExpressionTail() {
    while (has_token()) {
        get_token()
        if (current_token == PLUS)
            parse_MultiplicationExpression()
        else if (current_token == MINUS)
            parse_MultiplicationExpression()
        else {
            unget_token()
            return
        }
    }
}

inline:

function parse_AdditionExpression() {
    parse_MultiplicationExpression()
    while (has_token()) {
        get_token()
        if (current_token == PLUS)
            parse_MultiplicationExpression()
        else if (current_token == MINUS)
            parse_MultiplicationExpression()
        else {
            unget_token()
            return
        }
    }
}

and you just have to add the semantic processing to get your function.
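One detail when adding semantic processing is easier to see in the intermediate AdditionExpressionTail form: the value of the left operand has to be passed into the tail function, otherwise the result comes out right-associative. The Python sketch below is only an illustration under simplifying assumptions: `parse_operand` is a hypothetical stand-in for parse_MultiplicationExpression that reads a single number, and the input is an already tokenized list such as `[10, "-", 4, "-", 3]`. It puts the recursive form and the loop form side by side.

```python
def parse_operand(tokens, pos):
    """Stand-in for parse_MultiplicationExpression: reads one number token."""
    if pos >= len(tokens) or not isinstance(tokens[pos], (int, float)):
        raise SyntaxError(f"expected a number at position {pos}")
    return tokens[pos], pos + 1


def parse_addition_tail(tokens, pos, left):
    """Tail rule: empty, or an operator, an operand and another tail.

    `left` carries the value accumulated so far, which keeps the
    operators left-associative.
    """
    if pos < len(tokens) and tokens[pos] in ("+", "-"):
        right, next_pos = parse_operand(tokens, pos + 1)
        combined = left + right if tokens[pos] == "+" else left - right
        return parse_addition_tail(tokens, next_pos, combined)
    return left, pos  # empty alternative: nothing consumed


def parse_addition_recursive(tokens, pos=0):
    """Grammar after left-recursion removal, written with a tail function."""
    left, pos = parse_operand(tokens, pos)
    return parse_addition_tail(tokens, pos, left)


def parse_addition_loop(tokens, pos=0):
    """Same parser after removing the tail call and inlining the tail."""
    value, pos = parse_operand(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        right, next_pos = parse_operand(tokens, pos + 1)
        value = value + right if tokens[pos] == "+" else value - right
        pos = next_pos
    return value, pos
```

Both functions return the computed value together with the position after the last consumed token. For `[10, "-", 4, "-", 3]` each returns `(3, 5)`, that is (10 - 4) - 3 rather than 10 - (4 - 3).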

score 5 · Answer 2 · answered Apr 24 '17 at 18:29


You want to look into LL($k$) parsing. The Wikipedia article is mostly useless, but LL($k$) is basically recursive descent with $k$ symbols of lookahead.

There is also LL($*$) which permits unbounded lookahead.

See here for a comprehensive overview on how powerful this class of parsers is.

Answered by Raphael.

I don't see how this is related. The code does not use more than one symbol of look-ahead. – AProgrammer Apr 24 '17 at 18:59
@AProgrammer So it's an LL(1) parser, or very closely related. – Raphael Apr 24 '17 at 19:53
It's an LL(1) parser. I expanded my comment into an answer. – AProgrammer Apr 24 '17 at 20:18
@AProgrammer I don't see how a second answer was needed. LL(1) is LL(k) for k=1 (isn't that obvious?). But well. – Raphael Apr 25 '17 at 07:42
LL(k) is not enough. For left recursion the parser might need arbitrarily big lookahead. – hugomg Jul 02 '23 at 00:39
