
This grammar is left recursive:

Expression  ::= AdditionExpression

AdditionExpression  ::=
    MultiplicationExpression
        | AdditionExpression '+' MultiplicationExpression
        | AdditionExpression '-' MultiplicationExpression

MultiplicationExpression    ::=
    Term
        | MultiplicationExpression '*' Term
        | MultiplicationExpression '/' Term

Term    ::=
    Number
        | '(' AdditionExpression ')'

Number  ::=
    [+-]?[0-9]+(\.[0-9]+)?

So in theory, recursive descent won't work: a function for a left-recursive rule would call itself before consuming any input. But the grammar has two useful properties: each left-recursive rule corresponds to a specific precedence level, and a single token of lookahead is enough to choose the correct production. Exploiting these, each left-recursive rule can be parsed individually with a while loop.

For example, to parse the AdditionExpression non-terminal, this pseudocode suffices:

function parse_addition_expression() {
    num = parse_multiplication_expression()
    while (has_token()) {
        get_token()
        if (current_token == PLUS)
            num += parse_multiplication_expression()
        else if (current_token == MINUS)
            num -= parse_multiplication_expression()
        else {
            unget_token()
            return num
        }
    }
    return num
}
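For reference, here is a minimal runnable sketch of the same idea in Python, covering every rule of the grammar above. It is only an illustration under a few assumptions: the `Parser` class, `tokenize`, and the method names are made up for this example, a `peek`/`advance` pair stands in for `get_token`/`unget_token`, and numbers are treated as unsigned (the optional sign in `Number` is not handled), so `+` and `-` are always operators.

```python
import re

# Assumption for this sketch: numbers are unsigned, so '+' and '-' are
# always operator tokens (the grammar's optional sign is not handled).
TOKEN_RE = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/()])")


def tokenize(text):
    """Split the input into number, operator, and parenthesis tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it, or None at the end."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        """Consume and return the next token."""
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def parse_expression(self):
        value = self.parse_addition()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return value

    def parse_addition(self):
        # One loop iteration per '+' or '-' at this precedence level.
        value = self.parse_multiplication()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.parse_multiplication()
            else:
                value -= self.parse_multiplication()
        return value

    def parse_multiplication(self):
        # Same pattern, one precedence level down.
        value = self.parse_term()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.parse_term()
            else:
                value /= self.parse_term()
        return value

    def parse_term(self):
        token = self.advance()
        if token == "(":
            value = self.parse_addition()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        try:
            return float(token)
        except ValueError:
            raise SyntaxError(f"expected a number, got {token!r}") from None
```

Because each loop folds its operands from left to right, `Parser("8 - 3 - 2").parse_expression()` evaluates as `(8 - 3) - 2`, which is the left-associative reading the left-recursive grammar describes.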

What is the correct name for this type of parser? This informative article only refers to it as the "Classic Solution": https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm

There must be a proper name for this type of parser.

Raphael
user71015
  • For me it is not a kind of parser, it is just the application of left recursion removal combined with a recursive descent parser. See this question for a technique to remove left recursion. – AProgrammer Apr 24 '17 at 18:16
  • I think you might be correct. It does resemble a run-time equivalent of the left-recursion removal algorithm. – user71015 Apr 24 '17 at 19:30
  • Please don't use the 'answer' box to post comments or other remarks. If you create an account, you'll retain access and be able to accept the answer that helped you most. If you entered an email and lost access, you can recover access. If you didn't enter an email address and don't have access to the browser/cookies you used to post the question, you're probably out of luck. No one else can accept the answer for you -- not even moderators. – D.W. Apr 24 '17 at 23:56

2 Answers


It is just an LL(1) parser implemented with recursive descent.

Start with:

AdditionExpression  ::=
    MultiplicationExpression
        | AdditionExpression '+' MultiplicationExpression
        | AdditionExpression '-' MultiplicationExpression

apply left-recursion removal to get an LL(1) grammar:

AdditionExpression  ::= 
    MultiplicationExpression AdditionExpressionTail

AdditionExpressionTail ::=
    ε
        | '+' MultiplicationExpression AdditionExpressionTail
        | '-' MultiplicationExpression AdditionExpressionTail

write the corresponding functions:

function parse_AdditionExpression() {
    parse_MultiplicationExpression()
    parse_AdditionExpressionTail()
}

function parse_AdditionExpressionTail() {
    if (has_token()) {
        get_token()
        if (current_token == PLUS) {
            parse_MultiplicationExpression()
            parse_AdditionExpressionTail()
        } else if (current_token == MINUS) {
            parse_MultiplicationExpression()
            parse_AdditionExpressionTail()
        } else {
            unget_token()
        }
    }
}

remove tail recursion:

function parse_AdditionExpressionTail() {
    while (has_token()) {
        get_token()
        if (current_token == PLUS)
            parse_MultiplicationExpression()
        else if (current_token == MINUS)
            parse_MultiplicationExpression()
        else {
            unget_token()
            return
        }
    }
}

inline:

function parse_AdditionExpression() {
    parse_MultiplicationExpression()
    while (has_token()) {
        get_token()
        if (current_token == PLUS)
            parse_MultiplicationExpression()
        else if (current_token == MINUS)
            parse_MultiplicationExpression()
        else {
            unget_token()
            return
        }
    }
}

and you just have to add the semantic processing to get your function.
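As a rough sketch of what adding the semantic processing can look like, the Python below evaluates the addition level in two ways: once following the LL(1) grammar directly, with the running value passed into the tail function, and once as the loop obtained after removing the tail call and inlining. The function names, the pre-tokenized list input, and `parse_operand` (a stand-in for `parse_MultiplicationExpression` that just reads one number) are assumptions made for this illustration.

```python
def parse_operand(tokens, pos):
    """Hypothetical stand-in for parse_MultiplicationExpression: read one number."""
    if pos >= len(tokens) or not isinstance(tokens[pos], (int, float)):
        raise SyntaxError(f"expected a number at token {pos}")
    return tokens[pos], pos + 1


def parse_addition_recursive(tokens, pos=0):
    """AdditionExpression ::= MultiplicationExpression AdditionExpressionTail"""
    left, pos = parse_operand(tokens, pos)
    return parse_addition_tail(tokens, pos, left)


def parse_addition_tail(tokens, pos, left):
    """AdditionExpressionTail, with the value computed so far passed in as `left`."""
    if pos < len(tokens) and tokens[pos] in ("+", "-"):
        operator = tokens[pos]
        right, pos = parse_operand(tokens, pos + 1)
        left = left + right if operator == "+" else left - right
        return parse_addition_tail(tokens, pos, left)  # tail call
    return left, pos  # empty alternative: consume nothing


def parse_addition_loop(tokens, pos=0):
    """The same computation after removing the tail call and inlining."""
    left, pos = parse_operand(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        operator = tokens[pos]
        right, pos = parse_operand(tokens, pos + 1)
        left = left + right if operator == "+" else left - right
    return left, pos


tokens = [8, "-", 3, "-", 2]
print(parse_addition_recursive(tokens))  # (3, 5)
print(parse_addition_loop(tokens))       # (3, 5)
```

Passing the accumulated value into the tail is what keeps subtraction left-associative: the input is evaluated as `(8 - 3) - 2`. Both functions return the value together with the position of the first unconsumed token.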

AProgrammer

You want to look into LL($k$) parsing. The Wikipedia article is mostly useless, but it's basically recursive descent with $k$ symbols of lookahead.

There is also LL($*$), which permits unbounded lookahead.

See here for a comprehensive overview on how powerful this class of parsers is.
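To make "recursive descent with $k$ symbols of lookahead" concrete, here is a small Python sketch for $k = 2$. The grammar in the docstring is a hypothetical toy, not taken from the question: as written, both alternatives start with a name, so one token of lookahead cannot choose between them, but two can. The `peek` helper and the tuple results are likewise assumptions for this illustration.

```python
def peek(tokens, pos, k):
    """Return the k-th upcoming token (1-based) without consuming anything."""
    index = pos + k - 1
    return tokens[index] if index < len(tokens) else None


def parse_statement(tokens, pos=0):
    """Statement ::= NAME '=' NUMBER | NAME '(' ')'   (hypothetical grammar)"""
    first, second = peek(tokens, pos, 1), peek(tokens, pos, 2)
    if not (isinstance(first, str) and first.isidentifier()):
        raise SyntaxError("expected a name")
    # The production is chosen from the first two tokens only.
    if second == "=":
        value = peek(tokens, pos, 3)  # read as part of the chosen production
        if not isinstance(value, (int, float)):
            raise SyntaxError("expected a number after '='")
        return ("assign", first, value), pos + 3
    if second == "(":
        if peek(tokens, pos, 3) != ")":
            raise SyntaxError("expected ')'")
        return ("call", first), pos + 3
    raise SyntaxError("expected '=' or '(' after the name")


print(parse_statement(["x", "=", 42]))   # (('assign', 'x', 42), 3)
print(parse_statement(["f", "(", ")"]))  # (('call', 'f'), 3)
```

This particular grammar could also be left-factored so that one token of lookahead suffices; the sketch only shows the mechanics of looking further ahead before committing to a production.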

Raphael