Questions about algorithms that decide whether a given string belongs to a fixed formal language.
Questions tagged [parsers]
429 questions
6
votes
1 answer
Parsing a string with LR parsing table
My friends and I are studying for an upcoming test, and this exercise is one of the harder ones for us.
We have been trying to solve it and have been looking at similar exercises; the problem is that the explanations are too complicated for us…
Jakob Danielsson
5
votes
1 answer
Why is XML hard to parse?
I was chatting with a friend about my love/hate relationship with XML. He made the comment that, "xml is broken primarily because parsers for recursive self-defined document are basically impossible to get right."
I've heard the critique that XML…

Indolering
- 150
- 5
3
votes
1 answer
Can an Earley parser work in parallel?
Since an Earley parser finds all possible application variants for a token, can it parse text in parallel, unlike the usual parsers such as stack-based ones?
You just need to modify the start of each parallel chunk of tokens, then while going backwards…

user8426627
- 93
- 4
2
votes
0 answers
In a GLL-generated SPPF graph, how does the algorithm produce children with only one parent?
I am working through (1) "GLL Parsing" (2) "GLL parse-tree generation" and (3) "Structuring the GLL parsing algorithm for performance" by Elizabeth Scott and Adrian Johnstone.
An example grammar given in (3) 2.1 is
S ::= b a c | b a a | b A c
A ::=…

Rahul Gopinath
- 357
- 1
- 11
2
votes
1 answer
Strategy for designing a grammar for an LR(1) parser
Is it better to think about tokens from right to left and perform right factoring on the grammar for an LR(1) parser, as opposed to thinking about tokens left to right and doing left factoring on the grammar for an LL(1) parser?
Example java import…

clinux
- 247
- 1
- 7
2
votes
1 answer
Tokenizer and complex operators
I'm trying to create a simple tokenizer to transform the following search expression (only part shown) into tokens:
word1 near(1) word2
where word1 and word2 are some words and near(1) is a distance operator.
The question is how this expression should be…

Oleg
- 123
- 3
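One possible reading of the tokenizer question above, sketched in Python: treat the whole `near(N)` operator as a single token and carry the distance as its value. The token kinds `WORD` and `NEAR` and the helper `tokenize` are assumptions made for illustration; they do not come from the question, and splitting `near`, `(`, `1`, `)` into separate tokens would be an equally valid design.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str   # "WORD" or "NEAR" (hypothetical names)
    value: str  # the word itself, or the distance for NEAR


# Order matters: NEAR is tried before WORD so "near(1)" is not split.
TOKEN_RE = re.compile(
    r"""
    (?P<NEAR>near\(\d+\))   # distance operator, e.g. near(1)
  | (?P<WORD>\w+)           # plain search word
  | (?P<WS>\s+)             # whitespace, skipped
  | (?P<ERROR>.)            # anything else is rejected
    """,
    re.VERBOSE,
)


def tokenize(text: str) -> Iterator[Token]:
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        lexeme = match.group()
        if kind == "WS":
            continue
        if kind == "ERROR":
            raise ValueError(f"unexpected character {lexeme!r} at {match.start()}")
        if kind == "NEAR":
            # Keep only the digits between "near(" and ")".
            yield Token("NEAR", lexeme[len("near("):-1])
        else:
            yield Token("WORD", lexeme)


tokens = list(tokenize("word1 near(1) word2"))
```

With this choice the parser only ever sees three token kinds, and the distance is available without re-scanning the operator text.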
2
votes
0 answers
How to generate an LL(1) parse table
Given the following grammar:
S -> A a
A -> B D
B -> b
B -> ε
D -> d
D -> ε
first would be:
B: {b,ε}
D: {d,ε}
A: {b,ε,d}
S: {b,ε,d,a}
and follow would be:
S: {$}
A: {a}
B: {d,a}
D: {a}
The LL(1) parse table should be easy to build. According to…

ikkentim
- 121
- 1
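For the grammar in the LL(1) question above, the FIRST and FOLLOW sets and the resulting table can be checked mechanically. The following is a minimal Python sketch of the usual textbook fixed-point construction, not any particular tool's implementation; the names `GRAMMAR`, `first_of_sequence`, `compute_first`, `compute_follow` and `build_table` are made up for this example. Under the standard definitions it yields FIRST(S) = {a, b, d}, without ε, because every string derived from S ends in the terminal a; the other sets agree with the ones listed in the question.

```python
EPS = "ε"
END = "$"

# Each nonterminal maps to its productions; an empty list is the ε-production.
GRAMMAR = {
    "S": [["A", "a"]],
    "A": [["B", "D"]],
    "B": [["b"], []],
    "D": [["d"], []],
}
START = "S"


def first_of_sequence(symbols, first):
    """FIRST of a symbol string, using the FIRST sets computed so far."""
    result = set()
    for sym in symbols:
        if sym not in GRAMMAR:        # terminal: it begins the string
            result.add(sym)
            return result
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:     # sym cannot vanish, so stop here
            return result
    result.add(EPS)                   # all symbols nullable, or empty string
    return result


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                    # iterate until nothing new is added
        changed = False
        for nt, productions in GRAMMAR.items():
            for prod in productions:
                new = first_of_sequence(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for nt, productions in GRAMMAR.items():
            for prod in productions:
                for i, sym in enumerate(prod):
                    if sym not in GRAMMAR:
                        continue
                    tail = first_of_sequence(prod[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:   # the rest can vanish: inherit FOLLOW(nt)
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def build_table(first, follow):
    """Map (nonterminal, lookahead terminal) to the production to expand."""
    table = {}
    for nt, productions in GRAMMAR.items():
        for prod in productions:
            firsts = first_of_sequence(prod, first)
            lookaheads = firsts - {EPS}
            if EPS in firsts:
                lookaheads |= follow[nt]
            for terminal in lookaheads:
                if (nt, terminal) in table:
                    raise ValueError(f"LL(1) conflict at ({nt}, {terminal})")
                table[(nt, terminal)] = prod
    return table


first = compute_first()
follow = compute_follow(first)
table = build_table(first, follow)
```

No conflict is raised for this grammar, so under these definitions it is LL(1); the ε-productions for B and D land in the columns given by FOLLOW(B) and FOLLOW(D).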
1
vote
1 answer
Best Approach To Parse Human Readable Text To Machine Readable (CSV)
I am attempting to parse a large amount of text that cannot be readily used by other software due to its human-readable design. However, each "section" of the text has the same format. By "section", I mean lines 1-10 (section 1) will have the same…

rys
- 113
- 3
1
vote
0 answers
Convert Source Code into English Text?
Is there a way to convert, e.g., C code into English? The purpose would be to feed the converted source code into a text-to-speech converter, which would enable someone to "listen" to source code.
EDIT: An alternative idea would be to parse C code to be more…

George
- 285
- 1
- 7
1
vote
1 answer
What is a "special sequence" in EBNF?
In Extended Backus-Naur Form (EBNF) there is one form called a "special sequence", which is surrounded by question marks: ?...?
What does this mean?

Tyler Durden
- 688
- 1
- 4
- 14
1
vote
1 answer
Why is a particular token missing from the LALR lookahead set?
I ran the following grammar (pulled from the dragon book) in the Java Cup Eclipse plugin:
S' ::= S
S ::= L = R | R
L ::= * R | id
R ::= L
The items associated with state 0 given in the Automaton View are as follows:
S ::= ⋅S, EOF
S ::= ⋅L = R,…

npCompleteNoob
- 87
- 4
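For the dragon book grammar in the question above, the canonical LR(1) closure of the initial item can be computed by hand or with a few lines of code. The sketch below is a hedged illustration in Python of the textbook closure rule, not a description of what Java CUP does internally; the item encoding `(lhs, rhs, dot, lookahead)` and the names `first` and `closure` are assumptions of this example. It relies on the grammar having no ε-productions, which keeps the FIRST computation trivial.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}


def first(symbol, seen=frozenset()):
    """FIRST of one symbol; valid here because no production is empty."""
    if symbol not in GRAMMAR:
        return {symbol}               # a terminal is its own FIRST set
    if symbol in seen:
        return set()                  # already being expanded on this path
    result = set()
    for prod in GRAMMAR[symbol]:
        result |= first(prod[0], seen | {symbol})
    return result


def closure(items):
    """LR(1) closure of a set of items (lhs, rhs, dot, lookahead)."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, lookahead = work.pop()
        if dot >= len(rhs) or rhs[dot] not in GRAMMAR:
            continue                  # dot at the end, or before a terminal
        rest = rhs[dot + 1:]
        # Without ε-productions, FIRST(rest lookahead) is FIRST of the first
        # symbol of rest, or just the lookahead when rest is empty.
        lookaheads = first(rest[0]) if rest else {lookahead}
        for prod in GRAMMAR[rhs[dot]]:
            for la in lookaheads:
                item = (rhs[dot], prod, 0, la)
                if item not in items:
                    items.add(item)
                    work.append(item)
    return items


state0 = closure({("S'", ("S",), 0, "EOF")})
```

In this closure the L-items carry both `=` and `EOF` as lookaheads, while the item for `R ::= ⋅L` carries only `EOF`, since it is introduced solely through `S ::= ⋅R`.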
1
vote
0 answers
How do merged lookahead sets help LALR parse more grammars than SLR?
From what I have found on Google so far, the only difference between LALR(1) and SLR(1) is that LALR(1) uses states with merged lookahead sets. One source says that LALR(1) won't run into shift-reduce conflicts because the parser will "remember"…

npCompleteNoob
- 87
- 4
1
vote
1 answer
How is the lookahead set computed for the Earley algorithm?
I read the dissertation [1] and the paper [2], but I'm not sure how to compute $H_{k}$.
$H_{k}$ is defined as:
$$H_{k}(\gamma) = \{ \alpha | \alpha \text{ is terminal,} |\alpha| = k \text{ and } \exists{\beta} \text{ such that } \gamma…

Augusto Hack
- 111
- 3
1
vote
0 answers
Should a lexer (tokenizer) handle unknown operators?
I have a list of supported operators. My question is whether the lexer should just yield the token for the operator or raise a syntax error when that particular operator (let's say "?") doesn't exist in the operators list.
For example, the…

Jonathan1609
- 111
- 2
1
vote
1 answer
Can the given grammar be parsed using recursive descent?
I have the grammar below:
Expr ::= Term "(" Term ")"
Term ::= Ident | "(" Expr ")"
Start symbol:
Terminal symbol: Ident ::= [a-z]+
And I wonder if it is possible to construct a recursive descent parser for grammar G2?
I think this won't…
user135413
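For the grammar in the recursive descent question above, a direct translation into mutually recursive functions is possible: the two alternatives of Term begin with different tokens (an identifier versus "("), and no rule is left-recursive, so one token of lookahead is enough. The Python sketch below assumes Expr is the start symbol, which the excerpt does not state; the class `Parser`, the helpers `tokenize` and `parse`, and the tuple-shaped parse tree are likewise assumptions for illustration.

```python
import re

TOKEN_RE = re.compile(r"\s*([a-z]+|[()])")


def tokenize(text):
    """Split the input into identifiers and parentheses."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, got {self.peek()!r}")
        self.pos += 1

    def expr(self):
        # Expr ::= Term "(" Term ")"
        left = self.term()
        self.expect("(")
        right = self.term()
        self.expect(")")
        return ("Expr", left, right)

    def term(self):
        # Term ::= Ident | "(" Expr ")"
        token = self.peek()
        if token == "(":
            self.pos += 1
            inner = self.expr()
            self.expect(")")
            return ("Term", inner)
        if token is not None and token.isalpha():
            self.pos += 1
            return ("Ident", token)
        raise SyntaxError(f"expected identifier or '(', got {token!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()              # assumption: Expr is the start symbol
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input starting at {parser.peek()!r}")
    return tree


tree = parse("(f(x))(y)")
```

Each grammar rule maps to one method, and the only decision point is the first token inside `term`, which is what makes the grammar suitable for predictive recursive descent under the stated assumption.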