Questions tagged [parsers]

Questions about algorithms that decide whether a given string belongs to a fixed formal language.

429 questions
6
votes
1 answer

Parsing a string with LR parsing table

My friends and I are studying for an upcoming test, and this exercise is one of the harder ones for us. We have been trying to solve it and have looked at similar exercises, but the explanations are too complicated for us…
Jakob Danielsson
5
votes
1 answer

Why is XML hard to parse?

I was chatting with a friend about my love/hate relationship with XML. He commented that "XML is broken primarily because parsers for recursive, self-defined documents are basically impossible to get right." I've heard the critique that XML…
Indolering
  • 150
  • 5
3
votes
1 answer

Can an Earley parser work in parallel?

Since an Earley parser finds all possible application variants for a token, can it parse text in parallel, unlike conventional parsers such as stack-based ones? You would just need to modify the start of each parallel chunk of tokens, then while going backwards…
2
votes
0 answers

In a GLL-generated SPPF graph, how does the algorithm produce children with only one parent?

I am working through (1) "GLL Parsing" (2) "GLL parse-tree generation" and (3) "Structuring the GLL parsing algorithm for performance" by Elizabeth Scott and Adrian Johnstone. An example grammar given in (3) 2.1 is S ::= b a c | b a a | b A c A ::=…
Rahul Gopinath
  • 357
  • 1
  • 11
2
votes
1 answer

Strategy for designing a grammar for an LR(1) parser

Is it better to think about tokens from right to left and perform right factoring on the grammar for an LR(1) parser, as opposed to thinking about tokens from left to right and doing left factoring on the grammar for an LL(1) parser? Example java import…
clinux
  • 247
  • 1
  • 7
2
votes
1 answer

Tokenizer and complex operators

I'm trying to create a simple tokenizer to transform the following search expression (only part shown) into tokens: word1 near(1) word2, where word1 and word2 are some words and near(1) is a distance operator. The question is how this expression should be…
Oleg
  • 123
  • 3
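
As an illustration of the tokenizer question above, here is a minimal sketch of one possible approach, not the asker's or any answer's method. It assumes the distance operator always has the form near(N) with a non-negative integer N, and that every other run of non-space, non-parenthesis characters is a word. The token names NEAR and WORD and the function name tokenize are hypothetical.

```python
import re
from typing import List, Tuple, Union

# One master pattern; alternatives are tried left to right, so near(N)
# is recognized before the generic word rule can consume "near".
# Token names (NEAR, WORD) are assumptions made for this sketch.
MASTER = re.compile(
    r"(?P<NEAR>near\((?P<DIST>\d+)\))"  # distance operator, e.g. near(1)
    r"|(?P<WORD>[^\s()]+)"              # any other run of non-space, non-paren chars
    r"|(?P<SKIP>\s+)"                   # whitespace, discarded
)

Token = Tuple[str, Union[str, int]]


def tokenize(text: str) -> List[Token]:
    """Split a search expression into (kind, value) tokens."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.group("NEAR") is not None:
            # Keep the distance as an integer so later stages need not re-parse it.
            tokens.append(("NEAR", int(match.group("DIST"))))
        elif match.group("WORD") is not None:
            tokens.append(("WORD", match.group("WORD")))
        # SKIP matches fall through and produce no token.
        pos = match.end()
    return tokens


print(tokenize("word1 near(1) word2"))
```

Treating near(1) as a single token keeps the parser simple; the alternative of emitting near, (, 1 and ) separately moves the work of recognizing the operator into the grammar.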
2
votes
0 answers

How to generate LL(1) parse table

Given the following grammar: S -> A a; A -> B D; B -> b; B -> ε; D -> d; D -> ε. FIRST would be: B: {b,ε}, D: {d,ε}, A: {b,ε,d}, S: {b,ε,d,a}, and FOLLOW would be: S: {$}, A: {a}, B: {d,a}, D: {a}. The LL(1) parse table should be easy to build. According to…
ikkentim
  • 121
  • 1
1
vote
1 answer

Best Approach To Parse Human Readable Text To Machine Readable (CSV)

I am attempting to parse a large amount of text that cannot be readily used by other software because of its human-readable design. However, each "section" of the text has the same format. By "section", I mean lines 1-10 (section 1) will have the same…
rys
  • 113
  • 3
1
vote
0 answers

Convert Source Code into English Text?

Is there a way to convert, e.g., C code into English? The purpose would be to feed source code into a text-to-speech converter, which would enable someone to "listen" to source code. EDIT: An alternative idea would be to parse C code to be more…
George
  • 285
  • 1
  • 7
1
vote
1 answer

What is a "special sequence" in EBNF?

In Extended Backus-Naur Form (EBNF) there is one form called a "special sequence", which is surrounded by question marks: ?...? What does this mean?
Tyler Durden
  • 688
  • 1
  • 4
  • 14
1
vote
1 answer

Why is a particular token missing in the LALR lookahead set?

I ran the following grammar (pulled from the dragon book) in the Java Cup Eclipse plugin: S' ::= S S ::= L = R | R L ::= * R | id R ::= L The items associated with state 0 given in the Automaton View are as follows: S ::= ⋅S, EOF S ::= ⋅L = R,…
1
vote
0 answers

How do merged lookahead sets help LALR parse more grammars than SLR?

From what I have searched on Google so far, the only difference between LALR(1) and SLR(1) is that LALR(1) uses states with merged lookahead sets. One source says that LALR(1) won't run into shift-reduce conflicts because the parser will "remember"…
1
vote
1 answer

How is the lookahead set computed for the Earley algorithm?

I read the dissertation [1] and the paper [2], but I'm not sure how to compute $H_{k}$. $H_{k}$ is defined as: $$H_{k}(\gamma) = \{ \alpha | \alpha \text{ is terminal,} |\alpha| = k \text{ and } \exists{\beta} \text{ such that } \gamma…
Augusto Hack
  • 111
  • 3
1
vote
0 answers

Should a lexer (tokenizer) handle unknown operators?

I have a list of supported operators. My question is whether the lexer should just yield the token for the operator or raise a syntax error if that particular operator (let's say "?") is not in the list of operators. For example, the…
Jonathan1609
  • 111
  • 2
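
The two options in the lexer question above can be sketched side by side. This is a hypothetical illustration rather than a recommendation: the operator list, the Token class and the strict flag are all assumptions made for the sketch. In lenient mode the lexer emits an UNKNOWN token and leaves the rejection to the parser; in strict mode it raises as soon as it meets an unsupported character.

```python
from dataclasses import dataclass
from typing import Iterator

# Hypothetical list of supported operators; the question does not give one.
OPERATORS = {"+", "-", "*", "/"}


@dataclass
class Token:
    kind: str  # "NUMBER", "OP" or "UNKNOWN"
    text: str
    pos: int   # offset in the source, useful for error messages


def lex(source: str, strict: bool = False) -> Iterator[Token]:
    """Yield tokens; unsupported characters raise (strict) or become UNKNOWN."""
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            yield Token("NUMBER", source[start:i], start)
        elif ch in OPERATORS:
            yield Token("OP", ch, i)
            i += 1
        else:
            if strict:
                raise SyntaxError(f"unknown operator {ch!r} at position {i}")
            # Lenient mode: defer the decision to the parser.
            yield Token("UNKNOWN", ch, i)
            i += 1


print([t.kind for t in lex("1 ? 2")])
```

Because lex is a generator, the strict-mode error surfaces only when the offending token is requested, so tokens before it are still delivered.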
1
vote
1 answer

Can a given grammar be parsed using recursive descent?

I have the grammar below: Expr ::= Term "(" Term ")" Term ::= Ident | "(" Expr ")" Start symbol: Terminal symbol: Ident ::= [a-z]+ I wonder whether it is possible to construct a recursive descent parser for grammar G2. I think this won't…
user135413