4

We know by polynomial-time parsing algorithms like the classical CYK algorithm that $\mathrm{CFL} \subseteq \mathrm{P}$.

Furthermore, it is easy to show by direct simulation that $\mathrm{DCFL} \subseteq \mathrm{P}$ and $\mathrm{CFL} \subseteq \mathrm{NP}$, respectively. We simply use the TM tape as a stack, reading and writing only at one end of the used area. The only element we cannot immediately translate is transitions that push more than one symbol onto the stack; for those we add intermediate states. We get a TM that stays within a constant factor of the original PDA, both in size and in runtime.
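For concreteness, the intermediate-state trick can be sketched as follows. This is my own encoding of transitions as tuples `(state, letter, popped_symbol) -> (state, pushed_string)` with `pushed_string[0]` the new top; `fresh_state` is a hypothetical supplier of unused state names.

```python
def split_push(p, a, A, q, gamma, fresh_state):
    """Decompose a transition (p, a, A) -> (q, gamma) with |gamma| > 2 into
    transitions that push at most two symbols, using fresh intermediate
    states. gamma[0] ends up on top of the stack; the empty string ''
    stands for an epsilon-move. (Encoding assumed, not standard.)"""
    if len(gamma) <= 2:
        return [((p, a, A), (q, gamma))]
    states = [fresh_state() for _ in range(len(gamma) - 2)]
    # First replace A by the bottom two symbols of gamma ...
    rules = [((p, a, A), (states[0], gamma[-2:]))]
    # ... then push one further symbol per epsilon-move, right to left.
    chain = states + [q]
    for idx, i in enumerate(range(len(gamma) - 3, -1, -1)):
        src, dst = chain[idx], chain[idx + 1]
        rules.append(((src, '', gamma[i + 1]), (dst, gamma[i:i + 2])))
    return rules
```

Each original transition becomes at most $|\gamma| - 1$ transitions, so both the size and the runtime blow-up are bounded by the constant $\max |\gamma|$ of the PDA.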

As a follow-up to my overly optimistic answer here (thanks @A.Schulz for exposing my mistake) I wonder: can we combine the two? That is, is there a (more or less) direct simulation from NPDA to DTM?

Naive translations such as the above do not work because of the blow-up needed to resolve the nondeterminism. We may be able to make use of some normal forms, such as the absence of $\varepsilon$-transitions (are there others?), though.

Raphael

1 Answer

2

We can combine two known techniques: CYK, which you mention, and the PDA-to-CFG construction.

Assume that the PDA is in "Chomsky Normal Form": it reads iff it pops, and pushes two symbols otherwise. Formally, its instructions are of the form

  • $(p,\varepsilon,A) \mapsto (q,BC)$
  • $(p,a,A) \mapsto (q,\varepsilon)$

Recall that this can be transformed into a CFG using non-terminals of the form $[p,A,r]$, and the productions

  • $[p,A,r] \to [q,B,s][s,C,r]$ for any "guessed" states $r,s$
  • $[p,A,q] \to a$

Note that $[p,A,r]$ is a non-terminal that derives all strings using certain computations (see below).
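Written out as code, the triple construction just recalled looks as follows. The dictionary encoding of the PDA is an assumption of this sketch: `push_rules` maps $(p,A)$ to the possible $(q,B,C)$ of instructions $(p,\varepsilon,A) \mapsto (q,BC)$, and `pop_rules` maps $(p,a,A)$ to the possible target states $q$ of instructions $(p,a,A) \mapsto (q,\varepsilon)$.

```python
def triples_grammar(push_rules, pop_rules, states):
    """Triple construction: CFG productions over non-terminals (p, A, r)
    for a PDA in the normal form described above. (PDA encoding assumed.)"""
    productions = []
    for (p, A), rhss in push_rules.items():
        for (q, B, C) in rhss:
            for r in states:        # guessed end state
                for s in states:    # guessed intermediate state
                    productions.append(((p, A, r), [(q, B, s), (s, C, r)]))
    for (p, a, A), targets in pop_rules.items():
        for q in targets:
            productions.append(((p, A, q), [a]))
    return productions
```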

We do not have to perform the latter construction explicitly; we can build it into CYK itself, introducing a table $P(i,j)$ whose elements are such triplets. So this would mean $[p,A,r] \in P(i,j)$ iff there is a computation on $w_i\dots w_j$ that starts in state $p$ with $A$ as the only stack symbol and ends in state $r$ with an empty stack.
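The table can be filled bottom-up exactly like the CYK table. A minimal Python sketch, using the same assumed PDA encoding as above (`push_rules` for the $(p,\varepsilon,A) \mapsto (q,BC)$ instructions, `pop_rules` for the $(p,a,A) \mapsto (q,\varepsilon)$ ones) and acceptance by empty stack:

```python
from itertools import product

def pda_cyk(push_rules, pop_rules, start_state, start_symbol, w):
    """Decide acceptance (by empty stack) of w by a normal-form PDA,
    via CYK on triples (p, A, r). (PDA encoding assumed, see lead-in.)"""
    n = len(w)
    if n == 0:
        return False  # in this normal form, every accepted word reads a letter
    # P[i][j] = set of triples (p, A, r): starting in state p with A alone
    # on the stack, the PDA can consume w[i..j] and reach state r with an
    # empty stack.
    P = [[set() for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(w):  # base case: single letters via pop rules
        for (p, b, A), targets in pop_rules.items():
            if b == a:
                for q in targets:
                    P[i][i].add((p, A, q))
    for length in range(2, n + 1):  # longer substrings via push rules
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                for (q, B, s), (s2, C, r) in product(P[i][k], P[k + 1][j]):
                    if s != s2:
                        continue
                    for (p, A), pushes in push_rules.items():
                        if (q, B, C) in pushes:
                            P[i][j].add((p, A, r))
    return any(p == start_state and A == start_symbol
               for (p, A, _) in P[0][n - 1])
```

As in plain CYK, the runtime is $O(n^3)$ up to a factor depending only on the PDA, so this indeed gives a direct deterministic polynomial-time simulation.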

Hendrik Jan
  • "there is a computation on $w_i \dots w_j$ starting in state p and ending in state r that starts with A on the stack and ends with an empty stack" -- how do you decide this criterion in the presence of $\varepsilon$-loops? – Raphael Jan 05 '17 at 15:35
  • This feels a little bit like cheating: if we hide the fact that we use grammars behind enough implicitly-s, we are okay, right? ;) That said, nice idea! – Raphael Jan 05 '17 at 15:36
  • The $\varepsilon$-loops are no problem here. Every pop reads a letter; every other transition adds one to the length of the stack. That means all computations on $k$ input letters have $k$ pops, and thus have $k-1$ instructions increasing the length of the stack (as the stack starts with one symbol and ends with none). – Hendrik Jan Jan 06 '17 at 12:19
  • Thank you for your comment, but I do not fully agree with the cheating. There is a strong connection between push-down computations and left-most derivations of grammars in ChNF. In a way, a PDA is just a CFG in ChNF with states. This is the idea of the triplet construction: it recursively decomposes PDA computations of a certain form. I propose this same decomposition here for CYK on PDA computations, never explicitly constructing the grammar, but using the concepts. – Hendrik Jan Jan 06 '17 at 12:25
  • Ad loops: Can't we have rules like $(p, \varepsilon, A) \mapsto (p, AB)$? That would allow the stack to blow up -- but I guess the CYK-simulation won't care about that because it will only investigate computations of the form you describe (and show existence of). Correct? – Raphael Jan 15 '17 at 19:12
  • @Raphael Yes, the PDA might nondeterministically "prepare" a stack, which later can be checked. The CYK part will only take as many of these steps as there are symbols in the input, as the stack symbols have to be popped while reading. – Hendrik Jan Jan 15 '17 at 20:57