
I have read a paper which mentions that CFL-reachability is solvable in exponential time and space. Intuitively, I suppose that one needs to explore all the sub-paths in the PDA for the CFL. However, I cannot convince myself that this explanation is right. It would be great if someone could provide an explanation.

2 Answers


The input to the CFL-reachability problem consists of two parts: the grammar and the graph. Since time complexity is measured in terms of input size, we should account for the sizes of both. In some settings, however, the grammar size can be treated as a constant. We do this for classical parsing algorithms, for example, when we say that LR is linear, CYK is cubic, and so on. The classical approach there is to build a fixed parser for the language $L$ and then apply it to different input strings, so the main question is how the parser's behavior depends on string length.

In CFL-reachability, both parts of the input vary. Moreover, in static code analysis both the grammar and the graph are produced from the code being analyzed, and both are huge. (In graph database querying, by contrast, the graph is huge and the grammar is small, which resembles parsing programming languages.) So we are interested in combined complexity. The result is that the complexity is cubic in the graph size (as mentioned in the comments) and exponential in the grammar size. For more details, see Jelle Hellings, "Conjunctive Context-Free Path Queries". So yes, the combined time complexity of CFL-reachability is exponential.

David Richerby
gsv

Please refer to the paper "Program Analysis via Graph Reachability" by Thomas Reps, which states clearly that CFL-reachability is solvable in cubic time.

I assume CFL-reachability is defined as follows:

In a CFL-reachability problem, a path is considered to connect two nodes only if the concatenation of the labels on its edges is a word in a particular context-free language (such a path is called an $L$-path):

Definition 2.1 (from paper above). Let $L$ be a context-free language over alphabet $\Sigma$, and let $G$ be a graph whose edges are labeled with members of $\Sigma$. Each path in $G$ defines a word over $\Sigma$, namely, the word obtained by concatenating, in order, the labels of the edges on the path. A path in $G$ is an $L$-path if its word is a member of $L$.
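
To make the definition concrete, here is a minimal sketch of a worklist-style solver for all-pairs $L$-path reachability. It assumes the grammar has already been normalized so that every right-hand side has at most two symbols. The function name `cfl_reachability`, the data layout, and the example grammar (balanced parentheses) are illustrative choices of mine, not taken from the paper, and the code is not tuned to match any particular complexity bound.

```python
from collections import defaultdict, deque

def cfl_reachability(nodes, edges, productions, start):
    """Return all pairs (u, v) joined by a path whose word derives from `start`.

    edges:       iterable of (u, label, v)
    productions: list of (lhs, rhs), rhs a tuple of 0, 1 or 2 symbols
                 (assumption: the grammar is already in this normal form)
    """
    facts = set()                 # derived labelled edges (u, X, v)
    work = deque()                # facts not yet combined with others
    out_edges = defaultdict(set)  # (u, X) -> targets v
    in_edges = defaultdict(set)   # (v, X) -> sources u

    def add(u, x, v):
        if (u, x, v) not in facts:
            facts.add((u, x, v))
            out_edges[(u, x)].add(v)
            in_edges[(v, x)].add(u)
            work.append((u, x, v))

    for u, a, v in edges:
        add(u, a, v)
    for lhs, rhs in productions:
        if len(rhs) == 0:         # A -> epsilon: self-loop on every node
            for n in nodes:
                add(n, lhs, n)

    while work:
        u, x, v = work.popleft()
        for lhs, rhs in productions:
            if rhs == (x,):       # A -> X
                add(u, lhs, v)
            elif len(rhs) == 2:   # A -> B C
                b, c = rhs
                if b == x:        # the new edge plays the role of B
                    for w in list(out_edges[(v, c)]):
                        add(u, lhs, w)
                if c == x:        # the new edge plays the role of C
                    for w in list(in_edges[(u, b)]):
                        add(w, lhs, v)

    return {(u, v) for (u, x, v) in facts if x == start}

# Example: balanced parentheses, S -> eps | S S | "(" T,  T -> S ")"
productions = [("S", ()), ("S", ("S", "S")), ("S", ("(", "T")), ("T", ("S", ")"))]
nodes = range(5)
edges = [(0, "(", 1), (1, "(", 2), (2, ")", 3), (3, ")", 4)]
result = cfl_reachability(nodes, edges, productions, "S")
```

On this chain graph, `result` contains `(0, 4)` (the word `(())`) and `(1, 3)` (the word `()`), plus every `(n, n)` because of the empty word, but not `(0, 3)`, whose word `(()` is unbalanced.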

Sarvottamananda
  • You propose to solve the word problem. Are you sure that "CFL-reachability" is the same? – Raphael Mar 02 '16 at 18:42
  • I suppose mention of CYK was superfluous. – Sarvottamananda Mar 03 '16 at 13:17
  • This answer is talking about a specific problem: finding a path in a graph from a source node $s$ to a sink node $t$ such that the concatenation of labels on the edges in the path is accepted by a given context-free language $L$. It's not clear whether that is what the original question is asking or not. (Cc: @Raphael) Of course, that's really an issue for the original poster to clarify, not really a criticism of this answer. – D.W. Mar 04 '16 at 00:31