Echo state networks are theoretically equivalent to DFAs/NFAs, but how would you use an ESN to parse a regular language? Would you just feed many different input strings, some from the language and some not, and then train it to output one or zero to indicate whether the string belongs to the language?
Comment (D.W., Dec 16 '22): Can you [edit] your post to provide some references or links to a technical introduction to echo state networks, and to the proof of equivalence?
1 Answer
Depending upon your goals, learning from example inputs might not be the best approach. It might work fine in many cases, but there are some hard languages where it will fail.
In particular, it is known that learning a regular language from a finite set of examples is NP-hard. See, e.g., https://cstheory.blogoverflow.com/2011/08/on-learning-regular-languages/, "Smallest DFA that accepts given strings and rejects other given strings" (https://cstheory.stackexchange.com/q/1854/5038), and the related questions https://cstheory.stackexchange.com/q/27971/5038, https://cstheory.stackexchange.com/q/9807/5038, https://cstheory.stackexchange.com/q/27347/5038. It follows that there exist regular languages where feeding in positive and negative examples and training on them is doomed to fail.
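To make the learning-from-examples approach concrete, here is a minimal sketch (not a recommendation) of the standard ESN recipe: a fixed random reservoir with spectral radius below 1, plus a ridge-regression readout trained on final reservoir states of labeled strings. The reservoir size, scaling constants, and the toy target language (even number of 1s) are all assumptions for illustration; parity-style languages are exactly the kind where a linear readout may fail, which illustrates the point above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regular language (an assumption for illustration):
# strings over {0,1} with an even number of 1s.
def label(s):
    return 1.0 if s.count("1") % 2 == 0 else 0.0

N = 100  # reservoir size (arbitrary choice)
W_in = rng.uniform(-0.5, 0.5, (N, 1))
W = rng.uniform(-0.5, 0.5, (N, N))
# Rescale so the spectral radius is below 1 (echo state property).
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

def final_state(s):
    """Drive the reservoir with the string and return the last state."""
    x = np.zeros(N)
    for ch in s:
        x = np.tanh(W_in[:, 0] * float(ch) + W @ x)
    return x

# Random labeled training strings.
strings = ["".join(rng.choice(["0", "1"], size=rng.integers(1, 10)))
           for _ in range(500)]
X = np.array([final_state(s) for s in strings])
y = np.array([label(s) for s in strings])

# Ridge-regression readout -- the only trained part of an ESN.
ridge = 1e-3
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(N), X.T @ y)

def classify(s):
    """Threshold the linear readout to get a 0/1 membership decision."""
    return int(final_state(s) @ W_out > 0.5)
```

Whether `classify` generalizes depends heavily on the language; for some regular languages no amount of examples will make this work well, per the hardness results above.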
You mention theoretical equivalence. I suspect that is true if you are allowed to choose all of the weights of the network, including those governing the hidden states. There are standard theorems saying that neural networks are universal function approximators, so a natural approach is to have the hidden state be a suitable encoding of the state of the DFA, and then select weights so that the network's computation of the next hidden state matches the transition function of the DFA. The exact details depend on the internal architecture of the echo state network and on the activation function used, but they are typically straightforward.
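As a sketch of that construction (under the assumption that we may choose all weights and use a hard-threshold activation in place of a saturated sigmoid), here is a hand-built recurrent update whose one-hot hidden state tracks a small hypothetical DFA; the example DFA accepts strings with an even number of 1s:

```python
import numpy as np

# Hypothetical 2-state DFA over {0,1}: state 0 = "even # of 1s so far".
# delta[state][symbol] gives the next state.
delta = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
accepting = {0}
n_states, n_symbols = 2, 2

# Weight matrix mapping a one-hot (state, symbol) pair to the
# one-hot next state: W[next, state * n_symbols + symbol] = 1.
W = np.zeros((n_states, n_states * n_symbols))
for s in range(n_states):
    for a in range(n_symbols):
        W[delta[s][a], s * n_symbols + a] = 1.0

def step(state_vec, symbol):
    """One recurrent update: threshold(W @ one-hot pair) is the next state."""
    pair = np.zeros(n_states * n_symbols)
    pair[int(np.argmax(state_vec)) * n_symbols + symbol] = 1.0
    # Hard threshold stands in for a saturated activation function.
    return (W @ pair > 0.5).astype(float)

def accepts(string):
    x = np.zeros(n_states)
    x[0] = 1.0  # start in state 0
    for ch in string:
        x = step(x, int(ch))
    return int(np.argmax(x)) in accepting
```

The same idea scales to any DFA: one hidden unit per state, with weights copied off the transition table. Note this chooses the recurrent weights by hand, which departs from the usual ESN setting where the reservoir is random and fixed.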
For more details of the universal function approximator theorem, I recommend the textbook Neural Networks and Deep Learning, specifically Chapter 4. If you work through that chapter, I suspect you will see how to prove the equivalence theorem and fill in the weights to make the ESN behave in a way that is equivalent to a DFA.
Alternatively, you could consult the proof of the equivalence claim you mention and follow its construction directly.
