
In Colah's blog, he explains this:

In theory, RNNs are absolutely capable of handling such “long-term dependencies.” A human could carefully pick parameters for them to solve toy problems of this form. Sadly, in practice, RNNs don’t seem to be able to learn them. The problem was explored in depth by Hochreiter (1991) [German] and Bengio, et al. (1994), who found some pretty fundamental reasons why it might be difficult.

Briefly, what are the fundamental reasons why it might be difficult?

abdoulsn

1 Answer


A plain RNN has no explicit notion of memory (like the gates in LSTM and GRU).

Gates are a way to optionally let information through. Without them, the hidden state is pushed through the same recurrent weights at every time step, so during backpropagation through time the gradient is repeatedly multiplied by the same matrix and either vanishes or explodes. The contribution of early inputs fades away in the process, which is why long-term dependencies are hard to learn.
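To make this concrete, here is a minimal NumPy sketch of my own (not from Colah's post): it ignores the nonlinearity and just shows that repeatedly multiplying a gradient by the same recurrent matrix shrinks it geometrically when the matrix's spectral norm is below 1. The matrix size, step count, and scaling factor are arbitrary values chosen to make the effect visible.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size = 16
steps = 100

# Recurrent weight matrix, rescaled so its spectral norm is 0.9
# (a hypothetical value; anything below 1 gives vanishing gradients).
W = rng.normal(size=(hidden_size, hidden_size))
W *= 0.9 / np.linalg.norm(W, ord=2)

grad = np.ones(hidden_size)  # gradient arriving at the last time step
for t in range(steps):
    # Each step of backpropagation through time multiplies by W^T.
    # (The tanh derivative is <= 1, so including it only shrinks
    # the gradient further.)
    grad = W.T @ grad
    if t in (0, 9, 49, 99):
        print(f"step {t + 1:3d}: |grad| = {np.linalg.norm(grad):.2e}")
```

Running this prints a gradient norm that decays by orders of magnitude over 100 steps. LSTM and GRU gates avoid forcing the gradient through this same multiplication at every step, which is what lets them carry information over long spans.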

Noah Weber