
While reading the DQN paper, I found that learning from randomly selected samples reduces divergence in RL when using a non-linear function approximator (e.g., a neural network).

So why does reinforcement learning with a non-linear function approximator diverge when it is trained on strongly correlated data?

nbro
강문주

1 Answer


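The problem is not so much using reinforcement learning to train the neural network as the assumptions that standard neural networks make about the data they are given. They are usually trained on the assumption that samples are independent of one another, so they do not handle strongly correlated data well. Consecutive samples from an RL trajectory are strongly correlated, since each state follows directly from the previous one. Handling correlated data was one of the motivations for introducing recurrent neural networks, which cope with it well.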
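To make the random sampling mentioned in the question concrete, here is a minimal sketch of a uniform replay buffer. The names `ReplayBuffer` and `toy_trajectory`, and the 1-D random walk used in place of a real environment, are hypothetical and are not taken from the DQN paper. The sketch only contrasts a run of consecutive transitions with a uniformly sampled minibatch.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement ignores the temporal order
        # in which the transitions were collected.
        return random.sample(list(self.storage), batch_size)


def toy_trajectory(n_steps):
    """Hypothetical 1-D random walk standing in for an environment."""
    state = 0.0
    for t in range(n_steps):
        action = random.choice((-1, 1))
        next_state = state + 0.1 * action
        yield state, action, 0.0, next_state, t == n_steps - 1
        state = next_state


buffer = ReplayBuffer(capacity=1000)
for transition in toy_trajectory(500):
    buffer.add(*transition)

# Consecutive transitions: each one starts where the previous one ended,
# so a batch built this way is strongly correlated.
consecutive = list(buffer.storage)[:32]

# Uniformly sampled minibatch: drawn from anywhere in the stored history.
minibatch = buffer.sample(32)
```

In the consecutive batch every state is one small step away from the previous one, whereas the sampled minibatch mixes transitions from across the whole trajectory, which is closer to the independence assumption described above.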

David