
In deep learning, validation loss is used to check whether the model being trained is overfitting the training data. Is there a similar concept of overfitting in deep Q-learning?

Given that I have a fixed number of experiences already in a replay buffer and I train a Q-network by sampling from this buffer, would computing a validation loss (on experiences held out from the replay buffer) help me decide when to stop training the network?

For example, if my validation loss increases even though my training loss continues to decrease, I should stop training the network. Does the deep learning notion of validation loss also apply in the deep Q-network case?

Just to clarify again, no experiences are collected during the training of the DQN.
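To make the setup concrete, here is a minimal sketch of what such a validation-based early-stopping scheme could look like. All names, hyperparameters, and the synthetic transitions are illustrative assumptions (a tabular Q-table stands in for the Q-network); one caveat worth noting is that the "validation loss" here uses bootstrapped TD targets that depend on the current Q itself, so unlike supervised validation loss it is not measured against a fixed ground truth.

```python
import numpy as np

# Hypothetical sketch: early stopping for offline DQN-style training using a
# held-out validation split of a fixed replay buffer. A tabular Q stands in
# for the Q-network; all names and hyperparameters are illustrative.

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, GAMMA = 4, 2, 0.9

def make_transitions(n):
    # Synthetic (s, a, r, s', done) transitions standing in for real experience.
    s = rng.integers(0, N_STATES, n)
    a = rng.integers(0, N_ACTIONS, n)
    r = rng.normal(size=n)
    s2 = rng.integers(0, N_STATES, n)
    done = rng.random(n) < 0.1
    return list(zip(s, a, r, s2, done))

buffer = make_transitions(500)               # fixed buffer, no new experience
split = int(0.8 * len(buffer))
train_set, val_set = buffer[:split], buffer[split:]  # held-out transitions

Q = np.zeros((N_STATES, N_ACTIONS))

def td_loss(Q, batch):
    # Mean squared TD error; note the bootstrapped target depends on Q itself.
    errs = []
    for s, a, r, s2, done in batch:
        target = r + (0.0 if done else GAMMA * Q[s2].max())
        errs.append((target - Q[s, a]) ** 2)
    return float(np.mean(errs))

ALPHA, PATIENCE = 0.05, 3
best_val, bad_epochs = np.inf, 0
for epoch in range(200):
    # One "epoch" of Q-learning updates sampled only from the train split.
    for _ in range(50):
        s, a, r, s2, done = train_set[rng.integers(len(train_set))]
        target = r + (0.0 if done else GAMMA * Q[s2].max())
        Q[s, a] += ALPHA * (target - Q[s, a])
    val = td_loss(Q, val_set)
    if val < best_val - 1e-6:
        best_val, bad_epochs = val, 0
    else:
        bad_epochs += 1
        if bad_epochs >= PATIENCE:  # validation TD loss stopped improving
            print(f"early stop at epoch {epoch}")
            break
```

Whether this is a sound stopping criterion is exactly what the question asks; the sketch only shows the mechanics of splitting a fixed buffer and monitoring held-out TD error.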
