Some RL literature uses terms such as 'Bellman backup' and 'Bellman error'. What do these terms refer to?
Viewed 2,689 times
4

nbro

user529295
There's already an answer that addresses both concerns/questions, but, please, next time, focus on one question per post, although, in this case, the terms are highly related (but I still think these "simple" questions could have been asked in separate posts). It may also be a good idea to provide more context (e.g. a link to an article that mentions these terms), although, again, in this case, anyone familiar with RL would be able to understand the question. – nbro Jun 28 '21 at 13:15
1 Answer
3
A Bellman backup is an application of a Bellman operator. For example, the step
$$ V(x)\leftarrow \alpha(R + \mathbf{E}[V(x')]) + (1-\alpha)V(x) $$
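is a Bellman backup for some learning rate $\alpha$. Here $x$ is the current state, $R$ is the reward, $x'$ is the next state, and the expectation is taken over $x'$ (the discount factor is implicitly 1).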
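As a rough illustration of the update above, here is a minimal tabular sketch. It assumes the next-state distribution is known so the expectation can be computed exactly; the function name `bellman_backup`, the dictionary-based value table, and the three-state example are hypothetical choices made for this illustration only.

```python
def bellman_backup(V, x, reward, next_state_probs, alpha):
    """Update V[x] in place with one Bellman backup and return the new value.

    V: dict mapping state -> current value estimate
    next_state_probs: dict mapping next state x' -> probability (assumed known)
    """
    # E[V(x')]: expected value of the next state
    expected_next = sum(p * V[s] for s, p in next_state_probs.items())
    # Backup target R + E[V(x')] (no discount factor, as in the formula)
    target = reward + expected_next
    # Blend the target with the old estimate using the learning rate alpha
    V[x] = alpha * target + (1 - alpha) * V[x]
    return V[x]


# Hypothetical example: from state "a" we move to "b" or "c" with equal probability.
V = {"a": 0.0, "b": 1.0, "c": 0.0}
new_value = bellman_backup(
    V, "a", reward=1.0, next_state_probs={"b": 0.5, "c": 0.5}, alpha=0.5
)
print(new_value)  # 0.75
```

With $\alpha = 1$ the old estimate is discarded and $V(x)$ is set directly to the target; smaller values of $\alpha$ move it only part of the way.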
A Bellman error is
$$ d(V(x), R + \mathbf{E}[V(x')]) $$
for some distance function $d$, usually the squared difference $d(a, b) = (a-b)^2$.
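The Bellman error can be sketched in the same tabular setting. This is only an illustration under the assumptions that the next-state probabilities are known and that $d$ is the squared difference; the name `bellman_error` and the example values are hypothetical.

```python
def bellman_error(V, x, reward, next_state_probs):
    """Squared difference between V[x] and the target R + E[V(x')].

    V: dict mapping state -> current value estimate
    next_state_probs: dict mapping next state x' -> probability (assumed known)
    """
    # Target R + E[V(x')] (no discount factor, as in the formula)
    target = reward + sum(p * V[s] for s, p in next_state_probs.items())
    # d(V(x), target) with d chosen as the squared difference
    return (V[x] - target) ** 2


# Hypothetical example: from state "a" we move to "b" or "c" with equal probability.
V = {"a": 0.0, "b": 1.0, "c": 0.0}
error = bellman_error(V, "a", reward=1.0, next_state_probs={"b": 0.5, "c": 0.5})
print(error)  # 2.25
```

An error of zero at a state means its current estimate already equals the target, so a backup at that state would leave it unchanged.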

harwiltz
It refers to propagating information from later states to earlier ones (backward in time, sorta) – harwiltz Jun 28 '21 at 13:02
It may be a good idea to 1. also provide the figures of the backups (e.g. those you can find in Sutton and Barto's book), 2. link the OP to this question about what the Bellman operator is, and 3. explain the symbols in your answer. – nbro Jun 28 '21 at 13:12