Please forgive the simplicity of this question; I only recently started studying Reinforcement Learning. I have to study a system whose transition probabilities are known, and I am supposed to use Reinforcement Learning for it.
I am trying to understand the relation between Dynamic Programming and Reinforcement Learning.
Although I have gone through several lectures, read the corresponding chapters of Sutton & Barto's book, and watched some videos, it is still not clear to me whether Dynamic Programming (specifically value iteration and policy iteration) is distinct from Reinforcement Learning, or whether policy iteration and value iteration are considered model-based Reinforcement Learning algorithms.
Apart from this, I wonder whether it makes any sense to use Reinforcement Learning when the transition probabilities are known.
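To make the setting concrete, here is a minimal sketch of what I understand by value iteration when the transition probabilities are given. The two-state MDP, its rewards, the discount factor, and the names `P`, `q_value`, `value_iteration`, and `greedy_policy` are all made up for illustration; this is not my actual system.

```python
# Hypothetical 2-state, 2-action MDP; every number here is invented for illustration.
# P[s][a] is a list of (probability, next_state, reward) triples, i.e. the known model.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(V, s, a, gamma=GAMMA):
    """Expected return of taking action a in state s, then following values V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(tol=1e-10, gamma=GAMMA):
    """Repeat Bellman optimality backups (in place) until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def greedy_policy(V, gamma=GAMMA):
    """Pick, in each state, the action with the highest one-step lookahead value."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}


V = value_iteration()
policy = greedy_policy(V)
print(V, policy)
```

Nothing in this sketch samples from the environment: it only reads the model `P`. That is exactly why I am unsure whether it counts as Reinforcement Learning or as plain planning.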
Thank you!