
Can someone please explain what the discount factor means in the Value Iteration Algorithm for solving Markov Decision Processes?

I understand the equation, but I don't understand why it requires the discount factor (gamma).

logc
    Those who closed this thread have no clue about Reinforcement Learning. Shame. @amon MichaelT Snowman Durron597 ratchet freak – Mehran Jun 15 '17 at 16:50

1 Answer


Here's what I understand: the discount factor represents a preference for short-term rewards over long-term ones.

For example, I would value $1 earned today more than $1 earned tomorrow, and much more than $1 earned on Jan 1, 2050, because random factors change the situation more and more as time passes. The discount factor expresses how much more valuable today's $1 is than tomorrow's $1.
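To put rough numbers on the dollar example: with a discount factor gamma between 0 and 1, a reward received t steps from now is usually weighted by gamma to the power t. Here is a minimal sketch of that; the function name `discounted_value` and the choice gamma = 0.9 are my own assumptions for illustration, not something from the question.

```python
def discounted_value(reward, gamma, steps):
    """Value today of `reward` received `steps` time steps from now."""
    return reward * gamma ** steps


gamma = 0.9  # assumed discount factor, chosen only for illustration

# $1 is worth less to us the further away it is.
for steps in (0, 1, 10, 50):
    print(f"$1 in {steps:2d} steps is worth {discounted_value(1.0, gamma, steps):.4f} today")
```

With gamma close to 1 the far-away dollar keeps most of its value; with gamma close to 0 only the immediate dollar matters.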

Since the whole algorithm is about making decisions whose outcomes partly depend on random inputs, which can drift over time and invalidate the initial decision, it makes sense to prefer decisions that are better as short-term solutions.
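To see how gamma changes the decision that value iteration arrives at, here is a small sketch on a made-up MDP. Everything in it is an assumption for illustration: the states (`start`, `later`, `end`), the actions, the rewards, and the helper names. The transitions are deterministic to keep it short, but the table format allows probabilities. From `start` you can take `quick` (reward 1 now) or `wait` (reward 0 now, then reward 2 one step later).

```python
# TRANSITIONS[state][action] = list of (probability, next_state, reward).
# Hypothetical toy MDP, not taken from the question.
TRANSITIONS = {
    "start": {
        "quick": [(1.0, "end", 1.0)],    # small reward right away
        "wait": [(1.0, "later", 0.0)],   # nothing now, bigger reward next step
    },
    "later": {"collect": [(1.0, "end", 2.0)]},
    "end": {},  # terminal state: no actions, value stays 0
}


def action_value(transitions, values, gamma, state, action):
    """Expected immediate reward plus discounted value of the next state."""
    return sum(p * (r + gamma * values[s2])
               for p, s2, r in transitions[state][action])


def value_iteration(transitions, gamma, tol=1e-9, max_iters=1000):
    """Repeat Bellman backups (updating values in place) until they settle."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_iters):
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # skip terminal states
                continue
            best = max(action_value(transitions, values, gamma, s, a)
                       for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    return values


def greedy_action(transitions, values, gamma, state):
    """Action with the highest value in `state` under the given values."""
    return max(transitions[state],
               key=lambda a: action_value(transitions, values, gamma, state, a))


for gamma in (0.9, 0.3):
    values = value_iteration(TRANSITIONS, gamma)
    print(gamma, greedy_action(TRANSITIONS, values, gamma, "start"))
```

In this toy setup, waiting is worth gamma times 2, so a patient agent (gamma above 0.5) waits and an impatient one (gamma below 0.5) takes the quick reward. That is the short-term versus long-term preference described above.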

scriptin