What is the motivation for the definition of the binary entropy function $H(p) = -p\log_2(p) - (1-p)\log_2(1-p)$? I understand that we want the entropy to be zero at $p = 0$ and $p = 1$ (no randomness) and maximal at $p = \frac 12$ (the most randomness), but where does the rest of it come from? Why is it not, say, a quadratic in $p$?

user141592

1 Answer

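One usually requires an entropy to satisfy certain properties (axioms): nonnegativity; monotonicity (the greater the uncertainty, the greater the entropy); and additivity for independent observations or measurements. The last property forces a logarithmic dependence: the probabilities of independent events multiply while their entropies must add, and the logarithm is, up to a constant factor, the only continuous function that turns products into sums.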
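The additivity requirement can be checked numerically. The following is a minimal sketch, not a proof: it compares the Shannon entropy with one possible quadratic candidate, $1 - \sum_i p_i^2$ (which reduces to $2p(1-p)$ for a single coin, so it also vanishes at $p = 0, 1$ and peaks at $p = \frac 12$). The function names and the two example coins are assumptions chosen for illustration.

```python
import math
from itertools import product


def shannon_entropy(probs):
    """Shannon entropy in bits; outcomes with p == 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def quadratic_measure(probs):
    """Hypothetical quadratic alternative: 1 - sum(p_i^2).

    For a single coin this equals 2p(1-p).
    """
    return 1.0 - sum(p * p for p in probs)


def joint_independent(p_dist, q_dist):
    """Joint distribution of two independent variables (probabilities multiply)."""
    return [p * q for p, q in product(p_dist, q_dist)]


# Two independent coins (illustrative values).
coin_a = [0.3, 0.7]
coin_b = [0.5, 0.5]
joint = joint_independent(coin_a, coin_b)

# Additivity asks: measure(joint) == measure(coin_a) + measure(coin_b).
shannon_gap = shannon_entropy(joint) - (
    shannon_entropy(coin_a) + shannon_entropy(coin_b)
)
quadratic_gap = quadratic_measure(joint) - (
    quadratic_measure(coin_a) + quadratic_measure(coin_b)
)

print(f"Shannon gap:   {shannon_gap:.6f}")
print(f"Quadratic gap: {quadratic_gap:.6f}")
```

The Shannon gap vanishes up to floating-point error, while the quadratic candidate gives a clearly nonzero gap: it has the right shape for one coin but does not add up over independent observations.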

For more details on the axiomatic formulation of entropy see http://www.math.nyu.edu/faculty/kleeman/infolect1.pdf or http://arxiv.org/pdf/quant-ph/0511171.pdf

MHS