I am trying to use the chain rule to differentiate the loss for the softmax function, but I am stuck. In this great answer: I can't understand why the derivative of log(p_k) with respect to o is 1/p_k and not 1/(p_k * ln(10)).
-
In the post you've referenced, the natural logarithm is being used, so you don't need the conversion factor for a base-10 log. Keep in mind that in many (most?) branches of applied math and science, the natural logarithm is denoted as $\log(x)$. It's mainly in undergraduate courses where the ${\rm ln}(x)$ notation is encountered. – greg Feb 25 '17 at 15:50
-
@greg, thanks a lot! – ichernob Feb 26 '17 at 17:21
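The point in the comment above can be checked numerically. The sketch below is only an illustration: the value `p_k = 0.3` and the helper `numerical_derivative` are assumptions made for the example, not anything from the referenced answer. It compares a finite-difference estimate of the derivative of the natural log and of the base-10 log, both taken with respect to `p_k`, against the two candidate formulas.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference estimate of f'(x) (hypothetical helper)."""
    return (f(x + h) - f(x - h)) / (2 * h)


p_k = 0.3  # assumed example probability, chosen arbitrarily

# Natural log: the derivative with respect to p_k should match 1 / p_k.
d_natural = numerical_derivative(math.log, p_k)

# Base-10 log: the derivative should match 1 / (p_k * ln(10)).
d_base10 = numerical_derivative(math.log10, p_k)

print("natural log:", d_natural, "vs 1/p_k:", 1 / p_k)
print("base-10 log:", d_base10, "vs 1/(p_k ln 10):", 1 / (p_k * math.log(10)))
```

If the loss is defined with the natural log, as it usually is for softmax cross-entropy, only the first pair is relevant, and the factor of ln(10) never appears.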