
Given the softmax function, as posted in the question "Derivative of Softmax loss function":

\begin{equation} p_j = \frac{e^{o_j}}{\sum_k e^{o_k}} \end{equation}

Its derivatives, as given in that question, are: \begin{equation} \frac{\partial p_j}{\partial o_i} = p_i(1 - p_i),\quad i = j \end{equation}

and

\begin{equation} \frac{\partial p_j}{\partial o_i} = -p_i p_j,\quad i \neq j. \end{equation}
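
As a sketch of where I would expect these to come from, assuming the quotient rule is the intended route: write $S = \sum_k e^{o_k}$ and let $\delta_{ij}$ be the Kronecker delta (1 if $i = j$, 0 otherwise). Both $S$ and $\delta_{ij}$ are my own shorthand, not notation from the linked post. Using $\partial e^{o_j}/\partial o_i = \delta_{ij} e^{o_j}$ and $\partial S/\partial o_i = e^{o_i}$:

\begin{equation} \frac{\partial p_j}{\partial o_i} = \frac{\delta_{ij}\, e^{o_j}\, S - e^{o_j}\, e^{o_i}}{S^2} = \frac{e^{o_j}}{S}\left(\delta_{ij} - \frac{e^{o_i}}{S}\right) = p_j(\delta_{ij} - p_i) \end{equation}

Setting $i = j$ gives $p_i(1 - p_i)$, and $i \neq j$ gives $-p_i p_j$, which would match the two cases above.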

I would like to understand how these derivatives are obtained, step by step; I have been stuck on this for hours. Another thing I can't understand, along similar lines, is how we arrive at this:

\begin{equation}1+\sum_{j=1}^{M-1}\exp{\{\eta_{j}\}}={\sum_{j=1}^M\exp{\{\eta_{j}\}}}\end{equation}
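
One observation, which rests on an assumption about the source rather than anything stated in it: the two sides differ only by the $j = M$ term, so the identity holds exactly when $\exp\{\eta_M\} = 1$, that is, when $\eta_M = 0$. This would be the case if the last class is treated as a reference class whose parameter is fixed at zero:

\begin{equation} \sum_{j=1}^{M}\exp\{\eta_j\} = \exp\{\eta_M\} + \sum_{j=1}^{M-1}\exp\{\eta_j\} = 1 + \sum_{j=1}^{M-1}\exp\{\eta_j\} \quad \text{if } \eta_M = 0 \end{equation}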

I would much appreciate any answers, thank you!
