I know how to calculate the derivative of the sigmoid function when the input is a scalar. How do I properly derive the derivative of the sigmoid function when the input is a matrix, i.e., using matrix calculus? The fraction (a kind of element-wise division) looks weird in there.
Here are my vague ideas (inspired by how it is implemented in code), where the fraction, the exponential, and the square are all meant element-wise: $$ \sigma(\mathbf{X}) = \frac{\mathbf{1}}{\mathbf{1} + \exp(-\mathbf{X})} $$ $$ \begin{split} \mathrm{d} \sigma(\mathbf{X}) & = \frac{- \left[ \exp(-\mathbf{X}) \odot \mathrm{d} (-\mathbf{X}) \right]}{\left( \mathbf{1} + \exp(-\mathbf{X}) \right)^2} \\ & = \frac{- \left[ \exp(-\mathbf{X}) \odot (-\mathbf{1}) \odot \mathrm{d} \mathbf{X} \right]}{\left( \mathbf{1} + \exp(-\mathbf{X}) \right)^2} \\ & = \frac{ \mathbf{1} \odot \exp(-\mathbf{X}) \odot \mathrm{d} \mathbf{X} }{\left( \mathbf{1} + \exp(-\mathbf{X}) \right)^2} \\ & = \frac{\mathbf{1}}{\mathbf{1} + \exp(-\mathbf{X})} \odot \frac{\exp(-\mathbf{X}) + \mathbf{1} - \mathbf{1} }{\mathbf{1} + \exp(-\mathbf{X})} \odot \mathrm{d} \mathbf{X} \\ & = \sigma(\mathbf{X}) \odot (\mathbf{1} - \sigma(\mathbf{X})) \odot \mathrm{d} \mathbf{X} \end{split} $$
The result matches how I would code it, yet the derivation still does not seem rigorous.
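For concreteness, here is roughly the implementation I have in mind, together with a central finite-difference check of the element-wise formula (a minimal NumPy sketch; the function names `sigmoid` and `sigmoid_grad` are my own):

```python
import numpy as np

def sigmoid(X):
    # Element-wise logistic function applied to a matrix.
    return 1.0 / (1.0 + np.exp(-X))

def sigmoid_grad(X):
    # Element-wise derivative: sigma(X) * (1 - sigma(X)),
    # i.e., the Hadamard-product formula from the derivation above.
    S = sigmoid(X)
    return S * (1.0 - S)

# Central-difference check that the formula holds entry by entry.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
eps = 1e-6
numeric = (sigmoid(X + eps) - sigmoid(X - eps)) / (2 * eps)
print(np.allclose(numeric, sigmoid_grad(X)))  # True
```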