This question is a follow-up to one of my previous questions: Optimizing a vector equation
Let $x$ and $b$ be two vectors of real numbers in $k$-dimensional space.
Let $W$ be a $k \times k$ matrix of real numbers representing a transformation.
Let $\alpha$ be a scalar value.
We are looking for the optimal $\alpha$, which minimizes the squared Mahalanobis distance between $x$ and the multivariate distribution $N(b, \Sigma)$ scaled by $\alpha$:
$$ D = (x - \alpha b)^T (\alpha W)^{-1} (x - \alpha b) $$
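For concreteness, the objective can be evaluated numerically. A minimal NumPy sketch (the function name and sample values are my own; $W$ is built symmetric positive definite so the inverse exists):

```python
import numpy as np

def mahalanobis_sq(x, b, W, alpha):
    # D(alpha) = (x - alpha*b)^T (alpha*W)^{-1} (x - alpha*b)
    r = x - alpha * b
    return float(r @ np.linalg.solve(alpha * W, r))

rng = np.random.default_rng(0)
k = 4
x = rng.standard_normal(k)
b = rng.standard_normal(k)
A = rng.standard_normal((k, k))
W = A @ A.T + k * np.eye(k)  # symmetric positive definite, so (alpha*W)^{-1} exists
print(mahalanobis_sq(x, b, W, 1.5))
```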
The squared Mahalanobis distance is defined as
$$ D = (x - \mu)^T \Sigma^{-1} (x - \mu) $$
As can be seen, in our case $\mu = \alpha b$ and $\Sigma = \alpha W$.
I took the derivative of the Mahalanobis distance, following Differentiating mahalanobis distance, and applied the chain rule to get to $\alpha$:
$$ \eqalign { \frac {dD} {d \alpha} &= \frac {dD} {d \mu} \frac {d \mu} {d \alpha} + \frac {dD} {d \Sigma} \frac {d \Sigma} {d \alpha} \cr &= -2 \Sigma^{-1}(x - \mu) b - \Sigma^{-1} (x - \mu) (x - \mu) \Sigma^{-1} W \cr &= 0 \cr 2 \Sigma^{-1}(x - \mu) b &= -\Sigma^{-1} (x - \mu) (x - \mu) \Sigma^{-1} W}$$
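Whatever the symbolic form of the derivative, a candidate stationary point can be checked numerically with a central finite difference of $D$ in $\alpha$. A sketch (function names and sample values are my own, shown here for the special case $W = I$):

```python
import numpy as np

def mahalanobis_sq(x, b, W, alpha):
    # D(alpha) = (x - alpha*b)^T (alpha*W)^{-1} (x - alpha*b)
    r = x - alpha * b
    return float(r @ np.linalg.solve(alpha * W, r))

def dD_dalpha(x, b, W, alpha, h=1e-6):
    # central finite difference: should be ~0 at a true stationary point
    return (mahalanobis_sq(x, b, W, alpha + h)
            - mahalanobis_sq(x, b, W, alpha - h)) / (2 * h)

x = np.array([1.0, 2.0, 3.0])
b = np.array([0.5, -1.0, 2.0])
print(dD_dalpha(x, b, np.eye(3), np.linalg.norm(x) / np.linalg.norm(b)))
```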
This expression is reduced by multiplying by $\Sigma$ from the left and by substituting $\Sigma = \alpha W$ into the rightmost $\Sigma^{-1} W$ factor, which leaves $\alpha^{-1}$, getting
$$ 2 (x - \mu) b = - (x - \mu) (x - \mu) \alpha^{-1} $$
Substituting $\mu = \alpha b$ on both sides and expanding the parentheses yields
$$ \eqalign { 2xb - 2 \alpha bb &= -(xx - 2 \alpha xb + \alpha^2 bb) \alpha^{-1} \cr 2xb - 2 \alpha bb &= -\alpha^{-1}xx + 2xb - \alpha bb}$$
This can be simplified by subtracting $2xb$ from both sides, adding $\alpha bb$, and multiplying by $-1$, getting
$$ \eqalign { \alpha bb &= \alpha^{-1} xx \cr \alpha^2 &= \frac {xx} {bb} = \frac {||x||^2} {||b||^2} \cr \alpha &= \frac {||x||} {||b||}}$$
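As a sanity check of this final formula in the special case $W = I$, where $D(\alpha) = ||x - \alpha b||^2 / \alpha$, a grid search over $\alpha$ can be compared against $||x||/||b||$ (the values below are arbitrary):

```python
import numpy as np

def D(x, b, alpha):
    # special case W = I:  D(alpha) = ||x - alpha*b||^2 / alpha
    r = x - alpha * b
    return float(r @ r) / alpha

rng = np.random.default_rng(1)
x = rng.standard_normal(5)
b = rng.standard_normal(5)

alpha_closed = np.linalg.norm(x) / np.linalg.norm(b)
grid = np.linspace(0.01, 10.0, 100_000)
alpha_grid = grid[np.argmin([D(x, b, a) for a in grid])]
print(alpha_closed, alpha_grid)
```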
I have a strong intuition that I went wrong somewhere. Could someone please check my calculations?