It is easy to see that $D(||x||^2)(x) = 2x^T$, where $D$ denotes the (total) derivative. The gradient is the transpose of the derivative. Also, $D(Ax - b)(x) = A$. By the chain rule, $Df(x) = 2(Ax - b)^TA$. Thus $\nabla f(x) = Df(x)^T = 2A^T(Ax - b)$.
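
As a sanity check, the formula $\nabla f(x) = 2A^T(Ax - b)$ can be compared against central finite differences. The sketch below assumes $f(x) = ||Ax - b||^2$; the values of `A`, `b`, and `x` are arbitrary illustrations rather than anything from the question, and the helper names are hypothetical.

```python
# Finite-difference check of grad f(x) = 2 A^T (A x - b) for f(x) = ||A x - b||^2.
# A, b, x are made-up illustrative values (an assumption, not from the question).

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def f(A, b, x):
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]  # residual A x - b
    return sum(ri * ri for ri in r)

def grad_f(A, b, x):
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    return [2.0 * g for g in matvec(transpose(A), r)]  # 2 A^T (A x - b)

def numeric_grad(func, x, h=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    g = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        g.append((func(xp) - func(xm)) / (2 * h))
    return g

A = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]
b = [1.0, -2.0, 0.5]
x = [0.3, -0.7]

analytic = grad_f(A, b, x)
numeric = numeric_grad(lambda v: f(A, b, v), x)
max_err = max(abs(a - n) for a, n in zip(analytic, numeric))
print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_err)
```

The two gradients should agree up to rounding error, since central differences are exact for a quadratic apart from floating-point effects.
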
To compute $Dg(x)$, it helps to first compute $D(||x||)(x)$ for $x \neq 0$. By the chain rule, \begin{align}
D(||x||)(x) &= D(\sqrt{||x||^2})(x) \\
&= \frac{1}{2}(||x||^2)^{-1/2}2x^T \\
&= \frac{1}{||x||}x^T.
\end{align}
The derivative of $g$ now follows from the chain rule: wherever $y \neq Ax$, $Dg(x) = \frac{1}{||y - Ax||}(y - Ax)^T(-A)$. So $\nabla g(x) = -\frac{1}{||y - Ax||}A^T(y - Ax)$.
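
The same kind of numerical check works for $g$. This is a minimal sketch assuming $g(x) = ||y - Ax||$, evaluated at a point where $y \neq Ax$; the data and helper names are illustrative assumptions.

```python
# Finite-difference check of grad g(x) = -A^T (y - A x) / ||y - A x||
# for g(x) = ||y - A x||. A, y, x are made-up values with y != A x.
import math

def matvec(M, v):
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def residual(A, y, x):
    return [yi - ri for yi, ri in zip(y, matvec(A, x))]  # y - A x

def g(A, y, x):
    return math.sqrt(sum(ri * ri for ri in residual(A, y, x)))

def grad_g(A, y, x):
    r = residual(A, y, x)
    norm = math.sqrt(sum(ri * ri for ri in r))  # must be nonzero
    return [-v / norm for v in matvec(transpose(A), r)]

def numeric_grad(func, x, h=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    out = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        out.append((func(xp) - func(xm)) / (2 * h))
    return out

A = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]
y = [1.0, -2.0, 0.5]
x = [0.3, -0.7]

analytic_g = grad_g(A, y, x)
numeric_g = numeric_grad(lambda v: g(A, y, v), x)
max_err_g = max(abs(a - n) for a, n in zip(analytic_g, numeric_g))
print("analytic:", analytic_g)
print("numeric: ", numeric_g)
print("max abs difference:", max_err_g)
```
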
Edit: I guess you meant $f(x) = (Ax - b)^TR^{-1}(Ax - b)$. To compute the derivative of $f$, it is convenient to first compute the derivative of $q(x) = x^TBx$, where $B$ is any matrix. For any vector $y$ we have
\begin{align}
q(x + y) &= (x + y)^TB(x + y) \\
&= (x^T + y^T)(Bx + By) \\
&= x^TBx + x^TBy + y^TBx + y^TBy \\
&= q(x) + x^TBy + (Bx)^Ty + O(||y||^2) \\
&= q(x) + x^T(B + B^T)y + O(||y||^2).
\end{align}
Hence $Dq(x) = x^T(B + B^T)$. Note that with $B = R^{-1}$, we have $f(x) = q(Ax - b)$. Hence, by the chain rule, $$Df(x) = (Ax - b)^T(R^{-1} + (R^{-1})^T)A.$$ Taking the transpose gives $$\nabla f(x) = A^T((R^{-1})^T + R^{-1})(Ax - b).$$ If $R$ is symmetric, this simplifies to $\nabla f(x) = 2A^TR^{-1}(Ax - b)$.
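
To see why the symmetrized factor $B + B^T$ matters, here is a rough numerical sketch with a deliberately non-symmetric matrix `B` standing in for $R^{-1}$ (an assumption made so that no matrix inversion is needed). It compares $A^T(B + B^T)(Ax - b)$ and the naive $2A^TB(Ax - b)$ against central finite differences; all data and helper names are illustrative.

```python
# Check grad f(x) = A^T (B + B^T) (A x - b) for f(x) = (A x - b)^T B (A x - b).
# B plays the role of R^{-1}; it is chosen non-symmetric on purpose.
# A, B, b, x are made-up illustrative values.

def matvec(M, v):
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def residual(A, b, x):
    return [ri - bi for ri, bi in zip(matvec(A, x), b)]  # A x - b

def f(A, B, b, x):
    r = residual(A, b, x)
    return sum(ri * si for ri, si in zip(r, matvec(B, r)))  # r^T B r

def grad_f(A, B, b, x):
    r = residual(A, b, x)
    sym = [u + v for u, v in zip(matvec(B, r), matvec(transpose(B), r))]
    return matvec(transpose(A), sym)  # A^T (B + B^T) r

def naive_grad(A, B, b, x):
    r = residual(A, b, x)
    return [2.0 * v for v in matvec(transpose(A), matvec(B, r))]  # 2 A^T B r

def numeric_grad(func, x, h=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    out = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        out.append((func(xp) - func(xm)) / (2 * h))
    return out

A = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]
B = [[2.0, 1.0, 0.0], [0.0, 3.0, -1.0], [0.5, 0.0, 1.0]]
b = [1.0, -2.0, 0.5]
x = [0.3, -0.7]

numeric_q = numeric_grad(lambda v: f(A, B, b, v), x)
analytic_q = grad_f(A, B, b, x)
naive_q = naive_grad(A, B, b, x)
err_sym = max(abs(a - n) for a, n in zip(analytic_q, numeric_q))
err_naive = max(abs(a - n) for a, n in zip(naive_q, numeric_q))
print("error with B + B^T:", err_sym)
print("error with 2B only:", err_naive)
```

With a non-symmetric `B`, only the symmetrized formula should match the finite-difference gradient; the two formulas coincide when `B` is symmetric.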