The gradient is just the vector pointing in the direction of steepest ascent. Put another way, it is the change in the function due to a small change in x, in the direction of x, plus the change in the function due to a small change in y, in the direction of y, plus the change in the function due to a small change in z, in the direction of z.
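In symbols, the Cartesian statement above is the usual formula. The hatted symbols for the unit vectors along x, y and z are my own notation choice:

```latex
\nabla f
  = \frac{\partial f}{\partial x}\,\hat{\mathbf{x}}
  + \frac{\partial f}{\partial y}\,\hat{\mathbf{y}}
  + \frac{\partial f}{\partial z}\,\hat{\mathbf{z}}
```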
In the spherical coordinate system, what's wrong with saying that the gradient is the change in the function due to a small change in r, in the direction of r, plus the change in the function due to a small change in theta, in the direction of theta, plus the change in the function due to a small change in phi, in the direction of phi?
Wouldn't that point in the direction of steepest ascent as well?
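To make the comparison concrete, here is the expression I am proposing written next to the standard one. I am assuming the physics convention, where theta is the polar angle measured from the z-axis and phi is the azimuthal angle. The label "naive" is mine.

```latex
% What I am proposing (same pattern as the Cartesian formula):
(\nabla f)_{\text{naive}}
  = \frac{\partial f}{\partial r}\,\hat{\mathbf{r}}
  + \frac{\partial f}{\partial \theta}\,\hat{\boldsymbol{\theta}}
  + \frac{\partial f}{\partial \phi}\,\hat{\boldsymbol{\phi}}

% The standard formula (theta polar, phi azimuthal):
\nabla f
  = \frac{\partial f}{\partial r}\,\hat{\mathbf{r}}
  + \frac{1}{r}\,\frac{\partial f}{\partial \theta}\,\hat{\boldsymbol{\theta}}
  + \frac{1}{r\sin\theta}\,\frac{\partial f}{\partial \phi}\,\hat{\boldsymbol{\phi}}
```

The only difference is the pair of factors 1/r and 1/(r sin theta) on the angular terms, and those factors are what I am asking about.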
(P.S. I know what the actual formula is, and I also understand how it's derived, but I just don't understand why it's done that way!)