Although it makes some people a bit anxious, apparently it is possible to take a derivative of one expression with respect to another, as demonstrated here. In fact, it is suggested here that an independent variable is often viewed as a function.
Working with the $\frac{d(\sin x)}{d(\cos x)}$ example, we could get the correct answer, $-\cot x$, by taking the controversial derivative notation seriously and dividing the appropriate differential 1-forms:
$$\frac{d(\sin x)}{d(\cos x)} = \frac{\cos x\ dx}{-\sin x\ dx} = \frac{\cos x}{-\sin x} = -\cot x$$
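As a quick numerical sanity check of this example, here is a minimal sketch that approximates the quotient of differentials by a ratio of small finite differences. The helper name `derivative_wrt`, the step size `h`, and the sample point `x0 = 1.0` are my own choices for illustration, not anything standard.

```python
import math

def derivative_wrt(f, g, x, h=1e-6):
    """Approximate df/dg at x as a ratio of central differences.

    Hypothetical helper: it mimics "dividing the differentials" by
    dividing the small change in f by the small change in g.
    It assumes g is not stationary at x (otherwise dg is ~0).
    """
    df = f(x + h) - f(x - h)  # small change in f
    dg = g(x + h) - g(x - h)  # small change in g
    return df / dg

x0 = 1.0  # arbitrary sample point where sin(x0) != 0
approx = derivative_wrt(math.sin, math.cos, x0)
exact = -1.0 / math.tan(x0)  # -cot(x0)

print(approx, exact)
```

The two printed values should agree to many digits at this point, which is at least consistent with reading $\frac{df}{dg}$ as $\frac{f'(x)}{g'(x)}$ wherever $g'(x) \neq 0$.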
Does this approach work in general? If so, it suggests an interesting fact about optimization: the "denominator" $dg$ in $\frac{df}{dg} = 0$ is irrelevant, since this approach implies that $\frac{df}{dg} = 0$ is equivalent to $df = 0$. What really matters for finding critical points, then, must be what one solves for algebraically after taking the derivative, not what the derivative is taken with respect to. So if you want to know what value of $x^2 - x$ optimizes $f(x)$, you don't need to find the derivative of $f(x)$ with respect to $x^2 - x$; you can instead take the derivative of $f(x)$ with respect to any $g(x)$, such as $x$, or even just take the differential of $f(x)$, and solve the result for $x^2 - x$. Is this observation correct?
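To make the question concrete, here is a small sketch that probes it on one test case. The function $f(x) = u^2 - 4u$ with $u = x^2 - x$ is a made-up example of mine, and `central_diff` is a hypothetical helper; this is only an experiment under those assumptions, not a proof either way.

```python
def u(x):
    """The quantity we want to optimize over: u = x^2 - x."""
    return x * x - x

def f(x):
    """Made-up test function that depends on x only through u."""
    return u(x) ** 2 - 4.0 * u(x)

def central_diff(fn, x, h=1e-6):
    """Central-difference approximation of d(fn)/dx at x."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

# Inspect two points where df/dx vanishes.
for x0 in (2.0, 0.5):
    df = central_diff(f, x0)  # numerator of df/du
    du = central_diff(u, x0)  # denominator of df/du
    print(f"x = {x0}: u = {u(x0)}, df/dx = {df:.3e}, du/dx = {du:.3e}")
```

In this example, solving $\frac{df}{dx} = 0$ for $u$ gives $u = 2$, the same value that $\frac{df}{du} = 0$ gives. It also turns up $x = \frac{1}{2}$, where $\frac{du}{dx}$ vanishes as well, so the quotient $\frac{df}{du}$ takes the form $\frac{0}{0}$ there. That suggests the points where $dg = 0$ may need separate treatment.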