
I've recently been going back to the basics, and I realized I was never taught the definition of (total) differentiability for multivariable functions.

Instead, I was simply handed a statement for what the total derivative is, and we ran from there.

My goal is to connect the more abstract definition of differentiability to the common statement of the total derivative that we typically see in introductory multivariable calculus courses.


To get started, we need to work with the definition of differentiability.

A function $f:\mathbb{R}^n\to\mathbb{R}^m$ is (totally) differentiable at a point $a$ if there exists a linear transformation $L:\mathbb{R}^n\to\mathbb{R}^m$ such that

$$\lim_{h\to 0}\frac{\|f(a+h)-f(a)-L(h)\|}{\|h\|}=0.$$

Everything centres around the total derivative, which is a linear transformation.

Given this definition of differentiability, I was delighted to see how the multivariable chain rule falls out quite nicely (via a proof similar to this one). Until now, I had never been given a formal proof of the multivariable version, yet I had used it my whole life.
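For reference, here is the statement that falls out, written with $Df(a)$ for the total derivative of $f$ at $a$ (this is the standard formulation, not the linked proof itself): if $f$ is differentiable at $a$ and $g$ is differentiable at $f(a)$, then

$$D(g\circ f)(a) = Dg(f(a))\circ Df(a),$$

i.e. the derivative of a composition is the composition of the derivatives, which in standard coordinates becomes a product of Jacobian matrices.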

I was also able to see that this linear transformation is unique.


For me, the last piece of the puzzle that I haven't quite verified is that the Jacobian is necessarily this linear transformation (in standard coordinates), if such a linear transformation exists (i.e. if the function is totally differentiable).

$$J_f(a)=\begin{bmatrix}\dfrac{\partial f_1}{\partial x_1}(a)&\cdots&\dfrac{\partial f_1}{\partial x_n}(a)\\ \vdots&\ddots&\vdots\\ \dfrac{\partial f_m}{\partial x_1}(a)&\cdots&\dfrac{\partial f_m}{\partial x_n}(a)\end{bmatrix}$$

In fact, this is where courses usually start: they provide this as the definition of the total derivative, rather than starting from the abstract definition and proving that the derivative must equal the Jacobian when it exists. Even this Wikipedia article takes it as a starting point, and even uses words like "best linear approximation", which is not how differentiability is really characterized*.

*Actually, I can't say anything about characterization. What I do know from my own experience is that the words "best" and "approximation" can be very confusing without rigorous definitions. I now appreciate this wording better.


So how do I prove that if a function is totally differentiable, then the linear transformation must be its Jacobian?

Here is my attempt, but I would love feedback:

  • To prove this, I was thinking of applying logic similar to this answer, which treats the single-variable case but reveals a strategy we can reuse
  • To simplify the proof, assume f has scalar outputs; otherwise we can apply the argument component-wise
  • Likewise, it suffices to prove this for two variables; more variables work the same way
  • The first step is to reduce the problem of determining the linear transformation to one coordinate at a time
  • By changing only one variable at a time, we obtain limits we can handle with one-dimensional calculus, and these limits are exactly the definitions of the partial derivatives
  • This forces each entry of the linear map to be a partial derivative of f, which in turn forces the map to equal the Jacobian
  • No other transformation can satisfy these one-dimensional equations, so it is unique (though we already knew the map had to be unique)
  • Then my work is done: if there exists a linear map satisfying the definition of total differentiability, it must be the Jacobian, because the one-dimensional cases must also be satisfied
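The key computation behind these bullets can be written out explicitly (a sketch of the standard argument, stated for general $f:\mathbb{R}^n\to\mathbb{R}^m$). Plug $h = t\,e_j$, a step of size $t$ along the $j$-th coordinate axis, into the definition of differentiability:

$$\lim_{t\to 0}\frac{\|f(a+t e_j)-f(a)-L(t e_j)\|}{|t|}=0.$$

Since $L$ is linear, $L(t e_j)=t\,L(e_j)$, so this is equivalent to

$$L(e_j)=\lim_{t\to 0}\frac{f(a+t e_j)-f(a)}{t}=\frac{\partial f}{\partial x_j}(a).$$

Thus the $j$-th column of the matrix of $L$ in standard coordinates is the vector of $j$-th partial derivatives of the components of $f$; that is, the matrix of $L$ is exactly the Jacobian.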

However, this almost feels like an accident rather than a proof. Am I missing something?

What spooks me is that we are already out of information (every entry of the total derivative has been pinned down), yet we have only tested one direction at a time! If we were to make an arbitrary change in an arbitrary direction, how would we then bound the errors? Simply put, we cannot, from the Jacobian alone. That is a task beyond the partial derivatives, and all we have established here is that if the function does turn out to be differentiable (through other analysis), then its total derivative must equal the Jacobian.


Whatever the proper proof may be, I have some closing thoughts. I feel that the commonly said phrase "best linear approximation" for the Jacobian can quite easily confuse students.

To my knowledge, the Jacobian is the only linear transformation that can satisfy the limiting properties of the error term required by the definition of differentiability. In that sense, it is the best.

So the meaning of "best" depends on the definition of the derivative, which was a very good definition given all of the wonderful results that follow (even before we relate the total derivative to the Jacobian), such as the chain rule. A different approach to analyzing functions might end up with other conclusions, but I now appreciate just how powerful and special the differentiability approach is.

What was a missing piece for me is that the multivariable derivative ends up completely determined by the partial derivatives, thanks to the one-dimensional sub-cases. Altogether, it feels like we got lucky that things worked out given these restrictive sub-cases, and I'm sure pathologies result from this subtlety.
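A minimal numeric sketch of one such pathology (the textbook example $f(x,y)=xy/(x^2+y^2)$, not anything from this post): both partials exist at the origin, so the only candidate Jacobian there is $[0\;\;0]$, yet $f$ is not differentiable at the origin, because it is not even continuous there.

```python
def f(x, y):
    # Classic pathology: both partials exist at the origin,
    # yet f is not differentiable (not even continuous) there.
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x**2 + y**2)

# Partials at the origin via difference quotients: along each
# coordinate axis f is identically 0, so both partials are 0.
h = 1e-8
dfdx = (f(h, 0.0) - f(0.0, 0.0)) / h   # -> 0.0
dfdy = (f(0.0, h) - f(0.0, 0.0)) / h   # -> 0.0

# But approaching along the diagonal, f(t, t) = 1/2 for every t != 0,
# so the error f(a+h) - f(a) - J*h cannot vanish faster than |h|.
for t in [1e-1, 1e-4, 1e-8]:
    print(f(t, t))   # prints 0.5 each time, never approaching 0
```

So the Jacobian exists here as a matrix of partials, but it fails the limiting property in the definition: the one-dimensional tests alone were not enough.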

With this connection made, we can then do things like this to prove whether a multivariable function is indeed differentiable. Notice how this article actually flipped the order of things: it starts with the partial-derivative-based definition of the total differential, and then defines differentiability around that. In either case, knowing how partial derivatives relate to the total derivative through the definition of differentiability gives us a wonderful toolkit for determining whether multivariable functions are differentiable.
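For completeness, the standard sufficient condition that makes this toolkit work (stated from memory, in the usual form): if every partial derivative $\partial f_i/\partial x_j$ exists in a neighborhood of $a$ and is continuous at $a$, then $f$ is differentiable at $a$, with total derivative equal to the Jacobian. The converse fails: a function can be differentiable at a point without having continuous partials there.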

1 Answer


Edit: I found a full exposition here, done from scratch. Thanks all!

http://www.math.toronto.edu/courses/mat237y1/20199/notes/Chapter2/S2.1.html

Definition of Differentiability in Multivariable Calculus

Differentiability Implies Continuity

Definition of Partial Derivatives

Theorem on Connection Between Partials and Derivative

Straightforward Application of Theorem 2

Finally, a Highly Useful Theorem from Another Direction