
In differential geometry we identify the operator $\partial/\partial x$ with a vector and $dx$ with a 1-form. We're generally taught that a vector is a directed line segment (at least, in the simplest conception, though obviously the term ``vector'' can apply to other things as we get more general), and a 1-form can be imagined as a set of lines like contour lines. So a vector has a magnitude measured as a length, and a 1-form has a magnitude measured as a density.

And here's where my intuition rebels... It seems to me that even though $dx$ is not simply a ``very small'' version of $\Delta x = x_2 - x_1$, it is similar enough that it should be a kind of length, just as $\Delta x$ is. And likewise $\partial/\partial x$ should be a kind of density, just as the spacing of contour lines is. After all, if $g$ is a scalar then $\partial g/\partial x$ tells us how much change in $g$ is crammed into an infinitesimal displacement along the $x$-axis. But by this reasoning $dx$ should be a vector and $\partial/\partial x$ should be a 1-form.

So why are these quantities identified in the opposite way to what makes intuitive sense to me? Is there a reason that I'm overlooking, or is it just arbitrary which one you call a vector and which you call a 1-form and once-upon-a-time someone decided on a convention that makes my head hurt? Every discussion of this that I've looked up seems to become a matter of circular reasoning (e.g. you can argue in terms of one choice transforming contravariantly with respect to the basis vectors, and the other transforming covariantly, but this ultimately depends on defining the basis vectors to be of the form $\partial/\partial x$ in the first place.)

Thanks in advance

  • TL;DR: Here is an answer to this classic question. If I did not get what you are asking, please add more focus. – Kurt G. Oct 27 '22 at 08:50
  • I think the answer is that $\partial/\partial x_i$ and its sisters are more naturally associated to the tangent vector at a curve $t\mapsto\gamma(t)$. Just take the total derivative of $f(\gamma(t))$ to see this. The one-form $dx_j$ is dual to all the $\partial/\partial x_i$, so if you want to imagine $dx_j$ as a vector, why not? – Kurt G. Oct 27 '22 at 09:25
  • I strongly recommend Introduction to Smooth Manifolds, especially the chapter on the cotangent bundle (chapter $11$, if I remember correctly). Lee explains that, given a function $f$ from your manifold to $\mathbb{R}$, it doesn't really make sense to treat the tuple $(\dfrac{\partial f}{\partial x_1}, ..., \dfrac{\partial f}{\partial x_n})$ (which is our usual gradient in $\mathbb{R}^n$) as a vector, because it doesn't really behave like one. This is how he motivates the definition for the differential $df$; he then goes on to give a geometric interpretation of how $1$-forms work using integrals. – Azur Oct 27 '22 at 12:05
  • Welcome to Math.SE! There are a number of questions and answers on this site concerning the geometric interpretation of vector fields and one-forms, their transformation properties under change of coordinates versus the ways velocity and momentum transform (which sounds like your question), and so forth. It seems likely at least some of these duplicate your question. If not, then because one goal of MSE is to build a non-redundant database of Q/As, it would help if you could link to similar questions and answers that do not fully answer your question and indicate why. – Andrew D. Hwang Oct 27 '22 at 12:18

2 Answers


Let’s look at the object $\Delta x$ more closely. It tells us the change in position. Slightly more explicitly, what does this mean? It means you take as input two points $p,q\in\Bbb{R}^n$ and you output their difference $q-p\in\Bbb{R}^n$. So, this is a mapping $\Delta x:\Bbb{R}^n\times\Bbb{R}^n\to\Bbb{R}^n$. Similarly, you can consider $\Delta x^i$ which is supposed to give the change in $i^{th}$ coordinate of two different points. What kind of object is this? It is a function $\Delta x^i:\Bbb{R}^n\times\Bbb{R}^n\to\Bbb{R}$, $(\Delta x^i)(p,q)= q^i-p^i$.

So, $\Delta x$ by itself is not a vector, and $\Delta x^i$ by itself is not a number. Rather, it is only once we fix two points $p,q$ that the outputs $(\Delta x)(p,q)=q-p$ and $(\Delta x^i)(p,q)=q^i-p^i$ are a vector and a number, respectively.

Slightly altering the perspective, rather than thinking of $\Delta x$ as a function of two vector variables that outputs a vector, i.e. a map $\Bbb{R}^n\times\Bbb{R}^n\to\Bbb{R}^n$, we can think of it as a map $\Bbb{R}^n\to (\Bbb{R}^n\to\Bbb{R}^n)$, defined as $p\mapsto (\Delta x)(p)\equiv (\Delta x)_p = [q\mapsto (\Delta x)_p(q):= q-p]$. That is, rather than a function of two variables, you think of it as a function of one variable which outputs a function, that output itself being a function of one variable. Said another way, you’re freezing one of the inputs: $p\mapsto \Delta x(p,\cdot)$.

We can do the same thing with $\Delta x^i$. This shall be an object which first of all eats a point $p$ of interest; this gives us the object $(\Delta x^i)_p$. However, this is not yet a real number! We must feed it another input $q$, and only then do we get the real number $(\Delta x^i)_p(q)=q^i-p^i$, which is the difference in the $i^{th}$ coordinates of the two points. So, $\Delta x^i$ is the mapping $p\mapsto (\Delta x^i)_p$, but $(\Delta x^i)_p$ is itself a function. The confusion you’re having is because you’re conflating functions (which are ‘rules’) with function values (which are elements of the target space), and that’s why you’re thinking $\Delta x$ is a vector and $\Delta x^i$ is a number, when in reality, once you write the arguments explicitly, it becomes clear that they are maps of certain types.
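The distinction between a function and its values can be made concrete in code. The following is only a minimal sketch, assuming points of $\Bbb{R}^n$ are represented as tuples of floats; the names `delta_x` and `delta_x_i` are hypothetical and chosen just for this illustration.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]  # assumption: a point of R^n is a tuple of floats


def delta_x(p: Point) -> Callable[[Point], List[float]]:
    """Freeze the base point p and return the map q -> q - p."""
    def at_p(q: Point) -> List[float]:
        return [qi - pi for qi, pi in zip(q, p)]
    return at_p


def delta_x_i(i: int) -> Callable[[Point], Callable[[Point], float]]:
    """The i-th coordinate difference: first eats p, then eats q."""
    def freeze(p: Point) -> Callable[[Point], float]:
        def at_p(q: Point) -> float:
            return q[i] - p[i]
        return at_p
    return freeze


p = (1.0, 2.0, 3.0)
q = (4.0, 6.0, 3.5)

dx_at_p = delta_x(p)          # still a function, not yet a vector
print(callable(dx_at_p))      # True
print(dx_at_p(q))             # [3.0, 4.0, 0.5]
print(delta_x_i(1)(p)(q))     # 4.0
```

Here `delta_x(p)` is a rule waiting for a second input; only `delta_x(p)(q)` is an element of $\Bbb{R}^n$. Indices in the code are zero-based, so `delta_x_i(1)` picks out the second coordinate.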

So, by this logic, the symbol $dx$ (on $\Bbb{R}^n$) can be thought of as an $\Bbb{R}^n$-valued $1$-form, and $dx^i$ is a usual $1$-form. The only conceptual leap in going from $\Delta x$ or $\Delta x^i$ to $dx$ or $dx^i$ is that in the latter case, they take in a point $p$ as before to give $(dx)_p$ or $(dx^i)_p$, but now they must be fed a tangent vector $\xi\in T_pM$ and only then do they produce a vector/number respectively. The whole ‘infinitesimalness’ of $dx$ or $dx^i$ is now reflected in the fact that the second argument is no longer a point $q\in\Bbb{R}^n$ as above, but rather a tangent vector $\xi\in T_pM$ (where $M$ is the underlying manifold… you can take it to be $\Bbb{R}^n$ if you wish).

Therefore, $dx^i$ really should be a differential 1-form, especially after considering the role played by $\Delta x^i$.
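As a small worked check of this, assume $M=\Bbb{R}^n$ (so that $p+\varepsilon\xi$ makes sense as a point) and identify a tangent vector $\xi\in T_p\Bbb{R}^n$ with its components $(\xi^1,\dots,\xi^n)$. Feeding $(\Delta x^i)_p$ the nearby point $q=p+\varepsilon\xi$ and rescaling by $\varepsilon$ gives one way to see what $(dx^i)_p$ does to $\xi$:

$$ (dx^i)_p(\xi) = \lim_{\varepsilon\to 0}\frac{(\Delta x^i)_p(p+\varepsilon\xi)}{\varepsilon} = \lim_{\varepsilon\to 0}\frac{(p^i+\varepsilon\xi^i)-p^i}{\varepsilon} = \xi^i. $$

So $(dx^i)_p$ is the linear map that reads off the $i^{th}$ component of a tangent vector at $p$, which agrees with the usual definition $(dx^i)_p(\xi)=\xi(x^i)$.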


Conversely, derivatives $\frac{\partial}{\partial x^i}$ are meant to capture information about how a given function $f$ changes in the $x^i$ direction, i.e. differential operators contain information about direction, and so they are better thought of as describing tangent vectors (though I personally prefer the definition as an equivalence class of curves, since that’s much more pictorial).
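To spell out the ‘direction’ content in the simplest case, assume again $M=\Bbb{R}^n$ and write $e_i$ for the $i^{th}$ standard basis vector (notation introduced here only for illustration). Then $\frac{\partial}{\partial x^i}$ at $p$ is differentiation along the straight-line curve $t\mapsto p+te_i$:

$$ \frac{\partial}{\partial x^i}\bigg|_p f = \frac{d}{dt}\bigg|_{t=0} f(p+te_i), \qquad (dx^j)_p\!\left(\frac{\partial}{\partial x^i}\bigg|_p\right) = \frac{\partial x^j}{\partial x^i}(p) = \delta^j_i. $$

The first formula shows the operator is determined by a curve through $p$ (a direction), and the second shows that the $1$-forms $dx^j$ are exactly the maps that read off its components.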

peek-a-boo
  • Having spent a day or so digesting this answer I cannot thank you enough. You are completely right when you said I was conflating functions with function values. And I wasn't realising it. This is a beautiful answer! Thank you for writing it out and clarifying the essence of the problem. Thank you especially for dealing with both $dx$ and $\partial/\partial x$ and their identification as mappings. You've cleared up something that's been driving me nuts for a long time. Thank you thank you thank you! – StarWombat Oct 31 '22 at 03:05
  • @StarWombat I’m glad this was helpful (and likewise this is why derivatives of functions $df$ are described as covectors). And yes functions vs function values is something which is unfortunately always mixed up (in multiple areas of math and physics), because often people who understand the concepts find it so obvious they feel it’s unnecessary to elaborate. – peek-a-boo Oct 31 '22 at 03:27

Here is a heuristic version of my understanding of the subject.

In differential geometry, vectors are designed to capture the notion of tangent vectors. That is, they represent the rescaled version of infinitesimal displacements.

To see this, suppose we are given an embedded $d$-manifold $M$ (in $\mathbb{R}^n$), and let $p$ be a point in $M$. Then heuristically, the space $T_p(M)$ of all tangent vectors to $M$ at $p$ is the limit of rescaled displacements:

$$ T_p(M) \mathrel{“=”} \lim_{\varepsilon \to 0}\left\{ \frac{p' - p}{\varepsilon} : \text{$p' \in M$ is within $\mathcal{O}(\varepsilon)$-distance to $p$} \right\} $$

In other words, if we zoom in on $M$ around the point $p$, then $M$ will look like a hyperplane which we call the tangent space to $M$ at $p$. (This heuristic is spiritually the same as the rigorous definition via equivalence classes of 'velocity vectors' at $p$.)

The heuristic definition above, however, makes use of the embedding of $M$, which is not ideal for developing a general language of manifolds. Instead, we want a description of $T_p(M)$ that does not rely on embeddings. To this end, assume we are still working with an embedded manifold $M$, and let $f$ be a smooth function defined near $p$. If $\xi = (p' - p) / \varepsilon$ is an (approximate) tangent vector, then the change in value of $f$ from $p$ to $p'$ is

$$ \Delta f = f(p') - f(p) = f(p + \varepsilon \xi) - f(p) \approx \varepsilon \xi \cdot \nabla f. $$

This way, a tangent vector $\xi$ can be identified with the differential operator $\xi \cdot \nabla$. And the good thing is that the notion of a differential operator can be defined on any smooth manifold. (Said differently, we are identifying an infinitesimal displacement $\varepsilon \xi$ with the infinitesimal translation along $\varepsilon \xi$.)
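The approximation $\Delta f \approx \varepsilon\, \xi \cdot \nabla f$ can be checked numerically. The following is a minimal sketch under simplifying assumptions: $M$ is taken to be the plane $\mathbb{R}^2$ itself (so every displacement is tangent), and the test function `f`, the point `p`, and the vector `xi` are arbitrary choices made for illustration, not taken from the answer.

```python
import math


def f(x: float, y: float) -> float:
    """An arbitrary smooth test function on R^2 (illustrative choice)."""
    return math.sin(x) * y ** 2


def grad_f(x: float, y: float) -> tuple:
    """Gradient of f, computed by hand."""
    return (math.cos(x) * y ** 2, 2.0 * math.sin(x) * y)


def rescaled_change(p: tuple, xi: tuple, eps: float) -> float:
    """(f(p + eps*xi) - f(p)) / eps, i.e. Delta f divided by eps."""
    px, py = p
    return (f(px + eps * xi[0], py + eps * xi[1]) - f(px, py)) / eps


def directional_derivative(p: tuple, xi: tuple) -> float:
    """The operator xi . grad applied to f at p."""
    gx, gy = grad_f(*p)
    return xi[0] * gx + xi[1] * gy


p = (0.3, 1.5)
xi = (2.0, -1.0)
exact = directional_derivative(p, xi)

# The rescaled change should approach (xi . grad f)(p) as eps shrinks.
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = rescaled_change(p, xi, eps)
    print(f"eps={eps:g}  rescaled change={approx:.6f}  error={abs(approx - exact):.2e}")
```

The printed error shrinks roughly in proportion to $\varepsilon$, which is the sense in which the rescaled displacement $\xi$ and the operator $\xi\cdot\nabla$ carry the same information.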

I think that the formal definition of tangent vector is confusing you (and was confusing me as well) because it generalizes the operator $\xi \cdot \nabla$, where in fact the "true" tangent vector is $\xi$.

Sangchul Lee
  • Great explanation. Minor typo: with this point of view, tangent vectors do not live in a hyperplane, but in a linear subspace of dimension $d$ in $\Bbb R^n$. – Didier Oct 27 '22 at 14:53
  • Thank you for the concise and clear answer. I appreciate it; however, I feel like it actually reinforces my existing perspective. Because you identify $\xi = (p'-p)/\epsilon$ as the tangent vector, which makes sense to me because it is an approximation to the tangent vector to the path between the points $p'$ and $p$. But to me, the natural notation would be to say $\xi \equiv \Delta p$, which approximates $dp$ better and better as $p'$ approaches $p$. And you say $\xi$ is the ``true'' tangent vector. So it still seems natural to me to call $\nabla$ a density (form), and $dx$ a vector. – StarWombat Oct 29 '22 at 12:07