What is a pullback of a metric, and how does it work?

Question

The term "metric" is familiar, but not the idea of a pullback on it. I have tried to find intuitive, beginner-friendly explanations of this concept without success. Your attempts would be appreciated. Pictures and concrete examples would be wonderful, if possible.

I have not studied much topology or differential geometry before, but know some really early engineering/physics math (linear algebra, multivariate and vector calculus etc.) Analogies to these areas would be great.

In do Carmo's book on differential forms there is a very illuminating example: pullback is changing variables. So, for example, if you have $dx$ and you write $x=r\cos t$, $y=r\sin t$ and $dx=\cos t\, dr - r\sin t\, dt$, you are doing a pullback. (This is not a metric, but the concept is the same.) — Giuseppe Negro, Jul 12 '18 at 20:49
Is your question about Riemannian metrics, or is it about distance functions? — Giuseppe Negro, Jul 12 '18 at 21:23
It was intended to be about Riemannian metrics, but I may be too ignorant to judge how important distance functions are to the topic. Thanks for the illuminating (indeed!) comment. — Tensor McTensorstein, Jul 12 '18 at 21:30

score 17 · Accepted Answer · answered Jul 12 '18 at 21:08

Suppose that you have two spaces $X$ and $Y,$ a metric $d$ on $Y$, and a function $f : X \to Y.$ The pullback metric is the following metric on $X$: $$(f^*d)(x^{(1)}, x^{(2)}) = d(f(x^{(1)}), f(x^{(2)})); \quad x^{(1)}, x^{(2)} \in X$$

Thus, we define a metric on $X$ by mapping points over to $Y$ and taking the distance there.

One example is given by considering different coordinate systems. Let $E = \mathbb R^2$ be the plane with ordinary Euclidean distance, $$d_E((x_1, y_1), (x_2, y_2)) = \sqrt{(x_1-x_2)^2 + (y_1-y_2)^2}.$$

Let $P = [0, \infty) \times [0, 2\pi),$ the domain of polar coordinates. Define $f : P \to E$ by $f(r, \theta) = (r \cos\theta, r \sin\theta).$ You probably recognize this as the mapping from polar coordinates to Cartesian coordinates. The pullback metric $d_P := f^*d_E$ is then $$d_P((r_1, \theta_1), (r_2, \theta_2)) = \sqrt{(r_1 \cos\theta_1 - r_2 \cos\theta_2)^2 + (r_1 \sin\theta_1 - r_2 \sin\theta_2)^2}.$$

This is the distance between two points in a plane, given in polar coordinates.
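
A minimal Python sketch of this construction may make it concrete. The names `pullback`, `d_E`, `polar_to_cartesian` and `d_P` are hypothetical helpers chosen for this illustration, and points are assumed to be plain tuples of floats.

```python
import math

def pullback(d, f):
    """Return the pullback f^*d: map both points with f, then measure with d."""
    return lambda a, b: d(f(a), f(b))

def d_E(p, q):
    """Ordinary Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def polar_to_cartesian(point):
    """The map f : P -> E, (r, theta) |-> (r cos(theta), r sin(theta))."""
    r, theta = point
    return (r * math.cos(theta), r * math.sin(theta))

# The pullback distance on the polar domain P.
d_P = pullback(d_E, polar_to_cartesian)

# (r, theta) = (1, 0) and (1, pi/2) are the Cartesian points (1, 0) and (0, 1),
# so the printed value should be close to sqrt(2).
print(d_P((1.0, 0.0), (1.0, math.pi / 2)))
```

Nothing in `pullback` is specific to polar coordinates: any map `f` into a space with a distance function can be plugged in the same way.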

I believe that it is called pullback since we pull the metric from the codomain of $f$ back to the domain of $f$.

I think we should also impose the condition that $f$ is one-to-one. — Praphulla Koushik, Oct 27 '19 at 09:15
The didactic value of this explanation is without parallel. Very well done! — starseed_trooper, Apr 19 '23 at 03:22

Rolf Alien · Answer 2 · 2023-11-01T14:53:08.923

This post is over 4 years old by now, so I'm guessing the OP has moved on with their life. But seeing as this post was the first result when I googled "pullback metric", and it has amassed over 8k views, I figure an answer may still be useful to someone else.

Both answers, including the accepted one, seem to talk about metrics as in distance functions, whereas the OP seems to have intended the question to be about Riemannian metrics. First of all, if you're unaware, these are two completely different notions, that are unfortunately both commonly called "metrics". A Riemannian metric is a choice of inner product on each tangent space of a smooth manifold. A distance function is a measure of distance in a set. A Riemannian metric in fact induces a distance function on a manifold (just like an inner product induces a norm on a vector space), but they are not at all the same thing (just like a norm and an inner product are not).

To be more precise, a distance function on a set $X$ is a map $d : X\times X \to \mathbb{R}$ that is symmetric ($d(x,y) = d(y,x)$), nonnegative ($d(x,y) \geq 0$, with equality iff $x = y$), and satisfies the triangle inequality $d(x,z) \leq d(x,y) + d(y,z)$.

A Riemannian metric on the other hand is a covariant 2-tensor field on a smooth manifold $M$. That is, it is an element $g$ of the space of sections $\Gamma(T^{(0,2)}M)$ of the bundle $T^{(0,2)}M$ (where the $(0,2)$ stands for "twice covariant"). Pointwise, this means that the metric $g\in\Gamma(T^{(0,2)}M)$ at each point is a bilinear map $$g_p : T_pM\times T_pM \to \mathbb{R}.$$ (To be more precise, $g$ can be thought of as acting on points $p$ and returning a covariant 2-tensor (multilinear map) $g_p$, or it can be thought of as acting on pairs of vector fields, with the pointwise action given by $g_p$.) At each point $p$, the map $g_p$ is assumed to be an inner product on the tangent space. That is, $g_p$ is symmetric ($g_p(X,Y) = g_p(Y,X)$), nonnegative ($g_p(X,X) \geq 0$, with equality iff $X = 0$), and, as previously mentioned, linear in both arguments.

If the nonnegativity requirement is dropped, and we instead only assume that $g$ is nondegenerate ($g(X,Y) = 0$ for all $Y$ implies that $X = 0$), then we call the metric pseudo-Riemannian. For example, Lorentzian metrics (of which the Minkowski metric from special relativity is an example) are pseudo-Riemannian, because light-like vectors have norm 0, and time-like vectors have negative norm (this, indeed, defines them). Everything I say here works out the same for pseudo-Riemannian metrics.

From now on, I will use "metric" to mean "Riemannian metric". So, back to the question: If $F : M\to N$ is a smooth map between manifolds, and $N$ carries a metric, how does $F$ induce a metric on $M$?

Intuition

Let $M$ and $N$ be smooth manifolds, and suppose $F : M \rightarrow N$ is a smooth map. The idea is that if $N$ has some "geometry" (in the form of a Riemannian metric), we can pull back that geometry via $F$ to define a geometry on $M$. A good example to keep in mind here is when $N$ is $\mathbb{R}^3$, $M$ is the unit sphere

$$\mathbb{S}^2 = \{(x,y,z)\in\mathbb{R}^3 : x^2 + y^2 + z^2 = 1\},$$

and $F$ is the inclusion map $\iota : \mathbb{S}^2 \hookrightarrow \mathbb{R}^3$. (The inclusion map $\iota$ maps each point on the sphere to itself, but considered as a point in $\mathbb{R}^3$ rather than $\mathbb{S}^2$.) The ambient space $\mathbb{R}^3$ has a standard geometry given by the Euclidean Riemannian metric

$$g_{\mathbb{R}^3} = \mathrm{d}x^2 + \mathrm{d}y^2 + \mathrm{d}z^2.$$

Or in components with respect to the identity chart on $\mathbb{R}^3$:

$$(g_{\mathbb{R}^3})_{\alpha\beta} = \delta_{\alpha\beta}.$$

The geometric idea is that the sphere does not a priori carry a geometry, considered only as a smooth manifold (we can rescale and distort it quite a lot without doing damage to the smooth structure). But considered as a submanifold of $\mathbb{R}^3$ we think of the sphere as quite rigid. This is because the geometry of the ambient space $\mathbb{R}^3$ induces a geometry on $\mathbb{S}^2$. If we consider the coordinate axes in $\mathbb{R}^3$ as rigid, then the sphere will also be rigid inside it. In other words, it "inherits" a geometry from $\mathbb{R}^3$. This is formalised as saying that $\mathbb{S}^2$ carries the pullback metric induced by the metric on $\mathbb{R}^3$ and the inclusion map $\iota : \mathbb{S}^2 \hookrightarrow \mathbb{R}^3$.

In the general case, think of it as though the map $F : M\rightarrow N$ yields an "image" of $M$ inside of $N$, and we can pull back the geometry on $N$ via $F$ to yield a geometry on $M$.

Definition

Formally, if $g$ is a Riemannian metric on $N$, we define the pullback metric $F^*g$ (pointwise) on $M$ by the formula

$$(F^*g)_p(X,Y) = g_{F(p)}(\mathrm{d}F_p(X),\mathrm{d}F_p(Y)).$$

Here $p\in M$ is a point, $X,Y\in T_pM$, and $\mathrm{d}F_p : T_p M\rightarrow T_{F(p)} N$ is the differential (aka pushforward) of $F$. As in general in differential geometry, it is a good idea to check that everything in this formula acts on something it is allowed to act on. It is also a good exercise to check that this indeed defines a Riemannian metric on $M$.

Edit: While this formula always gives something well-defined, the result is not guaranteed to actually be a Riemannian metric even if $g$ is, unless we require that $\mathrm{d} F_p$ is injective. If $\mathrm{d}F_p$ is noninjective, then $\mathrm{d}F_p(X) = 0$ for some $X\neq 0$, and subsequently $F^*g(X,X) = 0$, violating the positivity requirement. Thank you to @ctst for pointing this out!
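
A small worked example of this failure, chosen here purely for illustration: take $F : \mathbb{R}^2 \to \mathbb{R}$, $F(x,y) = x$, with the standard metric $g = \mathrm{d}t^2$ on $\mathbb{R}$. The differential $\mathrm{d}F_p$ sends $\partial_x \mapsto \partial_t$ and $\partial_y \mapsto 0$, so it is not injective, and

$$F^*g = \mathrm{d}x^2, \qquad (F^*g)(\partial_y, \partial_y) = g(0, 0) = 0 \quad \text{although } \partial_y \neq 0.$$

The pullback is still a symmetric, nonnegative bilinear form at each point, but it is degenerate in the $y$-direction and hence not a Riemannian metric on $\mathbb{R}^2$.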

A formula for the pullback metric

Suppose we choose local coordinate charts $\xi : U\to\mathbb{R}^m$ and $\phi : V\to\mathbb{R}^n$ on $M$ and $N$ respectively. In our example of the sphere inside $\mathbb{R}^3$, we could for example take $\phi = \mathrm{Id}$ to be the identity chart on $\mathbb{R}^3$, and $\xi$ to be spherical coordinates:

$$\xi^{-1}(\theta,\varphi) = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta).$$

(Note that we can give this map explicitly because $\mathbb{S}^2$ is defined as a subset of $\mathbb{R}^3$. The points in the target of $\xi^{-1}$ are the actual points on the manifold $\mathbb{S}^2$. In general such formulas are very hard to come by -- instead one settles for working in coordinates.)

The general formula (that you will find in many physics textbooks) is then

$$ (F^*g)_{\mu\nu} = g_{\alpha\beta}\frac{\partial \phi^\alpha}{\partial \xi^\mu}\frac{\partial \phi^\beta}{\partial \xi^\nu}.$$

Einstein summation is implied. In our example, this means that we need to differentiate the formulas for spherical coordinates and plug them into the formula, with $g_{\alpha\beta} = \delta_{\alpha\beta}$. A calculation gives

$$g_{\mathbb{S}^2} = (\iota^*g_{\mathbb{R}^3}) = \mathrm{d}\theta^2 + \sin^2\theta\mathrm{d}\varphi^2.$$
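
For readers who want to see the calculation spelled out, here is one way it can go, using the parametrization $\xi^{-1}$ above as the coordinate expression of $\iota$. The partial derivatives are

$$\frac{\partial \xi^{-1}}{\partial\theta} = (\cos\theta\cos\varphi,\ \cos\theta\sin\varphi,\ -\sin\theta), \qquad \frac{\partial \xi^{-1}}{\partial\varphi} = (-\sin\theta\sin\varphi,\ \sin\theta\cos\varphi,\ 0).$$

Since $g_{\alpha\beta} = \delta_{\alpha\beta}$, the components of the pullback are the Euclidean dot products of these vectors:

$$(\iota^*g)_{\theta\theta} = \cos^2\theta\cos^2\varphi + \cos^2\theta\sin^2\varphi + \sin^2\theta = 1,$$

$$(\iota^*g)_{\theta\varphi} = -\sin\theta\cos\theta\sin\varphi\cos\varphi + \sin\theta\cos\theta\sin\varphi\cos\varphi + 0 = 0,$$

$$(\iota^*g)_{\varphi\varphi} = \sin^2\theta\sin^2\varphi + \sin^2\theta\cos^2\varphi = \sin^2\theta,$$

which matches the components of $\mathrm{d}\theta^2 + \sin^2\theta\,\mathrm{d}\varphi^2$.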

Formalism (and a more correct formula)

I was (deliberately) a bit imprecise with some of the steps above, because the formula for the pullback metric that I gave actually requires some interpretation. If the $\xi^\mu$ are coordinate maps, how do we differentiate with respect to them? If they are to be interpreted as the actual coordinates on $\mathbb{R}^m$, then shouldn't we do the same with the $\phi^\alpha$? How do we differentiate coordinates on $\mathbb{R}^n$ with respect to coordinates on $\mathbb{R}^m$?

As I hope the example of the sphere in $\mathbb{R}^3$ makes clear, these formal issues are in practice superficial for calculations, once you "know how it's done". But as mathematicians, we probably want to be clear about what's actually happening (at least, I do). In practice, we never work with the actual function $F$ (in our case, the inclusion map $\iota$) when computing in charts; instead we work with the coordinate representation of $F$, the map $$\hat{F} := \phi \circ F \circ \xi^{-1} : \mathbb{R}^m \supset \xi(U) \to \mathbb{R}^n.$$ If you'll remember, the defining formula for the pullback metric relies on the pushforward of a vector, $\mathrm{d} F_p(X)$. In order to calculate the components of the metric then, we need to calculate the components of the differential, $\mathrm{d}F_p$. But these are just the components of the Jacobian of $\hat{F}$! In fact, that's the entire point of the differential.

In (quite a bit) more detail (following the definitions and conventions in John M. Lee's Introduction to Smooth Manifolds): Take a coordinate basis vector $\left.\frac{\partial}{\partial \xi^\mu}\right|_p \in T_p M$. Its image under the differential $\mathrm{d} F_p$ is a vector in $T_{F(p)}N$, so we may expand it as $$\mathrm{d} F_p\left(\left.\frac{\partial}{\partial \xi^\mu}\right|_p\right) = a_\mu^\alpha\left.\frac{\partial}{\partial \phi^\alpha}\right|_{F(p)},$$ for some real coefficients $a^\alpha_\mu$. Recall here that $\left.\frac{\partial}{\partial \xi^\mu}\right|_p$ is defined by $$\left.\frac{\partial}{\partial \xi^\mu}\right|_p := \mathrm{d}\xi^{-1}_{\hat{p}}(\partial_\mu|_{\hat{p}}),$$ where $\hat{p} = \xi(p)\in\mathbb{R}^m$ is the coordinate representation of the point $p\in M$, and $\partial_\mu$ is the $\mu$-th partial derivative vector field on $\mathbb{R}^m$; and similarly for $\left.\frac{\partial}{\partial \phi^\alpha}\right|_{F(p)}$. (I think it's worth taking a second to reflect on the fact that $\partial_\mu|_{\hat{p}}$ is canonically defined on $\mathbb{R}^m$, whereas $\left.\frac{\partial}{\partial \xi^\mu}\right|_p = \mathrm{d}\xi^{-1}_{\hat{p}}(\partial_\mu|_{\hat{p}})$ is defined with respect to the particular chart $\xi$, which is why I think it's good practice in differential geometry to reserve the short-form notation $\partial_\mu$ for partial derivatives on $\mathbb{R}^m$.) So, we need to calculate the coefficients $a_\mu^\alpha$.

Now, maps into $\mathbb{R}^n$ can be split into components. If $$\pi^\alpha : \mathbb{R}^n\to\mathbb{R}, \qquad \pi^\alpha(y^1,\ldots,y^n) = y^\alpha, \qquad \alpha = 1,\ldots,n$$ are the canonical projection maps, we may introduce the component maps $\phi^\alpha := \pi^\alpha\circ\phi$, and $\hat{F}^\alpha = \pi^\alpha\circ\hat{F}$. It is a standard fact that the differentials of the components of a chart map form a dual basis to the corresponding coordinate basis for the tangent space, here $T_q N$. That is, $$\mathrm{d}\phi^\alpha_{q}\left(\left.\frac{\partial}{\partial \phi^\beta}\right|_q\right) = \mathrm{d}\pi^\alpha_{\hat{q}}\circ\mathrm{d}\phi_q\left(\left.\frac{\partial}{\partial \phi^\beta}\right|_q\right) = \mathrm{d}\pi^\alpha_{\hat{q}}(\partial_\beta|_{\hat{q}}) = \delta_{\beta}^\alpha,$$ where $\hat{q} = \phi(q)$ and we used that $\mathrm{d}\pi^\alpha_{\hat{q}}(\partial_\beta|_{\hat{q}})$ acts on functions $f\in\mathcal{C}^\infty(\mathbb{R})$ by $$\mathrm{d}\pi^\alpha_{\hat{q}}(\partial_\beta|_{\hat{q}})f = \partial_\beta|_{\hat{q}}(f\circ\pi^\alpha) = \frac{\mathrm{d} f}{\mathrm{d} t}(\hat{q}^\alpha)\frac{\partial\pi^\alpha}{\partial y^\beta}(\hat{q}) = \delta^\alpha_\beta \left.\frac{\mathrm{d} }{\mathrm{d} t}\right|_{\hat{q}^\alpha} f,$$ where $\hat{q}^\alpha = \pi^\alpha(\hat{q})$, and we identify $T_{\hat{q}^\alpha}\mathbb{R} \cong \mathbb{R}$. By linearity then, $$\mathrm{d}\phi^\alpha_{F(p)}\circ\mathrm{d}F_p\left(\left.\frac{\partial}{\partial \xi^\mu}\right|_p\right) = \mathrm{d}\phi^\alpha_{F(p)}\left(a_\mu^\beta\left.\frac{\partial}{\partial \phi^\beta}\right|_{F(p)}\right) = a^\beta_\mu\delta^\alpha_\beta = a^\alpha_\mu.$$ (We use the Einstein summation convention.) Hence, we "just" need to calculate the left hand side.

Since the coordinate maps $\phi$ and $\xi$ are invertible, we may use the chain rule to write $$\mathrm{d} F_p = \mathrm{d}(\phi^{-1}\circ\phi\circ F\circ\xi^{-1}\circ\xi)_p = \mathrm{d}\phi^{-1}_{\hat{F}(\hat{p})}\circ\mathrm{d}\hat{F}_{\hat{p}}\circ\mathrm{d}\xi_p.$$ Thus $$\mathrm{d}\phi^\alpha_{F(p)}\circ\mathrm{d}F_p = \mathrm{d}\pi^\alpha_{\hat{F}(\hat{p})}\circ\mathrm{d}\hat{F}_{\hat{p}}\circ\mathrm{d}\xi_p = \mathrm{d}\hat{F}^\alpha_{\hat{p}}\circ\mathrm{d}\xi_p,$$ so $$a_\mu^\alpha = \mathrm{d}\hat{F}^\alpha_{\hat{p}}\circ\mathrm{d}\xi_p\left(\left.\frac{\partial}{\partial \xi^\mu}\right|_p\right) = \mathrm{d}\hat{F}^\alpha_{\hat{p}}(\partial_\mu|_{\hat{p}}).$$ The right hand side acts on functions $f\in\mathcal{C}^\infty(\mathbb{R})$ by $$\mathrm{d}\hat{F}^\alpha_{\hat{p}}(\partial_\mu|_{\hat{p}})f = \partial_\mu|_{\hat{p}}(f\circ \hat{F}^\alpha) = \frac{\mathrm{d}f}{\mathrm{d}t}(\hat{F}^\alpha(\hat{p}))\frac{\partial \hat{F}^\alpha}{\partial x^\mu}(\hat{p}) = \frac{\partial \hat{F}^\alpha}{\partial x^\mu}(\hat{p})\left.\frac{\mathrm{d}}{\mathrm{d}t}\right|_{\hat{F}^\alpha(\hat{p})} f.$$ Under the identification $T_{\hat{F}^\alpha(\hat{p})}\mathbb{R} \cong \mathbb{R}$, we see that \begin{equation*} a_\mu^\alpha = \frac{\partial \hat{F}^\alpha}{\partial x^\mu}(\hat{p}). \end{equation*} In other words, $\mathrm{d}F_p$ is nothing but a coordinate-free version of the Jacobian of $\hat{F}$ at $\hat{p}$!

By linearity, the components of the pullback metric are then given by $$[(F^*g)_p]_{\mu\nu} := (F^*g)_p\left(\left.\frac{\partial}{\partial \xi^\mu}\right|_{p},\left.\frac{\partial}{\partial \xi^\nu}\right|_{p}\right) = g_{F(p)}\left(\mathrm{d}F_p\left(\left.\frac{\partial}{\partial \xi^\mu}\right|_{p}\right),\mathrm{d}F_p\left(\left.\frac{\partial}{\partial \xi^\nu}\right|_{p}\right)\right) = a_\mu^\alpha a_\nu^\beta (g_{F(p)})_{\alpha\beta},$$ or, using our previous results and suppressing the dependence on the point $p$, we get the formula $$(F^*g)_{\mu\nu} = \frac{\partial \hat{F}^\alpha}{\partial x^\mu}\frac{\partial \hat{F}^\beta}{\partial x^\nu} g_{\alpha\beta}.$$ Thus, we have derived the formula from before, but now it's clear how everything is to be interpreted. The function $\hat{F}$ being differentiated is the coordinate representation of the map $F$; and is thereby a map from $\mathbb{R}^m$ to $\mathbb{R}^n$. The coordinates $x^\mu$ are the (standard) coordinates on $\mathbb{R}^m$, so the differentiation makes perfect sense as simply a partial differentiation of a function from $\mathbb{R}^m$ to $\mathbb{R}^n$. The components $g_{\alpha\beta}$ of the metric on $N$ are just that, $$g_{\alpha\beta} := g\left(\frac{\partial}{\partial \phi^\alpha},\frac{\partial}{\partial \phi^\beta}\right).$$
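
The component formula can be checked numerically. The following is a rough Python sketch, not a definitive implementation: `sphere_chart_inverse`, `jacobian` and `pullback_metric` are hypothetical names, the Jacobian is approximated by central differences, and the target metric is assumed constant (here $\delta_{\alpha\beta}$), whereas in general $g_{\alpha\beta}$ would have to be evaluated at $\hat{F}(\hat{p})$.

```python
import math

def sphere_chart_inverse(theta, phi):
    """Coordinate representation of the inclusion S^2 -> R^3 in spherical coordinates."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def jacobian(F_hat, x, h=1e-6):
    """Central-difference approximation: J[alpha][mu] ~ dF^alpha/dx^mu at x."""
    n = len(F_hat(*x))
    J = [[0.0] * len(x) for _ in range(n)]
    for mu in range(len(x)):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        Fp, Fm = F_hat(*xp), F_hat(*xm)
        for alpha in range(n):
            J[alpha][mu] = (Fp[alpha] - Fm[alpha]) / (2 * h)
    return J

def pullback_metric(F_hat, g, x):
    """Components (F^*g)_{mu nu} = J^alpha_mu J^beta_nu g_{alpha beta} at x.

    Assumption: g is a constant n-by-n matrix given as a list of lists.
    """
    J = jacobian(F_hat, x)
    n, m = len(J), len(x)
    return [[sum(J[a][mu] * J[b][nu] * g[a][b]
                 for a in range(n) for b in range(n))
             for nu in range(m)]
            for mu in range(m)]

# Euclidean metric on R^3 in the identity chart.
delta = [[1.0 if a == b else 0.0 for b in range(3)] for a in range(3)]

theta, phi = 0.7, 1.9
G = pullback_metric(sphere_chart_inverse, delta, (theta, phi))

# Up to finite-difference error, G should be close to [[1, 0], [0, sin(theta)^2]].
print(G, math.sin(theta) ** 2)
```

Swapping in a different `F_hat` or a different constant matrix `g` tests the same formula for other maps, for example a linear map into $\mathbb{R}^n$.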

In our case then, when $M = \mathbb{S}^2$ and $N = \mathbb{R}^3$, with $F = \iota$ the inclusion map and $g_{\alpha\beta} = \delta_{\alpha\beta}$, plugging everything in yields the computations and the result from before.

Thank you for adding the answer for Riemann metrics! One small nitpick: You need $dF_p$ to be injective, otherwise you would only get a non-negative quadratic form. Typically one wants $F$ to be a local diffeo, then this is easily satisfied, but for submanifolds you need to enforce this (local injectivity is not enough). — ctst, Oct 30 '23 at 14:11
@ctst Thank you for pointing this out! I added a comment about it to my answer. — Rolf Alien, Oct 31 '23 at 12:34

Andres Mejia · Answer 3 · 2018-07-12T20:59:07.697

If you have a metric $d_Y$ on $Y$ you can pull it back to $X$ via $f:X\to Y$ by setting $d_X(a,b)$ to be $d_Y(f(a),f(b))$.

If $f$ is injective you get an honest metric, but otherwise the "metric" fails the nondegeneracy requirement.

This is completely analogous to the case of inner products in linear algebra.
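
To spell out the analogy (the notation $A$, $V$, $W$ is introduced here only for illustration): if $A : V \to W$ is a linear map and $\langle\cdot,\cdot\rangle_W$ is an inner product on $W$, one can define a symmetric bilinear form on $V$ by

$$\langle u, v\rangle_V := \langle Au, Av\rangle_W.$$

It satisfies $\langle v, v\rangle_V = \|Av\|_W^2 \geq 0$, with equality exactly when $Av = 0$. So it is an inner product on $V$ if and only if $\ker A = \{0\}$, that is, if and only if $A$ is injective, just as $d_X$ is an honest metric when $f$ is injective.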

It might be worth pointing out that, if $f$ is not one-to-one, then $d_X$ isn't a metric, because distinct points can have a distance of $0$. — Andreas Blass, Jul 12 '18 at 20:56
@AndreasBlass true. I believe there is a term for this type of metric, but the nomenclature is eluding me. — Andres Mejia, Jul 12 '18 at 20:58
I call such things pseudo-metrics. — Andreas Blass, Jul 12 '18 at 21:08
I think that the OP talks about Riemannian metrics. — Giuseppe Negro, Jul 12 '18 at 21:23
