
Suppose we have $f:\mathbb{R}^2\rightarrow \mathbb{R}$. The vectors that $f$ acts on are column vectors, i.e. $2 \times 1$ matrices.

Is the gradient $\nabla f$ then a row vector? And why is that logical?

  • Related: https://math.stackexchange.com/questions/54355/the-gradient-as-a-row-vs-column-vector?rq=1 – Air Conditioner Feb 04 '18 at 19:16
  • @kimchilover you are right. It should be scalar-valued. Otherwise we are talking about the total derivative –  Feb 04 '18 at 19:21
  • This is a matter of taste and a matter of dispute. For me, a row vector, so I can most easily write a Taylor expansion as $f(x+h)=f(x)+\nabla f(x)h + \dots$ without guilt. – kimchi lover Feb 04 '18 at 19:28
  • @kimchilover the linked post is overkill. It seems to talk about things from perspectives other than multivariable calculus. If we do things from a geometry point of view then we are looking at a lot of vectors at once, and in that context we can consider them however we want. But in relation to the space on which the function acts, it must be a row imo –  Feb 04 '18 at 19:31

1 Answer


It is a row.

It is logical because the gradient is supposed to be the differential of a function from $\mathbb{R}^n$ to $\mathbb{R}$, and therefore a linear map, NOT just a vector. In this sense, it is a $1\times n$ matrix, not a vector of $\mathbb{R}^n$.

The whole confusion is caused because we can canonically identify the vectors of $\mathbb{R}^n$ with the linear functions from $\mathbb{R}^n$ to $\mathbb{R}$ (which is to say that $(\mathbb{R}^n)^*\cong \mathbb{R}^n$).
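As a quick numerical sketch of this identification (the function $f(x,y)=x^2+3y$ and the point $p=(1,2)$ are an assumed example, not from the thread): the differential is a $1\times 2$ row matrix acting on column vectors by matrix multiplication, and its transpose is the gradient vector, whose dot product gives the same number.

```python
import numpy as np

# Assumed example: f(x, y) = x**2 + 3*y at the point p = (1, 2).
# Its differential df_p is the linear map R^2 -> R, represented
# by the 1 x 2 (row) Jacobian matrix [df/dx, df/dy] = [2x, 3].
def f(v):
    x, y = v
    return x**2 + 3*y

p = np.array([1.0, 2.0])
J = np.array([[2 * p[0], 3.0]])    # 1 x 2 row matrix

# The row acts on column vectors by matrix multiplication:
h = np.array([[0.001], [0.002]])   # 2 x 1 column vector
df_h = (J @ h).item()              # df_p(h), a scalar

# The canonical identification turns the row into the gradient
# *vector* via transpose, recovering the familiar dot product:
grad = J.T.flatten()               # gradient as a vector in R^2
print(df_h, np.dot(grad, h.flatten()))  # same value both ways
```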

Nick A.
  • Your final comment is well-taken, but I think the situation is a bit more nuanced than you’ve made it out to be. The differential of $f$ certainly has a natural representation as a row vector, but saying that $\nabla f$ is “supposed to be” the differential is a bit of a leap. How do you reconcile this view with the common formula $\nabla f\cdot u$ for a directional derivative? For that to make sense, one would have to introduce a dot product that operates on a row vector on one side and a column vector on the other. – amd Feb 04 '18 at 20:34
  • You are right, I agree that the situation gets trickier when one needs to do more manipulations, and certainly one must be very careful with the definitions. My thinking is that first come the proper definitions (i.e. the Chain rule) and then the "handy trick" that we can think of the linear function $\nabla f$ as a vector and take the dot product. – Nick A. Feb 04 '18 at 20:46
  • The gradient $\nabla f$ is not the differential $df$; it is a vector, not a covector. If you have an inner product then you get the gradient by the Riesz representation theorem: there is a $w$ such that $\forall v,\ \langle w, v\rangle = df(v) = \partial_v f$, and we call $w$ the gradient and denote it $\nabla f$. For a continuously differentiable function $f:\mathbb{R}^n\to \mathbb{R}^m$ the gradient is the transpose of the Jacobian $J_f(\vec{p})\in\mathbb{R}^{m\times n}$. Then the Taylor expansion is $f(p+v) = f(p) + df(p)(v) + o(v) = f(p) + (\nabla f(p))^T v + o(v)$, where $df(p)(v)=J_f(p)v$. – lightxbulb May 20 '23 at 23:51
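The Taylor expansion in the last comment can be checked numerically; this sketch uses an assumed example $f(x,y)=\sin(x)\,y$ (not from the thread) and verifies that the remainder $f(p+v)-f(p)-(\nabla f(p))^T v$ is small compared to $\|v\|$, as $o(v)$ requires.

```python
import numpy as np

# Assumed example: f(x, y) = sin(x) * y, checked at p = (0.5, 2).
def f(v):
    x, y = v
    return np.sin(x) * y

p = np.array([0.5, 2.0])
# Gradient at p, computed by hand: (cos(x) * y, sin(x)).
grad = np.array([np.cos(p[0]) * p[1], np.sin(p[0])])

# First-order Taylor remainder for a small displacement v:
v = np.array([1e-3, -2e-3])
remainder = f(p + v) - f(p) - grad @ v

# The ratio |remainder| / |v| should be tiny, since the
# remainder is o(|v|) (in fact O(|v|^2) here, f being smooth):
print(abs(remainder) / np.linalg.norm(v))
```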