geometric intuition of $v\otimes w$ [...] in the most concrete way
Let me try to throw in an answer at a freshman level. I deliberately leave out any information that is not essential to the problem, e.g. I don't bother with the specific dimensions of the quantities.
Say we have a matrix $V$ and a vector $w$ resulting in
$$Vw=v$$
(implying consistent dimensions of the matrix $V$ and of the column vectors $v$ and $w$ over, e.g., $\mathbb{R}$). Now, let's forget about $V$ and try to reconstruct it given only $v$ and $w$.
It is generally not possible to fully reconstruct it from this little information, but we can do our best by trying
$$\tilde{V}:=v\otimes w \frac{1}{\lVert w\rVert^2}$$
where we assume $w\neq 0$ so that the normalization makes sense. For column vectors the tensor product is just the outer product:
$$v\otimes w := v\,w^{\sf T}$$
Then, checking the ability of $\tilde{V}$ to approximate the action of $V$ we see
$$\tilde{V}w=v\,w^{\sf T}w\frac{1}{\lVert w\rVert^2}=v$$
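which is the best we can do: $\tilde{V}$ agrees with $V$ on $w$ (and on every multiple of $w$) and maps every vector orthogonal to $w$ to zero, because $v$ and $w$ alone tell us nothing about those directions.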
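A minimal NumPy sketch of this reconstruction, assuming real vectors stored as 1-D arrays. The helper name `rank_one_reconstruction` and the test matrix are made up for illustration:

```python
import numpy as np

def rank_one_reconstruction(v, w):
    """Return v w^T / ||w||^2, the rank-one guess for V given only Vw = v."""
    v = np.asarray(v, dtype=float)
    w = np.asarray(w, dtype=float)
    norm_sq = w @ w
    if norm_sq == 0.0:
        raise ValueError("w must be non-zero")
    return np.outer(v, w) / norm_sq

# Hypothetical test data: any V and any non-zero w will do.
V = np.array([[3.0, -1.0], [2.0, 5.0]])
w = np.array([1.0, 2.0])
v = V @ w

V_tilde = rank_one_reconstruction(v, w)
print(np.allclose(V_tilde @ w, v))  # does the reconstruction reproduce v?
```

The check only confirms $\tilde{V}w=v$; in general $\tilde{V}$ and $V$ still differ as matrices.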
To make it more concrete, let's throw in some numbers in the easiest non-trivial case:
$$Vw=\left(\begin{matrix}1&0\\0&2\end{matrix}\right)\left(\begin{matrix}1\\0\end{matrix}\right)=\left(\begin{matrix}1\\0\end{matrix}\right)=v$$
Then
$$\tilde{V}=\left(\begin{matrix}1\\0\end{matrix}\right)\left(\begin{matrix}1&0\end{matrix}\right)=\left(\begin{matrix}1&0\\0&0\end{matrix}\right)\approx V=\left(\begin{matrix}1&0\\0&2\end{matrix}\right)$$
The subspace spanned by $w=\left(\begin{matrix}1\\0\end{matrix}\right)$ is accurately mapped by $\tilde{V}$.
To give more (very elementary) geometric insight, let's make the matrix slightly more interesting:
$$V=\left(\begin{matrix}1&1\\0&2\end{matrix}\right)$$
This leaves the $\left(\begin{matrix}1\\0\end{matrix}\right)$-space invariant and scales and shears the $\left(\begin{matrix}0\\1\end{matrix}\right)$-space. We now "test" the matrix by the "arbitrary" vector $w=\left(\begin{matrix}1\\1\end{matrix}\right)$ and get
$$Vw=\left(\begin{matrix}1&1\\0&2\end{matrix}\right)\left(\begin{matrix}1\\1\end{matrix}\right)=\left(\begin{matrix}2\\2\end{matrix}\right)=v$$
and the approximate reconstruction
$$\tilde{V}=\left(\begin{matrix}2\\2\end{matrix}\right)\left(\begin{matrix}1&1\end{matrix}\right)\frac{1}{2}=\left(\begin{matrix}1&1\\1&1\end{matrix}\right)\approx V=\left(\begin{matrix}1&1\\0&2\end{matrix}\right)$$
We have lost the information about the invariance of the $\left(\begin{matrix}1\\0\end{matrix}\right)$-axis and the scaling & shearing behavior of the $\left(\begin{matrix}0\\1\end{matrix}\right)$-axis, but the diagonal $\left(\begin{matrix}1\\1\end{matrix}\right)$-space's behavior is captured by $\tilde{V}$.
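The same example can be replayed numerically. This is only a sketch with NumPy, and the extra test vector `u` (orthogonal to $w$) is my own addition to show what has been lost:

```python
import numpy as np

V = np.array([[1.0, 1.0], [0.0, 2.0]])
w = np.array([1.0, 1.0])
v = V @ w

V_tilde = np.outer(v, w) / (w @ w)   # v w^T / ||w||^2

u = np.array([1.0, -1.0])            # orthogonal to w
on_w = V_tilde @ w                   # agrees with V @ w
on_u = V_tilde @ u                   # zero: nothing is known in this direction
true_on_u = V @ u                    # what V actually does to u

print(on_w, on_u, true_on_u)
```

Along $w$ the two matrices agree, while along `u` the reconstruction returns zero although $V$ does not.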
To increase the level of this answer by an $\varepsilon>0$, the spectral decomposition of a symmetric matrix $V$ should be mentioned. If $v_1,\dots,v_n$ are orthonormal eigenvectors with corresponding eigenvalues $\lambda_1,\dots,\lambda_n$, then
$$V=\sum_{i=1}^n v_i\otimes v_i \lambda_i$$
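A quick numerical check of this decomposition, assuming $V$ is real symmetric so that `numpy.linalg.eigh` returns orthonormal eigenvectors. The matrix below is an arbitrary example of mine:

```python
import numpy as np

V = np.array([[2.0, 1.0], [1.0, 3.0]])   # arbitrary symmetric example

# eigh is meant for symmetric/Hermitian input; the columns of eigvecs
# are orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(V)

# Sum of lambda_i * (v_i outer v_i) over all eigenpairs.
V_rebuilt = sum(lam * np.outer(vec, vec) for lam, vec in zip(eigvals, eigvecs.T))

print(np.allclose(V_rebuilt, V))
```

For a non-symmetric matrix the eigenvectors are in general not orthogonal, and this particular sum would not reproduce $V$.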
More intuitively, we can say - very roughly speaking - that the tensor product $v\otimes w$ is the pseudo-inverse operation of the matrix-vector multiplication $Vw=v$.
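The "pseudo-inverse" wording can be made literal, at least in this sketch: the Moore-Penrose pseudo-inverse of a non-zero column vector $w$ is $w^{\sf T}/\lVert w\rVert^2$, so $v\,w^{+}$ is the normalized tensor product from above. The numbers are those of the second example:

```python
import numpy as np

w = np.array([[1.0], [1.0]])      # column vector, shape (2, 1)
v = np.array([[2.0], [2.0]])

w_pinv = np.linalg.pinv(w)        # shape (1, 2)
V_tilde = v @ w_pinv              # same as v w^T / ||w||^2

print(w_pinv)
print(V_tilde)
```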
To finish this, I'd like to give one concrete (and special) application of the tensor product in partial differential equations. It links this question to the divergence theorem, which is itself very intuitive. To keep it descriptive, let's stick to the case of solid mechanics (because everyone has an intuitive understanding of bodies and forces), although analogous statements hold for other elliptic PDEs in divergence form.
Say we have an elastic body (e.g. a rubber band) occupying a domain $\Omega\subset\mathbb{R}^3$. Some part of its boundary is subject to Dirichlet boundary conditions (e.g. you're statically stretching it). This results in internal stresses, which are typically represented at each point by matrices $\sigma\in\mathbb{R}^{3\times3}$. The system of PDEs describing this state is
$${\rm div}(\sigma)=0$$
(considering the boundary conditions and the governing material law which closes these equations).
In some applications you are interested in the average stress throughout the body $\Omega$
$$\bar{\sigma}=\frac{1}{\mu(\Omega)}\int_\Omega \sigma \, {\rm d}\mu$$
The divergence theorem implies that the average stress can be represented by the surface integral
$$\bar{\sigma}=\frac{1}{\mu(\Omega)}\int_{\partial\Omega} \left(\sigma n\right) \otimes x \, {\rm d}A$$
where $\partial\Omega$ is the boundary of $\Omega$, $n$ is the surface normal, and $x$ the coordinate vector.
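For completeness, here is a sketch of that step in index notation (summation over repeated indices), assuming the convention $({\rm div}\,\sigma)_i=\partial_k\sigma_{ik}$ and a sufficiently smooth $\sigma$:

$$\int_{\partial\Omega}\sigma_{ik}n_k\,x_j\,{\rm d}A=\int_\Omega\partial_k\left(\sigma_{ik}x_j\right){\rm d}\mu=\int_\Omega\left(\partial_k\sigma_{ik}\right)x_j\,{\rm d}\mu+\int_\Omega\sigma_{ik}\delta_{kj}\,{\rm d}\mu=\int_\Omega\sigma_{ij}\,{\rm d}\mu$$

The first equality is the divergence theorem, the second is the product rule with $\partial_k x_j=\delta_{kj}$, and the first term on the right vanishes because ${\rm div}(\sigma)=0$. Dividing by $\mu(\Omega)$ gives the surface representation of $\bar{\sigma}$.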
This means that the volume average of a divergence-free matrix field can be represented by integrating the pseudo-inverse of its application over the surface.
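A small numerical sanity check of the surface formula, under assumptions of my own: $\Omega$ is the unit cube (so $\mu(\Omega)=1$), the stress field `sigma` is a hypothetical divergence-free example rather than the solution of an actual elasticity problem, and both integrals use a plain midpoint rule:

```python
import numpy as np

def sigma(p):
    """Hypothetical symmetric, divergence-free stress field."""
    x, y, z = p
    return np.array([[y, 0.0, 0.0],
                     [0.0, x, 0.0],
                     [0.0, 0.0, 0.0]])

m = 20                                   # midpoint-rule cells per direction
mid = (np.arange(m) + 0.5) / m           # cell midpoints in [0, 1]

# Volume average over the unit cube.
volume_avg = np.zeros((3, 3))
for x in mid:
    for y in mid:
        for z in mid:
            volume_avg += sigma((x, y, z)) / m**3

# Surface integral of (sigma n) outer x over the six faces.
surface_avg = np.zeros((3, 3))
for axis in range(3):                    # the face normal points along this axis
    for side, sign in ((0.0, -1.0), (1.0, 1.0)):
        n = np.zeros(3)
        n[axis] = sign                   # outward unit normal
        for a in mid:
            for b in mid:
                # Point on the face: fixed coordinate `side` at position `axis`.
                p = np.insert(np.array([a, b]), axis, side)
                surface_avg += np.outer(sigma(p) @ n, p) / m**2

print(np.allclose(volume_avg, surface_avg))
```

Only boundary values of $\sigma n$ and $x$ enter the second loop, which is the point of the formula.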