To start we can calculate the first and second partial derivatives of $f$
\begin{align}
\frac{\partial}{\partial x_j} f(x) &= \frac{\exp(x_j)}{\sum_i \exp(x_i)} \\
\frac{\partial^2}{\partial^2 x_j} f(x) &= \frac{\exp(x_j)\left( \sum_{i} \exp(x_i)\right ) - \exp(x_j)^2}{\left ( \sum_{i} \exp(x_i) \right )^2} \\
&= \frac{\exp(x_j)\left( \sum_{i\neq j} \exp(x_i)\right)}{\left(\sum_i\exp(x_i)\right)^2} \\
\frac{\partial^2}{\partial x_j \partial x_k} &= -\frac{\exp(x_j)\exp(x_k)}{\left (\sum_{i} \exp(x_i) \right )^2} & (j \neq k)
\end{align}
These are all done using standard rules (chain rule, quotient rule, derivative of log and exp).
Next we are going to try and apply the Gershgorin Circle Theorem to the Hessian of $f$. To do so we need to define Gershgorin discs. For each row $j$, the disk $D_j$ is defined by its center
$$c_j \triangleq Hf(x)_{jj} =\frac{\partial^2}{\partial^2x_j} f(x)$$ and the and its radius
$$
r_j = \sum_{k\neq j} |Hf(x)_{jk}| = \sum_{k \neq j} \left | \frac{\partial^2}{\partial x_j \partial x_k} f(x)\right |
$$
The Gershgorin Circle Theorem asserts that the eigenvalues of $Hf(x)$ are contained in the union of the disks $D_j$. We also know by the Spectral Theorem and the fact that the Hessian is symmetric that the eigenvalues of $Hf(x)$ are real. What we want to show is that $c_j + r_j \leq 1$ and $c_j - r_j \geq 1$ for every $j$ since this will imply that all the eigenvalues of $Hf(x)$ are contained in $[-1,1]$, for any $x$.
To start note that from the calculations above we have
\begin{align}
c_j - r_j &= \frac{\exp(x_j)\left( \sum_{i\neq j} \exp(x_i)\right)}{\left(\sum_i\exp(x_i)\right)^2} - \sum_{k\neq j} \frac{\exp(x_j)\exp(x_k)}{\left (\sum_{i} \exp(x_i) \right )^2} = 0 \geq -1 \\
c_j + r_j &= \frac{\exp(x_j)\left( \sum_{i\neq j} \exp(x_i)\right)}{\left(\sum_i\exp(x_i)\right)^2} + \sum_{k\neq j} \frac{\exp(x_j)\exp(x_k)}{\left (\sum_{i} \exp(x_i) \right )^2} \\
&= \frac{2\exp(x_j)\left( \sum_{i\neq j} \exp(x_i)\right)}{\left(\sum_i\exp(x_i)\right)^2} \leq \frac{1}{2}
\end{align}
To see the last inequality, let $\alpha = \exp(x_j)$ and $\beta = \sum_{k \neq j} \exp(x_k)$. Then it is equivalent to show
$$
\frac{2\alpha\beta}{(\alpha + \beta)^2} \leq \frac{1}{2} \iff 4\alpha \beta \leq (\alpha + \beta)^2 \iff 0 \leq \alpha^2 + 2\alpha\beta + \beta^2 - 4\alpha \beta = (\alpha - \beta)^2
$$
and the last expression clearly holds. (also note that $\alpha,\beta > 0$ so there are no dividing by zero issues). Overall this shows that the real parts of $D_j$ is contained in the interval $[0,1/2]$. This turns out to be stronger than you're asking actually.
Now we need one more analysis fact. For any function $F:\mathbb{R}^d \rightarrow \mathbb{R}^d$, if the operator norm of the Jacobian of $F$ satisfies
$$\max_x ||JF(x)||_{\text{op}} \leq L$$
then $F$ is $L$-Lipschitz. That is to say, under this condition $||F(x) - F(y)||_2 \leq L||x-y||_2$ for every $x,y$. (see here for proof)
Note that $\nabla f:\mathbb{R}^d \rightarrow \mathbb{R}^d$ and by definition the Jacobian $J(\nabla f) = Hf$ the hessian of $f$. The argument above shows that $||Hf(x)||_{\text{op}} \leq 1/2$ since all the eigenvalues of $Hf \in [0,1/2]$. In turn we have $\nabla f$ is $(1/2)$-Lipschitz. Therefore we can arrive at the result
$$||\nabla f(x) - \nabla f(y)|| \leq \frac{1}{2}||x - y||.$$
(Which is actually stronger than what you asked for)
Above we need the eigenvalues to be in $[-1,1]$ to ensure that the $\nabla f$ was 1-Lipschitz, it just happened to be that we could do a little bit better than that.