It's not a completely straightforward consequence of the inverse function theorem. Here's a proof.
Theorem. Suppose $(M,g)$ is a Riemannian manifold, $p\in M$, and $B_c(0)\subset T_pM$ is a ball on which $\exp_p$ is defined. Then the restriction of $\exp_p$ to $B_c(0)$ is injective if and only if it is a diffeomorphism onto its image.
Proof.
If the restriction of $\exp_p$ is a diffeomorphism onto its image, then clearly it's injective. For the converse, assume it's injective. For each vector $v\in B_c(0)$, let $\gamma_v(t)= \exp_p(tv)$, which is a geodesic defined for all $t$ such that $|tv|<c$.
Two key facts are needed for the proof. They are presumably proved in Klingenberg's book, but I don't have a copy at hand, so I'll give references to my book Introduction to Riemannian Manifolds (2nd ed.) [IRM].
- A vector $v\in B_c(0)$ is a critical point of $\exp_p$ if and only if $\gamma_v(1)$ is conjugate to $p$ along $\gamma_v$ [IRM, Proposition 10.20].
- No geodesic is minimizing past its first conjugate point [IRM, Theorem 10.26].
Using these facts, the argument goes as follows. It suffices to show that $\exp_p$ has no critical points in $B_c(0)$, for then it's a local diffeomorphism by the inverse function theorem, and an injective local diffeomorphism is a diffeomorphism onto its image.
Suppose for the sake of contradiction that $v\in B_c(0)$ is a critical point. Note that $v\ne 0$, because the differential of $\exp_p$ at $0$ is the identity. By fact 1 above, $\gamma_v(1)$ is conjugate to $p$ along $\gamma_v$. Thus by fact 2, for any number $r>1$ such that $|rv| < c$, the restriction of $\gamma_v$ to $[0,r]$ is not minimizing. Thus there is a shorter geodesic $\sigma: [0,b]\to M$ with $\sigma(0)=p$ and $\sigma(b) = \gamma_v(r)$. Since every geodesic starting at $p$ is the image of a radial line under $\exp_p$, we can parametrize $\sigma$ so that it has parameter domain $[0,1]$ and is of the form $\sigma(t) = \exp_p(tw)$ for some $w\in T_pM$ with $|w| = \text{Len}(\sigma)$, and the fact that $\sigma$ is shorter implies that $|w|<|rv|<c$. Thus $w$ and $rv$ are distinct points of $B_c(0)$ with $\exp_p(w) = \exp_p(rv)$, contradicting the assumption that $\exp_p$ is injective on $B_c(0)$. $\square$
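
As a sanity check (not part of the proof), here is how the statement plays out on the unit sphere $S^2\subset\mathbb R^3$ with its round metric. The notation $u$, $u^\perp$, $t$ is introduced only for this illustration. For a unit vector $u\in T_pS^2$ and $t\ge 0$, the exponential map is

$$
\exp_p(tu) = (\cos t)\,p + (\sin t)\,u ,
$$

which is defined on all of $T_pS^2$. If $u^\perp\in T_pS^2$ is a unit vector orthogonal to $u$ and $t>0$, differentiating along the radial and circular directions gives

$$
d(\exp_p)_{tu}(u) = -(\sin t)\,p + (\cos t)\,u,
\qquad
d(\exp_p)_{tu}(u^\perp) = \frac{\sin t}{t}\,u^\perp .
$$

The first vector always has unit length, so $tu$ is a critical point exactly when $\sin t = 0$ with $t>0$, that is, when $|tu| = k\pi$ for some positive integer $k$. These are precisely the parameter values at which a great circle through $p$ reaches a point conjugate to $p$, and a great-circle arc of length greater than $\pi$ is not minimizing, in line with the two facts above.

For the theorem itself: if $c\le\pi$, then $\exp_p$ is injective on $B_c(0)$, because the $p$-component $\cos t$ determines $t\in[0,\pi)$ and then $\sin t>0$ determines $u$ when $t>0$; and there are no critical points in $B_c(0)$, so the restriction is a diffeomorphism onto its image. If $c>\pi$, then every $v$ with $|v|=\pi$ is a critical point, and all such $v$ are sent to the antipodal point $-p$, so injectivity fails as well.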