Henkin models (Gödel's original proof was quite different) do indeed satisfy a kind of "minimality" condition, but there are subtleties.
A good further source is the (sadly hard-to-find) book Henkin-Keisler Models.
For any theory $T$, the set of closed terms in the language of $T$, modulo $T$-provable equality, yields a structure in the language of $T$ in a natural way; I'll call this structure "$Term(T)$". (I'm ignoring for the moment the issue that there may be no closed terms at all; if this bothers you, either work in the version of FOL which allows empty structures or assume that our language contains at least one constant symbol.) In general, $Term(T)$ need not satisfy $T$; however, if $T$ is complete and has the strong witness property, then $Term(T)\models T$. In this situation we do indeed have a very nice "minimality" property:
Suppose $T$ is complete and has the strong witness property, and $M\models T$. Then there is a unique elementary embedding of $Term(T)$ into $M$, and the image of this embedding is exactly the set of (parameter-free) definable elements of $M$.
Note that this is stronger than merely saying that $Term(T)$ is the prime model of $T$.
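For concreteness, here is a sketch of how $Term(T)$ is built and why the claim above holds. The notation $[t]$ for the class of a closed term $t$ and the name $e$ for the embedding are my own labels. I'm also assuming the standard fact that, for complete $T$ with the witness property, $Term(T)\models\varphi([t_1],\dots,[t_n])$ iff $T\vdash\varphi(t_1,\dots,t_n)$.

$$
\begin{aligned}
[t]=[s] &\iff T\vdash t=s,\\
f^{Term(T)}([t_1],\dots,[t_n]) &= [f(t_1,\dots,t_n)],\\
R^{Term(T)}([t_1],\dots,[t_n]) &\iff T\vdash R(t_1,\dots,t_n),\\
e\colon Term(T)\to M,\quad e([t]) &= t^M.
\end{aligned}
$$

Since $M\models T$, the map $e$ is well defined. It is elementary because $Term(T)\models\varphi([t])$ iff $T\vdash\varphi(t)$ iff $M\models\varphi(t^M)$, where the last step uses completeness of $T$. It is unique because every element of $Term(T)$ is named by a closed term, and any embedding must send $[t]$ to $t^M$. Each $t^M$ is defined in $M$ by the formula $x=t$; conversely, if $\varphi(x)$ defines $a\in M$, then $T\vdash\exists x\,\varphi(x)$ by completeness, so $T\vdash\varphi(t)$ for some closed term $t$ by the witness property, and hence $a=t^M$.

By contrast, take $T=\{\exists x\,P(x)\}$ in the language with one constant symbol $c$ and one unary relation symbol $P$. Then $Term(T)$ has the single element $[c]$, and $T\nvdash P(c)$, so $P^{Term(T)}$ is empty and $Term(T)\not\models T$.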
So "Henkin models" do enjoy a kind of minimality. However, there are subtleties here:
- The above ignores the part of the proof where we pass from our initial consistent theory $T_0$ to a theory $T\supseteq T_0$ (in a possibly larger language) which is complete, consistent, and has the strong witness property. There are many ways to do this, essentially depending on how we order the formulas in our language, and these different approaches generally yield different theories (and hence different models at the end of the day).
- Moreover, even fixing a single appropriate $T\supseteq T_0$, we still have the issue that the model of $T_0$ we produce at the end of the day isn't $Term(T)$ but rather the reduct of $Term(T)$ to the original language of $T_0$, and this restriction of language will in general kill the minimality observation above.
Let me end by mentioning a follow-up to the first bullet point above which I haven't seen treated before. Roughly speaking, given our starting theory $T_0$, we get an equivalence relation on orderings of formulas given by "yield the same model" (there are actually a couple of inequivalent approaches here, but ignore that for now). Intuitively, there is a connection between how coarse this relation is and how close $T_0$ is to being complete and having the witness property. In particular, if we start with a theory which is already complete and has the witness property, then the way we order formulas shouldn't impact anything, since there's nothing we need to change. We have various tools for treating that sort of thing (e.g. descriptive set theory), so in principle there could be some interesting dividing lines to be drawn here. However, I haven't seen anything along these lines, and so I don't know if there's actually anything here.