Suppose that $\mathbf{A} \in \mathbb{R}^{n \times n}$ is a positive definite matrix and $\mathbf{b} \in \mathbb{R}^{n}$ is a vector. Then the minimization of a quadratic function with a linear term can be done in closed form: $$\arg\min_{\mathbf{x} \in \mathbb{R}^n } \left( \frac{1}{2}\mathbf{x}^\mathsf{T} \mathbf{A} \mathbf{x} - \mathbf{b}^\mathsf{T}\mathbf{x} \right) = \mathbf{A}^{-1} \mathbf{b}$$
I came across this in a machine learning book, but the book didn't provide a proof. I would like to understand why this identity holds, and I hope someone can help me with it. I find that many machine learning books skip the proofs entirely, which makes me uncomfortable.
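For what it's worth, here is a quick numerical check I put together (a minimal sketch using NumPy and SciPy; the construction $\mathbf{M}\mathbf{M}^\mathsf{T} + n\mathbf{I}$ to get a positive definite $\mathbf{A}$ is my own choice, not from the book). It seems consistent with the formula, but I would still like to see an actual proof.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 5

# Build a random symmetric positive definite matrix A and a vector b.
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)   # M M^T is PSD; adding n*I makes it strictly PD
b = rng.standard_normal(n)

# The quadratic objective f(x) = 1/2 x^T A x - b^T x.
def f(x):
    return 0.5 * x @ A @ x - b @ x

# Closed-form candidate minimizer from the book: x* = A^{-1} b.
x_closed = np.linalg.solve(A, b)

# Minimizer found by a general-purpose numerical optimizer.
x_numeric = minimize(f, x0=np.zeros(n)).x

print(np.allclose(x_closed, x_numeric, atol=1e-5))  # prints True on this example
```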