I'm reading Hoffman's "Linear Algebra", Chapter 9, "Operators on Inner Product Spaces", and got lost at the notion of positivity for (sesquilinear) forms, operators, and matrices.
The confusion comes from the fact that the definition of "positive" for a matrix differs from the one for forms or operators.
Definitions.
A form $f$ on a real or complex vector space $V$ is called Hermitian if $f(\alpha, \beta) = \overline {f(\beta, \alpha)}$ for all $\alpha$ and $\beta$ in $V$.
A form $f$ on a real or complex vector space $V$ is positive if $f$ is Hermitian and $f(\alpha, \alpha) > 0$ for every $\alpha \ne 0$ in $V$.
If $A$ is an $n \times n$ matrix with complex entries and if $A$ satisfies $$\tag{9.9} X^\intercal A X > 0, \forall X \in \mathbb R^n, X \ne 0$$ we shall call $A$ a positive matrix.
A linear operator $T$ on a finite-dimensional inner product space $V$ is positive if $T=T^*$ and $\langle T\alpha, \alpha \rangle > 0$ for all $\alpha \ne 0$ in $V$.
Notice that "positive" for both forms and operators is defined using the conjugate transpose, whereas "positive" for a matrix is defined with the plain transpose. Also, positivity of forms and operators is defined on "real or complex" vector spaces $V$, with $\alpha \in V$; but positivity of a matrix is defined for a matrix with complex entries, while $X$ ranges only over the real space $\mathbb R^n$.
Then I got lost when he claims on page 329:
In either the real or complex case, a form $f$ is positive if and only if its matrix in some (in fact, every) ordered basis is a positive matrix.
Let me break this into four claims:
(1) real vector space, $f$ is a positive form, then $[f]_\mathcal B$ is a positive matrix.
(2) real vector space, $[f]_\mathcal B$ is a positive matrix, then $f$ is a positive form.
(3) complex vector space, $f$ is a positive form, then $[f]_\mathcal B$ is a positive matrix.
(4) complex vector space, $[f]_\mathcal B$ is a positive matrix, then $f$ is a positive form.
Claims (1) and (3) seem fine, but I'm lost on (2) and (4): how can they be proved?
(1) says:
Let $f$ be a form on a real vector space, $\mathcal B$ an ordered basis, and $[f]_\mathcal B$ the matrix of $f$ under basis $\mathcal B$.
$f$ is a positive form, or, by definition, i) $f$ is Hermitian : $f(\alpha, \beta) = {f(\beta, \alpha)}$ for all $\alpha$ and $\beta$ in $\mathbb R^n$, and ii) $\forall \alpha \in \mathbb R^n, \alpha\ne 0$, $f(\alpha, \alpha) > 0$.
Then $[f]_\mathcal B$ is a positive matrix, or, by definition, $X^\intercal [f]_\mathcal B X > 0, \forall X \in \mathbb R^n, X \ne 0$.
This is easy to prove: $f(\alpha, \alpha)>0$ for all $\alpha \in \mathbb R^n$, $\alpha\ne 0$, directly leads to $X^\intercal [f]_\mathcal B X >0$ for all $X\in \mathbb R^n$, $X\ne 0$.
(3) says:
Let $f$ be a form on a complex vector space, $\mathcal B$ an ordered basis, and $[f]_\mathcal B$ the matrix of $f$ under basis $\mathcal B$.
$f$ is a positive form, or, by definition, i) $f$ is Hermitian : $f(\alpha, \beta) = \overline {f(\beta, \alpha)}$ for all $\alpha$ and $\beta$ in $\mathbb C^n$, and ii) $\forall \alpha \in \mathbb C^n, \alpha\ne 0$, $f(\alpha, \alpha) > 0$.
Then $[f]_\mathcal B$ is a positive matrix, or, by definition, $X^\intercal [f]_\mathcal B X > 0, \forall X \in \mathbb R^n, X \ne 0$.
This is easy to prove: $f(\alpha, \alpha)>0$ for all $\alpha \in \mathbb C^n$, $\alpha\ne 0$, directly leads to $X^\intercal [f]_\mathcal B X >0$ for all $X\in \mathbb R^n$, $X\ne 0$.
(4): I have trouble proving this one. It says:
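To spell out the step used in (1) and (3): if I read Hoffman's convention correctly (this is my assumption), the matrix $A = [f]_\mathcal B$ in the ordered basis $\mathcal B = \{\alpha_1, \dots, \alpha_n\}$ has entries $A_{jk} = f(\alpha_k, \alpha_j)$, and $f$ is linear in its first argument and conjugate-linear in its second. Then for $\alpha = \sum_j x_j \alpha_j$ with coordinate column $X$:

$$f(\alpha, \alpha) = \sum_{j}\sum_{k} x_j\, \overline{x_k}\, f(\alpha_j, \alpha_k) = \sum_{j}\sum_{k} \overline{x_k}\, A_{kj}\, x_j = X^* A X.$$

When the coordinates $x_j$ are all real, $X^* = X^\intercal$, so $f(\alpha, \alpha) = X^\intercal A X$; that is the only case the matrix definition (9.9) looks at.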
Let $f$ be a form on a complex vector space, $\mathcal B$ an ordered basis, and $[f]_\mathcal B$ the matrix of $f$ under basis $\mathcal B$.
$[f]_\mathcal B$ is a positive matrix, or, by definition, $X^\intercal [f]_\mathcal B X > 0, \forall X \in \mathbb R^n, X \ne 0$.
Then $f$ is a positive form, or, by definition, i) $f$ is Hermitian : $f(\alpha, \beta) = \overline {f(\beta, \alpha)}$ for all $\alpha$ and $\beta$ in $\mathbb C^n$, and ii) $\forall \alpha \in \mathbb C^n, \alpha\ne 0$, $f(\alpha, \alpha) > 0$.
The proof is hinted at on page 329 of Hoffman:
$\forall X, Y \in \mathbb R^n$, let $Z = X + iY$, then $Z\in \mathbb C^n$, and: $Z^*A Z = (X+iY)^*A (X+iY) = (X^\intercal - iY^\intercal)A(X+iY)$ $= X^\intercal A X + Y^\intercal A Y + i(X^\intercal A Y - Y^\intercal A X)$.
If $A\in\mathbb R^{n\times n}$, $A = A^\intercal$, then $Y^\intercal A X = X^\intercal A Y$, furthermore, from $X^\intercal AX>0, \forall X\in \mathbb R^n, X\ne 0$ one can derive that $Z^*A Z>0$, $\forall Z \in \mathbb C^n, Z\ne 0$.
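As a sanity check of this expansion, here is a minimal pure-Python sketch. The matrix and vectors are made-up sample values, and `quad_form` and `bilinear` are helper names I introduce only for illustration; it is a numerical spot check under the assumption that $A$ is real and symmetric, not a proof.

```python
def quad_form(A, Z):
    """Return Z^* A Z for a square matrix A (list of rows) and a vector Z."""
    n = len(Z)
    return sum(complex(Z[j]).conjugate() * A[j][k] * Z[k]
               for j in range(n) for k in range(n))


def bilinear(A, X, Y):
    """Return X^T A Y (plain transpose, no conjugation)."""
    n = len(X)
    return sum(X[j] * A[j][k] * Y[k] for j in range(n) for k in range(n))


# Hypothetical sample data: a real symmetric matrix and two real vectors.
A = [[2.0, 1.0],
     [1.0, 3.0]]
X = [1.0, -2.0]
Y = [0.5, 4.0]
Z = [x + 1j * y for x, y in zip(X, Y)]  # Z = X + iY

# Left-hand side: Z^* A Z computed directly.
lhs = quad_form(A, Z)

# Right-hand side: X^T A X + Y^T A Y + i (X^T A Y - Y^T A X).
rhs = (bilinear(A, X, X) + bilinear(A, Y, Y)
       + 1j * (bilinear(A, X, Y) - bilinear(A, Y, X)))

print(lhs, rhs)
```

Because this sample $A$ is symmetric, the two cross terms cancel and $Z^*AZ$ comes out real and equal to $X^\intercal A X + Y^\intercal A Y$.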
But this requires $A\in \mathbb R^{n\times n}$ and $A = A^\intercal$. However, $[f]_\mathcal B \in \mathbb C^{n\times n}$ in general, so we can't use it as $A$.
There is also the Principal Axis Theorem:
For every Hermitian form $f$ on $V$, there is an orthonormal basis of $V$ in which $f$ is represented by a diagonal matrix with real entries.
But again, we don't know whether $[f]_\mathcal B$ is Hermitian, so we can't use the Principal Axis Theorem to choose a basis $\mathcal B$ in which $[f]_\mathcal B$ is a diagonal matrix with real entries.
(2): I also have trouble with this one. It says:
Let $f$ be a form on a real vector space, $\mathcal B$ an ordered basis, and $[f]_\mathcal B$ the matrix of $f$ under basis $\mathcal B$.
$[f]_\mathcal B$ is a positive matrix, or, by definition, $X^\intercal [f]_\mathcal B X > 0, \forall X \in \mathbb R^n, X \ne 0$.
Then $f$ is a positive form, or, by definition, i) $f$ is Hermitian: $f(\alpha, \beta) = {f(\beta, \alpha)}$ for all $\alpha$ and $\beta$ in $\mathbb R^n$, and ii) $\forall \alpha \in \mathbb R^n, \alpha\ne 0$, $f(\alpha, \alpha) > 0$.
It seems that i), the Hermitian property, cannot be proved?
In fact, Hoffman's book mentions earlier on page 329 that:
If a real matrix $A$ satisfies (9-9), it does not follow that $A = A^\intercal$.
This is reasonable: take $A = \begin{bmatrix} 1 & 0.3 \\ 0.1 & 1 \\ \end{bmatrix}$; then for every nonzero real $X = \begin{bmatrix} x_1 \\ x_2 \\ \end{bmatrix}$, $X^\intercal A X = x_1^2 + 0.4x_1x_2 + x_2^2 = (x_1+0.2x_2)^2 + 0.96x_2^2 > 0$, but $A\ne A^\intercal$.
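A quick numerical check of this example, again as a pure-Python sketch. The random sample points, the seed, and the helper name `quad_form` are my own choices for illustration. The last line also evaluates $Z^*AZ$ at one complex vector, $Z = (1, i)^\intercal$, to see what happens once $X$ is allowed to leave $\mathbb R^n$.

```python
import random

# The non-symmetric real matrix from the example above.
A = [[1.0, 0.3],
     [0.1, 1.0]]


def quad_form(A, Z):
    """Return Z^* A Z; for a real vector Z this equals Z^T A Z."""
    n = len(Z)
    return sum(complex(Z[j]).conjugate() * A[j][k] * Z[k]
               for j in range(n) for k in range(n))


# Sample 1000 random real vectors (hypothetical test points, fixed seed).
random.seed(0)
samples = [[random.uniform(-10, 10), random.uniform(-10, 10)]
           for _ in range(1000)]

# X^T A X is positive at every sampled real X ...
all_positive = all(quad_form(A, X).real > 0 for X in samples)

# ... although A is not symmetric.
is_symmetric = A[0][1] == A[1][0]

# At the complex vector Z = (1, i), Z^* A Z has a nonzero imaginary part.
complex_value = quad_form(A, [1, 1j])

print(all_positive, is_symmetric, complex_value)
```

So this $A$ satisfies (9.9) on the sampled real vectors, yet $Z^*AZ$ is not even real at $Z = (1, i)^\intercal$, which is exactly the gap I can't close in (2) and (4).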
I'm lost here. Why is "positive" for a matrix not defined over a "real or complex" vector space, as it is for forms and operators, but instead for a matrix with complex entries and real $X$? Do (2) or (4) hold?