
Let $\mathscr{A}$ be a linear transformation on an $n$-dimensional vector space $V$ over a field $\mathbb{F}\supset\mathbb{Q}$, and let $0\neq\alpha\in V$.

Then it is easy to see that there exists a unique monic polynomial $m_\alpha(\lambda)$ of least degree such that $m_\alpha(\mathscr{A})(\alpha)=0$; moreover, for any polynomial $f(\lambda)$ satisfying $f(\mathscr{A})(\alpha)=0$ we have $m_\alpha(\lambda)\mid f(\lambda)$.

Now, let $\xi_1,\cdots,\xi_n$ be a basis of $V$. What is the relationship between $m_{\xi_1}(\lambda),\cdots,m_{\xi_n}(\lambda)$ and the minimal polynomial $m_\mathscr{A}(\lambda)$ of $\mathscr{A}$?

I suspect that either $$m_{\xi_1}(\lambda)\cdots m_{\xi_n}(\lambda)=m_\mathscr{A}(\lambda),$$ or that the least common multiple of
$$m_{\xi_1}(\lambda),\cdots,m_{\xi_n}(\lambda)$$ is $$m_\mathscr{A}(\lambda).$$

But at the moment I have no idea how to prove either. All I can see is that $m_{\xi_i}(\lambda)\mid m_{\mathscr{A}}(\lambda)$.

xldd

2 Answers


Let $f(x)=\operatorname{lcm}(m_{e_1}(x),\ldots,m_{e_n}(x)).$ We claim $f=m_A$.

PROOF: For each $i=1,\ldots,n$ there exists a polynomial $g_i$ such that $g_i(x)m_{e_i}(x)=f(x)$, which implies $f(A)(e_i)=g_i(A)m_{e_i}(A)(e_i)=0.$ Since $e_1,\ldots,e_n$ is a basis and $f(A)(e_i)=0$ for each $i,$ we have $f(A)=0.$

Now for each $i=1,\ldots,n$ let $$h_i=\gcd (m_A,m_{e_i}).$$ There exist polynomials $p_i(x)$ and $q_i(x)$ such that $$h_i(x)=p_i(x)m_A(x)+q_i(x)m_{e_i}(x).$$ Then $h_i(A)(e_i)=0$, and the minimality of $m_{e_i}$ implies that $$m_{e_i}(x)\mid h_i(x),$$ while by the definition of $h_i$ we have $$h_i(x)\mid m_A(x).$$ Therefore each $m_{e_i}$ is a divisor of $m_A.$ In other words, $m_A$ is a common multiple of $m_{e_1},\ldots,m_{e_n},$ so the definition of $f$ implies that $$f\mid m_A.$$ But $f(A)=0,$ so the minimality of $m_A$ implies that $$m_A\mid f.$$ So, up to a nonzero constant multiple, we have $m_A=f.$ QED

For an extreme example where $m_A\ne \prod_1^n m_{e_i}$, take $n>1$, set $e_{n+i}=e_i$ for $i=1,\ldots,n$, and let $A(e_i)=e_{i+1}$. Then $m_A(x)=x^n-1$ while $\prod_1^n m_{e_i}(x)=(x^n-1)^n.$
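The cyclic-shift example can be checked by a small exact computation over $\Bbb Q$. The sketch below (pure Python with `fractions`; all helper names are my own) finds each $m_{e_i}$ by locating the first linear dependence among $e_i, Ae_i, A^2e_i,\ldots$:

```python
from fractions import Fraction

def mat_vec(A, v):
    # A given as a list of rows; returns A applied to v
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def combination_of_previous(vectors):
    """If the last vector is a rational combination of the earlier ones, return
    the coefficients; otherwise return None (Gaussian elimination over Q)."""
    *cols, rhs = vectors
    n, k = len(rhs), len(cols)
    M = [[Fraction(cols[j][i]) for j in range(k)] + [Fraction(rhs[i])]
         for i in range(n)]
    row, pivots = 0, []
    for col in range(k):
        piv = next((r for r in range(row, n) if M[r][col] != 0), None)
        if piv is None:
            continue
        M[row], M[piv] = M[piv], M[row]
        M[row] = [x / M[row][col] for x in M[row]]
        for r in range(n):
            if r != row and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
    if any(M[r][k] != 0 for r in range(row, n)):
        return None            # inconsistent system: last vector is independent
    coeffs = [Fraction(0)] * k
    for r, col in enumerate(pivots):
        coeffs[col] = M[r][k]
    return coeffs

def vector_min_poly(A, alpha):
    """Coefficients (low degree first) of the monic m_alpha with m_alpha(A)(alpha)=0."""
    iterates = [alpha]
    while True:
        iterates.append(mat_vec(A, iterates[-1]))
        c = combination_of_previous(iterates)
        if c is not None:      # A^d alpha = sum_i c_i A^i alpha
            return [-ci for ci in c] + [Fraction(1)]

# Cyclic shift on Q^3: A(e_i) = e_{i+1}, indices mod 3
A = [[0, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
polys = [vector_min_poly(A, e) for e in basis]
# each m_{e_i}(x) = x^3 - 1, so their lcm is x^3 - 1 = m_A,
# while their product (x^3 - 1)^3 has degree 9
```

Each `polys[i]` comes out as the coefficient list of $x^3-1$, confirming that the lcm has degree $n$ while the product has degree $n^2$.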

  • This seems needlessly complicated to me. I have no idea why $\gcd$'s would need to be considered here. You appear to use it to deduce that $m_{e_i}$ divides $m_A$, but that fact is already mentioned in the OP (because $m_A$ can be taken as $f$ satisfying $f(A)(e_i)=0$). Also I don't really get your argument for $m_A\mid f$ by the minimality of $m_A$, since $m_A$ is defined without reference to the vectors $e_i$. – Marc van Leeuwen Dec 15 '15 at 16:04
  • @MarcvanLeeuwen, $m_A\mid f$ because $f(A)=m_A(A)=0.$ We have $f(x)=m_A(x)g(x)+h(x)$ for some polynomials $g,h$ with $\deg(h)<\deg(m_A)$. Since $h(A)=0$ we must have $h=0$, otherwise we contradict the minimality of $\deg(m_A)$. – DanielWainfleet Dec 15 '15 at 21:21
  • Thanks, the argument for $m_A\mid f$ is now clear to me; it really just uses $f(A)=0$ which was already given in your first paragraph. (Don't know why I did not get this the first time around; maybe I was just confused by $f$ being used differently in this answer here than in OP.) Then since both $m_A\mid f$ and ($m_A(A)(e_i)=0$ leading to) $m_{e_i}\mid m_A$ are clear before the $h_i$ are even introduced, I think this emphasises my point that there is really no need to consider the $h_i$ at all. – Marc van Leeuwen Dec 16 '15 at 08:48
  • Often my answers, especially in algebra, are more complicated than necessary, typically because I include extra details for the sake of rigor. – DanielWainfleet Dec 16 '15 at 09:58

Let $A$ be the matrix of $\def\A{\mathscr A}\A$ with respect to the basis $\def\B{\mathcal B}\B=[\xi_1,\ldots,\xi_n]$. Then the minimal polynomial $m_\A$ is also the lowest degree monic polynomial such that $m_\A[A]$ is the zero matrix (of size$~n$). Also, since in coordinates with respect to$~\B$ each of its vectors $\xi_j$ is given by$~\def\e{\mathbf e}\e_j$, a standard basis vector of $\Bbb F^n$, the condition for some polynomial$~p\in\Bbb F[\lambda]$ that $p[\A](\xi_j)=0$ means $p[A]\cdot\e_j=0\in\Bbb F^n$, which just says that column$~j$ of the matrix $p[A]$ is zero. By assumption this happens precisely when $p$ is a (polynomial) multiple of the polynomial $m_{\xi_j}$.

Now $p[A]=0$ means that all columns of $p[A]$ are zero, which by the above means that $p$ is a common multiple of $m_{\xi_1},\ldots,m_{\xi_n}$. And $m_\A$ is also the lowest degree monic polynomial with that property, which by definition is the least common multiple of $m_{\xi_1},\ldots,m_{\xi_n}$. This is really all there is to it.
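The column criterion can be seen on a toy instance (numpy; the variable names are my own): with $A=\operatorname{diag}(1,2)$ we have $m_{\e_1}(\lambda)=\lambda-1$ and $m_{\e_2}(\lambda)=\lambda-2$, so $p=\lambda-1$ zeroes out column $1$ of $p[A]$ without making the whole matrix zero:

```python
import numpy as np

# A = diag(1, 2), so m_{e1}(x) = x - 1 and m_{e2}(x) = x - 2
A = np.diag([1.0, 2.0])
P = A - np.eye(2)                        # p[A] for p(x) = x - 1 = m_{e1}
col1_is_zero = np.allclose(P[:, 0], 0)   # m_{e1} divides p: column 1 vanishes
P_is_zero = np.allclose(P, 0)            # but m_{e2} does not divide p: p[A] != 0
```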

(In fact it was not necessary to use matrices, since $p[\A]=0$ just means that $p[\A](\xi_j)=0$ for all $j$, whence $\def\lcm{\operatorname{lcm}}\lcm(m_{\xi_1},\ldots,m_{\xi_n})\mid p$. But the matrix point of view makes this more visual.)


I should note that this suggests a way to compute $m_\A$ that is in general quite inefficient. The reason is that the polynomials $m_{\xi_j}$ are likely to be strongly interrelated, as can be seen from the fact that each one has a degree $d_j$ that could be as high as$~n$ (and for "generic" $\A$ and $\xi_j$ one has $d_j=n$), yet their $\lcm$ still cannot have degree more than$~n$ (by the Cayley-Hamilton theorem). Since $d_1+\cdots+d_n$ is usually much larger than$~n$, one sees that the $m_{\xi_j}$ are then far from being all relatively prime. What is going on is that the kernel of some $m_\alpha[\A]$, which by construction contains the vector $\alpha$, is also an $\A$-invariant subspace, so that it must contain all repeated images by$~\A$ of$~\alpha$. Those images $\A^i(\alpha)$ are linearly independent for $i<d=\deg(m_\alpha)$, so the kernel has dimension at least$~d$. In order to have the relation $m_\A=\lcm(m_{\alpha_1},\ldots,m_{\alpha_k})$ for certain vectors $\alpha_1,\ldots,\alpha_k$, it suffices that the subspace sum of those kernels fills up the whole space; with vectors running through a basis$~\B$ this is assured, but often much less (even a single vector) will suffice.

But one can even avoid doing any $\lcm$ at all. Having computed some $m_\alpha$, it is easy to see that the (quotient) factor $Q=m_\A/m_\alpha$ that is "missing" from the minimal polynomial is precisely the minimal polynomial of the restriction of $\A$ to the image subspace $W$ of $m_\alpha[\A]$, in other words the minimal degree monic polynomial$~p$ such that $W\subseteq\ker(p[\A])$. This leads to the following simple (at least in theory) algorithm for computing $m_\A$ in terms of a variable polynomial$~p$, initialised to $1$ and whose final value gives $m_\A$, and a variable subspace$~W$, initialised to $V$ and which ultimately becomes $\{0\}$:

  • While $\dim(W)>0$: choose a nonzero vector $w\in W$, and compute the polynomial $m_w$; replace $p$ by the product $pm_w$, and replace $W$ by its image $m_w[\A](W)$.
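A floating-point sketch of this loop in Python (numpy; the function names and the tolerance handling are my own assumptions, and a production version would use exact arithmetic over $\Bbb F$ instead):

```python
import numpy as np

def vector_min_poly(A, v, tol=1e-9):
    """Monic minimal polynomial of v w.r.t. A, coefficients low degree first."""
    K = v.reshape(-1, 1)                     # Krylov matrix [v, Av, A^2 v, ...]
    while True:
        nxt = A @ K[:, -1]
        aug = np.column_stack([K, nxt])
        if np.linalg.matrix_rank(aug, tol) == np.linalg.matrix_rank(K, tol):
            c = np.linalg.lstsq(K, nxt, rcond=None)[0]   # A^d v = K c
            return np.append(-c, 1.0)                    # x^d - sum_i c_i x^i
        K = aug

def poly_of_matrix(coeffs, A):
    """Evaluate p(A) by Horner's rule, coefficients low degree first."""
    P = np.zeros_like(A)
    for c in reversed(coeffs):
        P = P @ A + c * np.eye(A.shape[0])
    return P

def minimal_polynomial(A, tol=1e-9):
    """m_A via the loop above: p <- p * m_w, W <- m_w[A](W)."""
    p = np.array([1.0])                      # the constant polynomial 1
    W = np.eye(A.shape[0])                   # columns span W, initially all of V
    while W.shape[1] > 0:
        w = W[:, 0]                          # any nonzero vector of W
        m_w = vector_min_poly(A, w, tol)
        p = np.convolve(p, m_w)              # polynomial product p * m_w
        img = poly_of_matrix(m_w, A) @ W     # image subspace m_w[A](W)
        U, s, _ = np.linalg.svd(img)
        W = U[:, : int(np.sum(s > tol))]     # orthonormal basis of the image
    return p

# Example: A = diag(1, 1, 2) has m_A(x) = (x-1)(x-2) = x^2 - 3x + 2,
# strictly smaller than the characteristic polynomial (x-1)^2 (x-2)
A = np.diag([1.0, 1.0, 2.0])
mA = minimal_polynomial(A)
```

On this example the loop terminates after two passes, with `mA` holding the coefficients $[2,-3,1]$ of $\lambda^2-3\lambda+2$; no $\lcm$ computation is ever needed.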