
I'm reading the introductory bits in Procesi's Lie Groups, and on p. 22 we have (paraphrasing)

Theorem 2. $\mathcal{B}=\{x_1^{\large h_1}\cdots x_n^{\large h_n}: 0\le h_k\le n-k\}$ is a basis for the ring $\Bbb Z[x_1,\dots,x_n]$ considered over $\Bbb Z[e_1,\dots,e_n]$, where $e_i$ are the elementary symmetric polynomials in the $x_i$.
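To make the statement concrete, here is a quick sanity check for $n=2$ (a sympy sketch of my own, not from Procesi): by my reading of the exponent condition, the basis in that case is $\mathcal{B}=\{1,x_1\}$, so every monomial should reduce to a $\Bbb Z[e_1,e_2]$-combination of $1$ and $x_1$.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
e1, e2 = x1 + x2, x1 * x2          # elementary symmetric polynomials for n = 2

# For n = 2 the condition 0 <= h_k <= n-k gives B = {1, x1}.  The monomials
# x2 and x1^2 should then be Z[e1,e2]-combinations of 1 and x1:
assert sp.expand(x2 - (e1 * 1 - 1 * x1)) == 0          # x2   = e1 - x1
assert sp.expand(x1**2 - (e1 * x1 - e2 * 1)) == 0      # x1^2 = e1*x1 - e2
```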

I haven't been able to see why this is true. The previous theorem was the fundamental theorem of symmetric polynomials, which was proven inductively with a recursive algorithm:

If $x_n\mid f$ then $x_1\cdots x_n\mid f$, and dividing out we are left with a symmetric polynomial of smaller degree than before. Otherwise, write $f(x_1,\dots,x_{n-1},0)$ as a polynomial $p$ in the elementary symmetric polynomials $\hat{e}_i$ of the first $n-1$ variables, $p(\hat{e}_1,\dots,\hat{e}_{n-1})$. Now the polynomial $$f(x_1,\dots,x_n)-p(e_1,\dots,e_{n-1})$$ is symmetric in all of $x_1,\dots,x_n$ and evaluates to $0$ at $x_n=0$, i.e., is divisible by $x_n$. Induct.
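For what it's worth, this recursion can be sketched in sympy; the function names (`elementary`, `to_elementary`) and the placeholder symbols `E1, E2, E3` standing for $e_1,e_2,e_3$ are my own, not from the book.

```python
import sympy as sp

def elementary(xs):
    """Return [e_1, ..., e_n], the elementary symmetric polynomials in xs."""
    z = sp.Symbol('z')
    # prod (z + x_i) = z^n + e_1 z^{n-1} + ... + e_n
    return sp.Poly(sp.prod(z + x for x in xs), z).all_coeffs()[1:]

def to_elementary(f, xs, es):
    """Rewrite the symmetric polynomial f in the variables xs as a polynomial
    in the symbols es (standing for e_1, ..., e_n), following the recursion
    described above."""
    f = sp.expand(f)
    if not f.free_symbols & set(xs):
        return f                                  # constant: nothing to do
    if len(xs) == 1:
        return f.subs(xs[0], es[0])               # e_1 = x_1
    # Write f(x_1, ..., x_{n-1}, 0) in the elementary symmetric polynomials
    # of the first n-1 variables.
    p = to_elementary(sp.expand(f.subs(xs[-1], 0)), xs[:-1], es[:-1])
    e = elementary(xs)
    # f - p(e_1, ..., e_{n-1}) is symmetric and vanishes at x_n = 0, hence is
    # divisible by x_n, hence by e_n = x_1 ... x_n.  Divide out and recurse.
    g = sp.expand(f - p.subs(list(zip(es[:-1], e[:-1]))))
    q = sp.cancel(g / e[-1])
    assert sp.expand(q * e[-1] - g) == 0          # the division is exact
    return sp.expand(p + es[-1] * to_elementary(q, xs, es))

# Example: the power sum x1^2 + x2^2 + x3^2 equals e1^2 - 2 e2.
x1, x2, x3 = sp.symbols('x1 x2 x3')
E1, E2, E3 = sp.symbols('E1 E2 E3')
res = to_elementary(x1**2 + x2**2 + x3**2, [x1, x2, x3], [E1, E2, E3])
assert sp.expand(res - (E1**2 - 2*E2)) == 0
```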

Is there a straightforward adaptation of this with which we can argue for theorem 2? Or is there perhaps another way to see that it must be true? I feel I am missing something simple here.

user26857
  • 52,094
anon
  • 151,657
  • Dear anon: You may look at Part 2 of Section G of Galois Theory: Lectures Delivered at the University of Notre Dame, by Emil Artin, freely and legally available here. – Pierre-Yves Gaillard Apr 16 '12 at 09:27
  • @Pierre-Yves Cool, thanks. – anon Apr 16 '12 at 09:41
  • Dear anon: In fact Artin proves the statement over $\mathbb Q$. Some extra work is needed to prove it over $\mathbb Z$. It is proved in Bourbaki (Alg. IV.6.5, Prop. 5). I wrote a short text about this. – Pierre-Yves Gaillard Apr 16 '12 at 12:04
  • The above link will expire soon. Here is the new link. – Pierre-Yves Gaillard Sep 08 '14 at 17:23
  • @Pierre-YvesGaillard I'm looking at the (wayback-archived version) of your self-contained proofs, and I cannot follow the proof of Theorem 3. At the end of that proof, you say that "$\varphi$ and $\psi$ are inverse isomorphisms". Why is that the case? Wouldn't that argument show that any spanning set of a module is a basis? – darij grinberg Sep 22 '21 at 20:03
  • @darijgrinberg - Thanks! I haven't thought about these kinds of things for a very long time, and I don't understand what I wrote, but I'm sure you're right and my text is incorrect. I'll try to fix it, but it might take a long time. I'll let you know. – Pierre-Yves Gaillard Sep 22 '21 at 22:40
  • @darijgrinberg - I don't think it's possible to salvage my text. Instead I tried to describe Bourbaki's proof (which I saw that you checked) in an answer in this thread: https://math.stackexchange.com/a/4261642/660. – Pierre-Yves Gaillard Sep 27 '21 at 12:01
  • @Pierre-YvesGaillard Thank you! I'll take a look at it tonight; it looks like it will make a good reference for my article. – darij grinberg Sep 27 '21 at 13:21

3 Answers


I suppose I should actually come back to answer this. Main idea: writing $\Bbb Z[x_1,\dots,x_n]^{S_d}$ for the subring of polynomials invariant under permutations of the variables $x_1,\dots,x_d$ (with $x_{d+1},\dots,x_n$ left fixed), for $1\le d<n$ we have

$$\Bbb Z[x_1,\dots,x_n]^{S_d}=\bigoplus_{j=0}^d \Bbb Z[x_1,\dots,x_n]^{S_{d+1}}x_{d+1}^j. \tag{$\circ$}$$

Both sides are equal to $\Bbb Z[x_1,\dots,x_n]^{S_{d+1}}[x_{d+1}]$. Using $(\circ)$ we can begin unpeeling $\Bbb Z[x_1,\dots,x_n]$ by setting $d=1,2,\dots,n-1$. What I found interesting about this argument is that the outer layers of the onion are the smaller indices rather than the larger; initially I believed it would be the reverse.
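For instance, for $n=2$ and $d=1$ the decomposition $(\circ)$ can be made explicit: every $f$ splits as $a+b\,x_2$ with $a,b$ symmetric, via $b=(f-f^\sigma)/(x_2-x_1)$, where $\sigma$ swaps the two variables. A small sympy sketch (the helper name `decompose` is mine):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')

def decompose(f):
    """Write f = a + b*x2 with a, b symmetric in x1, x2."""
    f_swapped = f.subs({x1: x2, x2: x1}, simultaneous=True)
    b = sp.cancel((f - f_swapped) / (x2 - x1))   # antisymmetric / (x2 - x1)
    a = sp.expand(f - b * x2)
    return a, b

f = x1**3 + 2*x2**2 + x1*x2
a, b = decompose(f)
swap = {x1: x2, x2: x1}
# a and b are symmetric, and f = a + b*x2:
assert sp.expand(a - a.subs(swap, simultaneous=True)) == 0
assert sp.expand(b - b.subs(swap, simultaneous=True)) == 0
assert sp.expand(f - (a + b * x2)) == 0
```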

  • I don't understand what the notation $\mathbb{Z}[x_1, \dots, x_n]^{S_{d+1}}$ means. How does $S_{d+1}$ act on $\mathbb{Z}[x_1, \dots, x_n]$? – Kofi Oct 23 '23 at 07:40

I'm using this fact in a paper I'm writing, so I've had to find references. Here are the ones I found:

Here are two further references that prove the analogue of this result for $\mathbb{Q}$ instead of $\mathbb{Z}$:


Here is a proof of the theorem in Bourbaki's Algebra II mentioned by Darij Grinberg in his answer above. Let me paste Darij's reference: Chapter IV, § 6, no. 1, Theorem 1 c) of Nicolas Bourbaki, Algebra II: Chapters 4--7, Springer 2003. I very much like the proofs of Theorem (5.10), p. 16, and Proposition (1.3), p. 62, in Darij's first reference (namely D. Laksov, A. Lascoux, P. Pragacz, and A. Thorup, The LLPT Notes, edited by A. Thorup, 1995--2018). I decided to post this answer only because Bourbaki's books are not freely available. I followed Bourbaki very closely, but I tried to give slightly more details. However, I did not understand Bourbaki's proof of Statement (a), and I hope the argument below is correct.

Let $A$ be a commutative ring with one; let $x_1,\ldots,x_n$ be indeterminates; set $$ E:=A[x_1,\ldots,x_n]; $$ let $s_0,\ldots,s_n$ be the elementary symmetric polynomials [Wikipedia entry] in $x_1,\ldots,x_n$; let $t_0,\ldots,t_{n-1}$ be the elementary symmetric polynomials in $x_1,\ldots,x_{n-1}$; let $G$ be the group formed by the $A$-automorphisms of $E$ which permute the $x_1,\ldots,x_n$; let $H$ be the subgroup of $G$ formed by the $A$-automorphisms of $E$ which permute the $x_1,\ldots,x_{n-1}$; consider the sub-$A$-algebras of invariants $E^G\subset E^H$; for $\alpha\in\mathbb N^n$ set $x^\alpha:=x_1^{\alpha_1}\cdots x_n^{\alpha_n}$; let $\mathcal B$ be the subset of $E$ formed by the $x^\alpha$ with $\alpha_i<i$ for all $i$; and let $\mathcal C$ be the subset of $A[x_1,\ldots,x_{n-1}]$ formed by the $x^\alpha$ with $\alpha_i<i$ for all $i$.

Theorem. (a) The $A$-algebra $E^G$ is generated by $s_1,\ldots,s_n$, that is $E^G=A[s_1,\ldots,s_n]$.

(b) The elements $s_1,\ldots,s_n$ of $E$ are algebraically independent over $A$.

(c) The set $\mathcal B$ is a basis of the $E^G$-module $E$.

Proof. We argue by induction on $n$, the case $n=0$ being obvious. By induction hypothesis (applied to $A[x_n]$ instead of $A$) we have

(A) The $A[x_n]$-algebra $E^H$ is generated by $t_1,\ldots,t_{n-1}$, that is $E^H=A[t_1,\ldots,t_{n-1},x_n]$.

(B) The elements $t_1,\ldots,t_{n-1}$ of $E$ are algebraically independent over $A[x_n]$.

(C) The set $\mathcal C$ is a basis of the $E^H$-module $E$.

We clearly have

$(1)\qquad s_k=t_k+t_{k-1}x_n\qquad(1\le k\le n-1).$

We claim

$(2)\qquad t_k=\displaystyle\sum_{i=0}^k\ (-1)^{k-i}\,s_i\,x_n^{k-i}$

for $1\le k\le n-1$. We prove $(2)$ by induction on $k$, the case $k=1$ being obvious. We have $$ \sum_{i=0}^k\ (-1)^{k-i}\,s_i\,x_n^{k-i}=\sum_{i=0}^{k-1}\ (-1)^{k-i}\,s_i\,x_n^{k-i}+s_k=x_n\sum_{i=0}^{k-1}\ (-1)^{k-i}\,s_i\,x_n^{k-1-i}+s_k. $$ By the induction hypothesis this is equal to $-x_n\,t_{k-1}+s_k$, and thus to $t_k$ by $(1)$. This proves $(2)$.
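Identities $(1)$ and $(2)$ are easy to verify by machine for a small case; here is a sympy check for $n=4$ (a sketch of my own, not part of the proof):

```python
import sympy as sp

x = sp.symbols('x1:5')           # x1, x2, x3, x4
z = sp.Symbol('z')

def elem_sym(vs):
    """[e_0, e_1, ..., e_m] read off from prod (z + v)."""
    return sp.Poly(sp.prod(z + v for v in vs), z).all_coeffs()

s = elem_sym(x)                  # s[k] = s_k in x1, ..., x4
t = elem_sym(x[:-1])             # t[k] = t_k in x1, ..., x3
xn = x[-1]

for k in range(1, 4):
    # (1):  s_k = t_k + t_{k-1} x_n
    assert sp.expand(s[k] - (t[k] + t[k-1] * xn)) == 0
    # (2):  t_k = sum_{i=0}^k (-1)^{k-i} s_i x_n^{k-i}
    rhs = sum((-1)**(k - i) * s[i] * xn**(k - i) for i in range(k + 1))
    assert sp.expand(t[k] - rhs) == 0
```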

Now (A) and $(2)$ imply that the $A[x_n]$-algebra $E^H$ is generated by $s_1,\ldots,s_{n-1}$.

By (B) and (A), there is an endomorphism $u$ of the $A[x_n]$-algebra $E^H$ such that

$(3)\qquad u(t_k)=\displaystyle\sum_{i=0}^k\ (-1)^{k-i}\,t_i\,x_n^{k-i}\qquad(1\le k\le n-1),$

and $(1)$ yields $u(s_k)=u(t_k)+u(t_{k-1})\,x_n$, hence $$ u(s_k)=u(t_k)+u(t_{k-1})\,x_n=\sum_{i=0}^k\ (-1)^{k-i}\,t_i\,x_n^{k-i}+\sum_{i=0}^{k-1}\ (-1)^{k-1-i}\,t_i\,x_n^{k-i}=t_k, $$ that is $u(s_k)=t_k$, and (B) implies that the elements $s_1,\ldots,s_{n-1}$ of $E$ are algebraically independent over $A[x_n]$. Hence the $A[x_n]$-algebra $E^H$ is generated by the algebraically independent elements $s_1,\ldots,s_{n-1}$, hence

(D) the elements $s_1,\ldots,s_{n-1},x_n$ of $E$ are algebraically independent over $A$ and generate the $A$-algebra $E^H$, so that we have $E^H=A[s_1,\ldots,s_{n-1},x_n]$.

Let us prove (a). [As indicated above, I did not understand Bourbaki's proof of (a).] Let $y_1,\ldots,y_n$ be indeterminates. By (D), for each $f\in E^H$ there is a unique $f_*\in A[y_1,\ldots,y_n]$ such that $$ f=f(x_1,\ldots,x_n)=f_*(s_1,\ldots,s_{n-1},x_n) $$ and $f\mapsto f_*$ is an $A$-algebra isomorphism from $E^H$ onto $A[y_1,\ldots,y_n]$ satisfying $(x_n)_*=y_n$. Denote the $y_n$-degree of $f_*$ by $d(f)$. Assume (a) is false and let $f\in E^G$ be a polynomial which is not in the sub-$A$-algebra $A[s_1,\ldots,s_n]$ of $E$ generated by $s_1,\ldots,s_n$. We may assume that $d(f)$ is minimal subject to these conditions. We clearly have $d(f)\ge1$. Write $$ f_*=f_*(y_1,\ldots,y_n)=\sum_{i=0}^{d(f)}f_{*,i}(y_1,\ldots,y_{n-1})\,y_n^i. $$ Let $g\in E^H$ satisfy $g_*=f_{*,0}$, and thus
$$ g=f_{*,0}(s_1,\ldots,s_{n-1})\in A[s_1,\ldots,s_{n-1}]\subset A[s_1,\ldots,s_n]\subset E^G. $$ We see successively that $y_n$ divides $f_*-g_*$, that $x_n$ divides $f-g$ [because $(x_n)_*=y_n$], that $x_i$ divides $f-g$ for all $i$ [because $f-g$ is $G$-invariant], and that $s_n$ divides $f-g$. In particular $h:=(f-g)/s_n$ is a well-defined element of $E^G$. We have $$ f-g=s_n\,h=x_n\,t_{n-1}\,h=\left((-1)^{n-1}\,x_n^n+\sum_{i=1}^{n-1}\ (-1)^{n-1-i}\,s_i\,x_n^{n-i}\right)h, $$ the third equality following from $(2)$. This entails $$ f_*-g_*=\left((-1)^{n-1}\,y_n^n+\sum_{i=1}^{n-1}\ (-1)^{n-1-i}\,y_i\,y_n^{n-i}\right)h_*, $$ and thus $d(h)=d(f)-n<d(f)$. Moreover $h$ is in $E^G$ but not in $A[s_1,\ldots,s_n]$ [otherwise $f=g+s_n\,h$ would lie in $A[s_1,\ldots,s_n]$], so $h$ contradicts the minimality of $d(f)$, and (a) is proved.

Let us prove (b). Let $z$ be an indeterminate. Replacing $z$ with $x_n$ in the well known and straightforward equality $$ (z-x_1)\cdots(z-x_n)=\sum_{k=0}^n\ (-1)^{n-k}\,s_{n-k}\,z^k $$ we get $$ (-1)^{n+1}\,s_n=x_n^n+\sum_{k=1}^{n-1}\ (-1)^{n-k}\,s_{n-k}\,x_n^k. $$ In view of Lemma 2 below (applied to $A[s_1, \ldots , s_{n-1}]$, $x_n$, $s_n$ and $n$ instead of $A$, $x$, $b$ and $\beta$), this implies that
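As a sanity check, the displayed relation can be verified with sympy for $n=3$ (a sketch; the proof of course does not depend on it):

```python
import sympy as sp

x = sp.symbols('x1:4')           # x1, x2, x3
z = sp.Symbol('z')
# s = [s_0, s_1, s_2, s_3] read off from (z + x1)(z + x2)(z + x3)
s = sp.Poly(sp.prod(z + v for v in x), z).all_coeffs()

n, xn = 3, x[-1]
# (-1)^{n+1} s_n = x_n^n + sum_{k=1}^{n-1} (-1)^{n-k} s_{n-k} x_n^k
lhs = (-1)**(n + 1) * s[n]
rhs = xn**n + sum((-1)**(n - k) * s[n - k] * xn**k for k in range(1, n))
assert sp.expand(lhs - rhs) == 0
```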

(E) the set $\{1,x_n,\ldots,x_n^{n-1}\}$ is a basis of the $A[s_1,\ldots,s_n]$-module $A[s_1,\ldots,s_{n-1},x_n]=E^H$ [see (D)]

and that $s_n$ is transcendental over $A[s_1,\ldots,s_{n-1}]$. Since the elements $s_1,\ldots,s_{n-1}$ are algebraically independent over $A$ by (D), the proof of (b) is complete.

Finally let us prove (c). Denote the tower of $A$-algebras $$ E^G=A[s_1,\ldots,s_n]\subset E^H=A[s_1,\ldots,s_{n-1},x_n]\subset E $$ by $B\subset C\subset D$. By (E) the set $\{1,x_n,\ldots,x_n^{n-1}\}$ is a basis of the $B$-module $C$ and by (C) the set $\mathcal C$ is a basis of the $C$-module $D$. This implies (c).

Lemma 1. Let $A$ be a commutative ring with one, let $x$ and $y$ be indeterminates, let $b\in A[x]$ be a polynomial of degree $\beta\ge1$ whose leading coefficient is invertible, and let $f$ be in $A[x]$. Then there is a unique representation of $f$ of the form

$(3)\qquad\displaystyle f(x)=\sum_{i=0}^{\beta-1}\ x^i\,g_i(b(x))$

with $g_i\in A[y]$.

Proof. Denoting by $a_{ij}$ the coefficient of $y^j$ in $g_i$ we can rewrite $(3)$ as $$ f(x)=\sum_{i=0}^{\beta-1}\sum_{j=0}^\infty\ a_{ij}\,x^i\,b(x)^j=\sum_{j=0}^\infty\left(\sum_{i=0}^{\beta-1}a_{ij}\,x^i\right)b(x)^j=\sum_{j=0}^\infty\ r_j(x)\,b(x)^j $$ with $\deg r_j<\beta$. Thus it suffices to show the existence and uniqueness of a representation of $f$ of the form

$(4)\qquad\displaystyle f(x)=\sum_{j=0}^\infty\ r_j(x)\,b(x)^j$

with $\deg r_j<\beta$ for all $j$ and $r_j=0$ for $j$ large enough. We prove this by induction on $\deg f$, using the fact that, since the leading coefficient of $b$ is invertible, $b$ is not a zero divisor and division with remainder by $b$ is well defined.

Case $\deg f<0$, that is $f=0$: Assume for contradiction that $0=\sum_{j=n}^\infty r_j(x)\,b(x)^j$ with $r_n\ne0$. Dividing by $b(x)^n$ (possible since $b$ is not a zero divisor) gives $0=\sum_{j=0}^\infty r_{j+n}(x)\,b(x)^j$, so $b$ divides $r_n$, which is impossible because $0\le\deg r_n<\beta=\deg b$.

Case $\deg f\ge0$: Let $r_0$ be the remainder of $f$ divided by $b$. By induction hypothesis the polynomial $(f-r_0)/b$ has a unique representation of the form $(4)$, and the result follows quickly.
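The inductive construction in this case is just repeated division with remainder by $b$; here is a sympy sketch (the helper name `b_adic` is mine):

```python
import sympy as sp

x = sp.Symbol('x')

def b_adic(f, b):
    """Return [r_0, r_1, ...] with f = sum_j r_j * b**j and deg r_j < deg b,
    by repeated division with remainder by b (well defined because the
    leading coefficient of b is invertible)."""
    f, b = sp.Poly(f, x), sp.Poly(b, x)
    rs = []
    while not f.is_zero:
        f, r = f.div(b)              # f <- quotient, r <- remainder
        rs.append(r.as_expr())
    return rs

b = x**2 + 1
f = x**5 + 3*x**3 - x + 7
rs = b_adic(f, b)
assert sp.expand(f - sum(r * b**j for j, r in enumerate(rs))) == 0
assert all(sp.degree(r, x) < 2 for r in rs)
```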

Lemma 2. Let $A$ be a commutative ring with one, let $x$ be an indeterminate, let $b\in A[x]$ be a polynomial of degree $\beta\ge1$ whose leading coefficient is invertible. Then, the set $\{1,x,\ldots,x^{\beta-1}\}$ is a basis of the $A[b]$-module $A[x]$, and the polynomial $b$ is transcendental over $A$.

Proof. This is merely Lemma 1, rewritten abstractly.

  • I'm not quite sure how you get $d(h) < d(f)$ in the proof of (a). But I can see the same argument being made with the usual total degree $\deg$ instead of $d$, so it's not a serious problem. – darij grinberg Sep 27 '21 at 18:26
  • Nice writeup! I've made some other changes, which hopefully are not corrupting the spirit of the answer. (I find the last step to (E) slightly steep, as the $x_n$ is not literally a new indeterminate for the ring $A[s_1, \ldots, s_{n-1}]$ but merely an element transcendental over this ring. But I don't know how I would explain this any better.) – darij grinberg Sep 27 '21 at 18:44
  • (I believe your proof of (a) is the same as Bourbaki's, although just as you, I'm not 100% sure of what Bourbaki's is. Fortunately, part (a) is the part with the easiest-to-find proofs in the literature.) – darij grinberg Sep 27 '21 at 18:48
  • @darijgrinberg - Thank you very much for everything! I tried to fix the proof of (a). – Pierre-Yves Gaillard Sep 28 '21 at 00:31
  • I hope it's correct; I'm out of time. But I'd have argued using total degree, as it's much simpler (multiplication by $s_n = x_1 x_2 \cdots x_n$ clearly increases the total degree by $n$). Thanks once again for the useful writeup! – darij grinberg Sep 28 '21 at 14:16