
This question appeared in a past exam paper, in the form:

Let $X = (X_1,\dots,X_n)\in\mathbb{R}^n$ be an i.i.d. sample from $U[0,\theta]$, $\theta>0$.

Apply the Rao-Blackwell theorem to the unbiased estimator $2X_1$, using the statistic $X_{(n)} = \max_i\{X_i\}$, to compute the efficient estimator of $\theta$. (In other parts of the problem we showed that $X_{(n)}$ is a complete sufficient statistic.)

My working looks like this:

\begin{align} F_{X_{(n)}}(y) &= P(X_{(n)}\leq y)= F_{X_1}(y)^n \\ &= \left(\frac{y}{\theta}\right)^n\mathbf{1}(0\leq y\leq \theta) + \mathbf{1}(y>\theta)\\ f_{X_{(n)}}(y) &= \frac{ny^{n-1}}{\theta^n}\mathbf{1}(0\leq y\leq \theta)\\ F_{(X_{1},X_{(n)})}(x, y) &= P(X_1\leq x, X_{(n)}\leq y)\\ &= \left\{ \begin{array}{cl} \theta^{-n}xy^{n-1} & :0\leq x\leq y\leq \theta \\ \theta^{-n}y^n &: 0\leq y\leq x\leq \theta \end{array} \right.\\ f_{(X_{1},X_{(n)})}(x, y) &= \theta^{-n}(n-1)y^{n-2}\mathbf{1}(0\leq x\leq y\leq \theta)\\ f_{X_{1}|X_{(n)}}(x\mid y) &=\frac{\theta^{-n}(n-1)y^{n-2}}{\theta^{-n}ny^{n-1}}\mathbf{1}(0\leq x\leq y)\\ &= \frac{n-1}{n}y^{-1}\mathbf{1}(0\leq x\leq y)\\ \mathbf{E}(X_1\mid X_{(n)}=y) &=\frac{n-1}{n}\int_0^y\frac{x}{y}\,\mathrm dx \\ &=\frac{n-1}{n}\frac{y}{2} \\ \mathbf{E}(2X_1\mid X_{(n)}) &= \frac{n-1}{n} X_{(n)} \end{align} According to the Rao-Blackwell theorem, this should yield an unbiased estimator for $\theta$. Unfortunately, it is not unbiased, and it also always takes values strictly below $\theta$, since $X_{(n)}<\theta$.

Playing around with the result, I found that $\frac{n+1}{n} X_{(n)}$ does the job and makes sense, but I can't figure out where I went wrong in my calculations and would be very grateful if someone could point my error out.
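A quick Monte Carlo sanity check (my addition, not part of the original post) confirms the bias numerically: for $\theta = 1$, the estimator $\frac{n-1}{n}X_{(n)}$ falls well short of $\theta$ on average, while $\frac{n+1}{n}X_{(n)}$ is centered on it.

```python
# Monte Carlo comparison of the two candidate estimators of theta.
# Assumes theta = 1; X_(n) is the sample maximum.
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 1.0, 5, 200_000
samples = rng.uniform(0.0, theta, size=(reps, n))
xmax = samples.max(axis=1)

est_wrong = (n - 1) / n * xmax  # the estimator my derivation produced
est_right = (n + 1) / n * xmax  # the estimator that "does the job"

print(est_wrong.mean())  # noticeably below theta
print(est_right.mean())  # close to theta
```

Since $E[X_{(n)}] = \frac{n}{n+1}\theta$, the first estimator has mean $\frac{n-1}{n+1}\theta$, which the simulation reproduces.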

1 Answer


The mistake you made is slightly subtle, especially if one is used to thinking (wrongly) that distributions are either continuous or discrete but never a mixture of both... Let me try to explain what is going on.

To simplify notation, assume that $\theta=1$ and write $Z=(X_1,X_{(n)})$. Your computation of $F_Z$ is correct, but you then assume that the distribution of $Z$ has a density $f_Z$. This is only partly true: one part of the distribution of $Z$, of mass $\frac{n-1}n$, indeed has the density you wrote down as $f_Z$, but there is a second part, with no density, carried by the event $[X_1=X_{(n)}]$, of mass $\frac1n$.

The most convenient way to describe the distribution of $Z$ might be to specify that, for every measurable bounded function $u$, $$ E(u(Z))=\iint_{\mathbb R^2} u(x,y)f(x,y)\,\mathrm dx\mathrm dy+\int_{\mathbb R} u(x,x)g(x)\,\mathrm dx, $$ with $$ f(x,y)=(n-1)y^{n-2}\,\mathbf 1_{0\lt x\lt y\lt 1},\qquad g(x)=x^{n-1}\,\mathbf 1_{0\lt x\lt1}. $$ Note that $f$ and $g$ are nonnegative and such that $$ \iint_{\mathbb R^2} f(x,y)\,\mathrm dx\mathrm dy=\frac{n-1}n,\qquad\int_\mathbb R g(x)\,\mathrm dx=\frac1n, $$ whose sum is $1$, hence the identity above, valid for every suitable $u$, indeed defines a distribution.
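As a numerical illustration (my addition): by exchangeability, the atom on $[X_1=X_{(n)}]$ is just the event that $X_1$ is the sample maximum, so its mass $\frac1n$ (and the complementary mass $\frac{n-1}n$ of the continuous part) can be checked by simulation.

```python
# Estimate P(X_1 = X_(n)) by simulation (theta = 1).
# By symmetry each coordinate is equally likely to be the maximum,
# so the atom on the diagonal x = y carries mass 1/n.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 4, 400_000
samples = rng.uniform(size=(reps, n))
atom_mass = (samples.argmax(axis=1) == 0).mean()  # ~ 1/n
continuous_mass = 1.0 - atom_mass                 # ~ (n-1)/n

print(atom_mass, continuous_mass)
```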

The distribution of $X_{(n)}$ has, as you computed, density $h$, where $$ h(y)=\int_0^yf(x,y)\,\mathrm dx+g(y)=ny^{n-1}\,\mathbf 1_{0\lt y\lt1}. $$ In this context, a formulation of the conditional distribution of $X_1$ given $[X_{(n)}=y]$, for $y$ in $(0,1)$, is $$ P(X_1\in\mathrm dx\mid X_{(n)}=y)=\mu_y(\mathrm dx), $$ where $$ \mu_y(\mathrm dx)=\frac{f(x,y)}{h(y)}\,\mathrm dx+\frac{g(x)}{h(y)}\,\delta_y(\mathrm dx)=\frac{n-1}n\frac1y\,\mathbf 1_{0\lt x\lt y}\,\mathrm dx+\frac1n\,\delta_y(\mathrm dx). $$ This means, in particular, that $$ E(X_1\mid X_{(n)})=v(X_{(n)}), $$ where, for $y$ in $(0,1)$, $$ v(y)=\int_\mathbb Rx\,\mu_y(\mathrm dx)=\int_0^yx\,\frac{n-1}n\frac1y\,\mathrm dx+\frac1n\,y=\frac{n+1}{2n}\,y, $$ hence $$ E(X_1\mid X_{(n)})=\frac{n+1}{2n}\,X_{(n)}, $$ which reconciles your result with the Rao-Blackwell theorem. We happy.
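The conditional-expectation formula can also be checked empirically (my addition): conditioning approximately on $X_{(n)}$ falling in a narrow window around $y$, the sample mean of $X_1$ should be close to $\frac{n+1}{2n}y$.

```python
# Check E(X_1 | X_(n) = y) ~ (n+1)/(2n) * y by binning on X_(n), theta = 1.
import numpy as np

rng = np.random.default_rng(2)
n, reps = 4, 2_000_000
samples = rng.uniform(size=(reps, n))
xmax = samples.max(axis=1)

y = 0.81                             # window centre (arbitrary choice)
mask = np.abs(xmax - y) < 0.01       # condition on X_(n) near y
cond_mean = samples[mask, 0].mean()  # empirical E(X_1 | X_(n) ~ y)
predicted = (n + 1) / (2 * n) * y

print(cond_mean, predicted)
```

The agreement reflects the mixture structure: with probability $\frac1n$ the conditioned $X_1$ sits at the atom $y$, and with probability $\frac{n-1}n$ it is uniform on $(0,y)$.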

Did
  • Thank you for the very clear explanation. This has helped a lot. – Damian Pavlyshyn Jun 22 '14 at 07:22
  • You are welcome; this is a nice question, and the fact that you showed your attempts was a motivation to answer it. The mystery to me is which parts of this the authors of the exam were expecting to read in the papers... – Did Jun 22 '14 at 09:29