1

Let's say that we have a sample $X = (X_1,\dots,X_n)$ from the distribution given by: (or rather "drawn from" or maybe "defined by"? How should I phrase it correctly in English? I would be grateful for advice in the comments :-))

$$f(x) = \frac{1}{\sigma}\exp\left\{-\frac{x-m}{\sigma} \right \}\mathbf{1}_{(m,\infty)}(x)$$

From this, we can write the density of the whole sample:

$$f(X) = \frac{1}{\sigma^n}\exp \left \{{\frac{mn}{\sigma}}\right\}\exp\left\{-\frac{n \overline{X}}{\sigma} \right \}\mathbf{1}_{(m,\infty)}(X_{1:n}),$$

where $X_{1:n} = X_{(1)} = \min\{X_1,\dots,X_n\}$ (I am not sure how this is usually denoted in the English literature).
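In detail, the joint density is the product of the individual densities, and the product of the indicators equals one exactly when the minimum exceeds $m$:

$$\prod_{i=1}^{n} \frac{1}{\sigma}\exp\left\{-\frac{x_i-m}{\sigma}\right\}\mathbf{1}_{(m,\infty)}(x_i) = \frac{1}{\sigma^n}\exp\left\{-\frac{\sum_{i=1}^n x_i - nm}{\sigma}\right\}\mathbf{1}_{(m,\infty)}(X_{1:n}),$$

and writing $\sum_{i=1}^n x_i = n\overline{X}$ and splitting the exponential gives the form above.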

We can see from here (by the Factorization Theorem) that we need both $X_{1:n}$ and, for example, $\overline{X}$ for our sufficient statistic of the parameter $\theta=(m,\sigma)$. However, how do we denote it?

Is it $T(X) = (X_{1:n},\overline{X})$, since $X_{1:n}$ would be the sufficient statistic for $m$ if we knew $\sigma$, so we put it first? Or maybe the order doesn't matter and we are free to write $T(X) = (\overline{X},X_{1:n})$? Or maybe there is a reason to prefer the second form?


Extra question:

If we consider the ratio $\frac{f(X)}{f(Y)}$ for two samples $X$ and $Y$, we quickly obtain a minimal sufficient statistic. Can we somehow "quickly" tell whether it is a complete statistic, as we can for certain types of exponential families (where we can apply the Lehmann–Scheffé theorem)?

Are there some nice "tricks" for showing that there is no completeness with distributions like this one, i.e. when the density's support depends on the parameter?
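As a small numerical sketch of the statistic in question (the parameter values $m = 2$, $\sigma = 1.5$ and the sample size are my own illustrative choices): simulate a sample from this shifted exponential, compute $T(X) = (X_{1:n}, \overline{X})$, and note that the MLEs can be recovered from $T(X)$ as $\hat m = X_{1:n}$ and $\hat\sigma = \overline{X} - X_{1:n}$.

```python
import random

# Illustrative values (my own choices): true m = 2.0, sigma = 1.5.
random.seed(42)
m, sigma, n = 2.0, 1.5, 100_000

# X = m + Exp(sigma); random.expovariate takes the rate 1/sigma.
sample = [m + random.expovariate(1.0 / sigma) for _ in range(n)]

x_min = min(sample)       # X_{1:n}, the sample minimum
x_bar = sum(sample) / n   # the sample mean

# MLEs expressed through the sufficient statistic T(X) = (x_min, x_bar):
m_hat = x_min
sigma_hat = x_bar - x_min

print(f"T(X) = ({x_min:.4f}, {x_bar:.4f})")
print(f"m_hat = {m_hat:.4f}, sigma_hat = {sigma_hat:.4f}")
```

With a large sample, both estimates land close to the true values, which is consistent with BruceET's comment that the MLEs are functions of the sufficient statistics.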

Kusavil
    'given by the density function...' works nicely. // For the minimum, notations $X_{1:n}$ and $X_{(1)}$ are both used, usually with definition unless the topic of order statistics is obviously the context. // Sufficient statistics for $(m, \sigma)$ are the minimum and the sum (or mean). MLEs can be expressed in terms of the sufficient statistics // The distribution is called the 'shifted exponential' or 'delayed exponential' distribution. – BruceET Feb 05 '18 at 07:21
    If you take the sufficient statistic $(X_{(1)},\sum (X_i-X_{(1)}))$, then this statistic is complete. See https://math.stackexchange.com/questions/3505396/complete-sufficient-statistic-for-double-parameter-exponential/. – StubbornAtom Jan 12 '20 at 18:56

2 Answers

2

It is common for the order of the statistics in the MSS vector to correspond to the order of the parameters. I.e., if the vector of unknown parameters is $(\sigma, \mu)$, then $(\bar{X}_n, X_{(1)})$ would be the natural choice. There is no mathematical reason for this; it is just a convenient convention.

V. Vancak
  • And here you list the statistics in opposite order to the parameters: $X_{(1)}$ is MLE for $m$ (or $\mu$). – NCh Feb 06 '18 at 01:54
1

Sufficient statistics are not unique. For a given parametric distribution, there are infinitely many different sufficient statistics.

For example, the iid sample $\boldsymbol x = (x_1, \ldots, x_n)$ is always trivially sufficient--although there is no data reduction achieved. Any permutation of the observations in the sample is also sufficient. Some sufficient statistics achieve some data reduction but not the maximum possible.

Minimal sufficient statistics--i.e., those achieving the maximum possible data reduction--are again not unique; any bijective function of a minimal sufficient statistic is also minimal sufficient, and this includes vector-valued statistics. In particular, a permutation of the components is realized by multiplication by an invertible $(0,1)$-matrix (a permutation matrix).
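As a concrete illustration (with made-up numbers of my own): swapping the two components of a vector statistic such as $(X_{1:n}, \overline{X})$ is multiplication by a $(0,1)$ permutation matrix, and since that matrix is invertible, the original vector is recoverable, so no information is lost.

```python
# Sketch with illustrative values: swap the components of a
# two-component statistic T via a (0,1) permutation matrix P.
# Here P is its own inverse, so applying it twice recovers T.

def matvec(mat, vec):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [mat[0][0] * vec[0] + mat[0][1] * vec[1],
            mat[1][0] * vec[0] + mat[1][1] * vec[1]]

P = [[0, 1],
     [1, 0]]                  # permutation (swap) matrix

T = [2.0003, 3.4981]          # illustrative (x_min, x_bar)
T_swapped = matvec(P, T)      # -> (x_bar, x_min)
T_back = matvec(P, T_swapped) # P is invertible (here P^{-1} = P)

print(T_swapped)  # [3.4981, 2.0003]
print(T_back)     # [2.0003, 3.4981]
```

Either ordering of the components therefore carries exactly the same information, which is why the choice between $(X_{1:n}, \overline{X})$ and $(\overline{X}, X_{1:n})$ is purely conventional.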

heropup