Suppose you repeatedly sample from a continuous distribution $F$ with convex support.

Let's say you drew 2, 4, 3, 5, in that order.

Denote the largest value among the first $t$ draws by $b(t)$, and the second-largest value among the first $t$ draws by $a(t)$.

So we have

$a(4)=4$

$b(4)=5$
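To make the notation concrete, the running pair $(a(t), b(t))$ can be tracked in a single pass over the draws. This is only a minimal sketch; the function name `running_top_two` is hypothetical and introduced here purely for illustration.

```python
def running_top_two(samples):
    """Yield (a(t), b(t)) for t = 2, 3, ...: the second-largest and
    largest values among the first t draws."""
    it = iter(samples)
    first, second = next(it), next(it)
    a, b = min(first, second), max(first, second)
    yield a, b
    for x in it:
        if x > b:
            a, b = b, x   # new maximum; the old maximum becomes the runner-up
        elif x > a:
            a = x         # new runner-up only
        yield a, b

pairs = list(running_top_two([2, 4, 3, 5]))
print(pairs)  # [(2, 4), (3, 4), (4, 5)]
```

The last pair corresponds to $t=4$ for the draws 2, 4, 3, 5.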

My question is whether

$$A=\int_{a(t)}^{b(t)}dF(x)$$

is decreasing over time, at least in expectation.

Again, we repeatedly sample from a continuous distribution $F$ with convex support.

Intuitively, this must be true. Just imagine a uniform distribution: the distance between $b(t)$ and $a(t)$ will likely shrink.

But I can't seem to prove it mathematically. In fact, I don't even know how to express the concept of "second biggest" mathematically.

How should I even proceed?

==============================================

My tentative approach is as follows.

If $b(t+1)>b(t)$, then $a(t+1)=b(t)$

$$A(1)=\int_{}^{b(1)}dF(x)$$

$$A(2)=\int_{b(1)}^{b(2)}dF(x)=F(b(2))-F(b(1)) \quad \text{if } b(2)>b(1)$$

$$A(t)=\int_{a(t)}^{b(t)}dF(x)$$

$$A(t+1)=\int_{b(t)}^{b(t+1)}dF(x)=F(b(t+1))-F(b(t)) \quad \text{if } b(t+1)>b(t)$$

user42459
  • 1
    You should look up the concept of «order statistics». – kjetil b halvorsen Oct 24 '17 at 06:17
  • 1
    The changing size of the expected gap will depend on the particular distribution. – Henry Oct 24 '17 at 15:03
  • @Henry But intuitively, as $t$ goes to infinity, both the maximum and the second maximum will move to the very top of the distribution, so the quantity $A$ will become extremely small. Even before $t$ goes to infinity, won't this generally hold? Suppose $F$ is continuous and its support is convex. – user42459 Oct 24 '17 at 18:42
  • 1
    @kjetilbhalvorsen Yes I can now at least denote max and second_max but still this intuitive concept is really hard to prove. Could you provide some idea? – user42459 Oct 24 '17 at 18:43

1 Answer

For a uniform distribution, for example on $[0,1]$, you are correct. The difference between the highest value and the second-highest value of $t$ values is distributed with density $t(1-x)^{t-1}$ on $[0,1]$, and the expected difference is $\frac{1}{t+1}$, which falls towards $0$ as $t$ increases without limit.

But for an exponential distribution, say with parameter $\lambda$, the difference between the highest value and the second-highest value of $t$ values is also exponentially distributed with parameter $\lambda$, so the expected difference is $\frac{1}{\lambda}$, which does not change as $t$ increases (an earlier question and answer gives more details).

And for probability distributions with a heavier right tail than an exponential distribution, such as a log-normal distribution, the expected difference between the highest value and the second highest value of $t$ values can increase as $t$ increases, and for some distributions can even be infinite.

So the truth of your idea will depend on the particular distribution.
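The uniform and exponential cases can be checked numerically. The following is a rough Monte Carlo sketch using only the Python standard library; the helper name `mean_top_gap`, the replication count and the seed are arbitrary choices made for illustration, not part of the argument.

```python
import random

def mean_top_gap(draw, t, reps=20000, seed=0):
    """Monte Carlo estimate of E[b(t) - a(t)] for t i.i.d. draws,
    where draw(rng) returns one sample."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = sorted(draw(rng) for _ in range(t))
        total += xs[-1] - xs[-2]   # largest minus second largest
    return total / reps

uniform = lambda rng: rng.random()              # Uniform[0, 1)
exponential = lambda rng: rng.expovariate(1.0)  # rate lambda = 1 (assumed)

for t in (2, 5, 20):
    print(t, round(mean_top_gap(uniform, t), 3),
          round(mean_top_gap(exponential, t), 3))
```

If the claims above hold, the uniform column should track $\frac{1}{t+1}$ while the exponential column should stay near $\frac{1}{\lambda} = 1$ for every $t$.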

Henry
  • Thank you very much. But isn't your answer about $b(t)-a(t)$? My question was about the probability mass on the interval $[a(t), b(t)]$. Although the expected distance between $b(t)$ and $a(t)$ may become larger, the mass between the two will perhaps shrink, at least according to my intuition, though I can't prove it. – user42459 Oct 25 '17 at 16:28
  • @user42459 - my apologies - probably reading too fast. Yes, you are right. Indeed, for a continuous random variable, $F(X)$ is uniform on $[0,1]$, so integrating the density between the highest and second-highest observations is equivalent to sampling and ordering from a uniform distribution on $[0,1]$. My first paragraph therefore applies, with the expected amount being $\frac{1}{t+1}$ and the density as I described. – Henry Oct 25 '17 at 17:16
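As a sanity check on that last comment, the mass $F(b(t)) - F(a(t))$ can be estimated by simulation for non-uniform distributions. This is a sketch under stated assumptions: the standard normal and rate-1 exponential are example choices, and the helper name `mean_mass_between_top_two`, the replication count and the seed are all hypothetical.

```python
import math
import random

def mean_mass_between_top_two(draw, cdf, t, reps=20000, seed=0):
    """Monte Carlo estimate of E[F(b(t)) - F(a(t))] for t i.i.d. draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = sorted(draw(rng) for _ in range(t))
        total += cdf(xs[-1]) - cdf(xs[-2])   # probability mass between the top two
    return total / reps

# Example distributions (assumed for illustration), each with its own CDF.
normal_draw = lambda rng: rng.gauss(0.0, 1.0)
normal_cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
expo_draw = lambda rng: rng.expovariate(1.0)
expo_cdf = lambda x: 1.0 - math.exp(-x)

results = {}
for t in (2, 5, 20):
    results[t] = (mean_mass_between_top_two(normal_draw, normal_cdf, t),
                  mean_mass_between_top_two(expo_draw, expo_cdf, t))
    print(t, round(1 / (t + 1), 4),
          round(results[t][0], 4), round(results[t][1], 4))
```

Both estimated columns should sit close to $\frac{1}{t+1}$ regardless of the distribution, even though the raw gap $b(t)-a(t)$ for the exponential does not shrink.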