I am given a sequence $X_1, X_2, \dots X_n$ of independent, geometrically distributed (with $p = 0.5$) random variables, for some $n$. Since there seem to be multiple definitions around: think of each $X_i$ as the number of coin flips up to and including the first "heads".
Let $\hat{X} = \max_i(X_i)$. What I want is an upper bound (in terms of $n$) on the expected number of geometrically distributed random variables (again with $p = 0.5$) $X'_1, X'_2, \dots X'_m$ that I need to look at before I see a value of at least $\hat{X} + 1$. In other words: I draw $n$ such random variables and determine their maximum. Then, in a second step, I keep drawing such random variables until I see a value strictly larger than that maximum. How many variables do I expect to draw in the second step?
My conjecture is that $m \leq 2n$ or $m \le 3n$ in expectation. I have two approaches that I feel should show this, but for neither am I sure whether it holds or how to correctly bound $m$.
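For a quick sanity check (not a proof), here is a minimal simulation sketch in Python. The helper names `geom` and `second_step` are mine, and the empirical mean of $m$ fluctuates heavily from run to run, since a large $\hat{X}$ can make the second step very long:

```python
import random

def geom(p=0.5):
    """Number of fair-coin flips up to and including the first heads."""
    flips = 1
    while random.random() >= p:
        flips += 1
    return flips

def second_step(n):
    """Draw n variables, take their maximum, then count further draws
    until a strictly larger value appears."""
    x_hat = max(geom() for _ in range(n))
    m = 1
    while geom() <= x_hat:
        m += 1
    return m

trials = 1_000
for n in (1, 10, 100):
    avg = sum(second_step(n) for _ in range(trials)) / trials
    print(f"n = {n:4d}: empirical mean of m = {avg:10.1f} (ratio m/n = {avg / n:.2f})")
```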
Approach 1: Symmetry and Handwaving
I feel like there should be symmetry at play here. Suppose I weren't looking for a value strictly larger than $\hat{X}$, but for one at least as large as $\hat{X}$. Concatenate both sequences to form $X_1, X_2, \dots X_n, X'_1, X'_2, \dots X'_m$. Now, the values of the individual $X_i$ (resp. $X'_i$) are independent of my choice of $n$, so why should the distance "to the left" from $X_n$ to $\hat{X}$ be any larger than the distance "to the right" from $X_n$ to the next element of at least the same value as $\hat{X}$? For symmetry reasons, we should expect $m = n$ here.
Now I'm looking not for an element of the same value, but for one of strictly greater value. Since $P[X_i = \hat{X}] = 2 \cdot P[X_i = \hat{X} + 1]$, the values I'm looking for should be half as densely distributed… thus we should expect $m = 2n$… right?
At this point I'm waving my hands really hard and hoping for the best. Is this approach valid? Would you (as a reviewer or reader) accept it in a paper if I used it to prove some property of a data structure?
Approach 2: Actually Do the Maths
A more rigorous approach I thought of is this: from this answer I know that the expected maximum of $n$ such $X_i$ (let's call it $M(\{X_i\})$) is
$$E(M(\{X_1, \dots X_n\})) = \sum_{k \geq 0}\left( 1 - \left(1 - \frac{1}{2^k}\right)^n\right)$$
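Truncating this series at some cutoff $K$ makes it easy to evaluate numerically. A small sketch (the choice $K = 200$ is mine; the neglected tail is at most $n \cdot 2^{-K}$, since $1 - (1 - 2^{-k})^n \le n \cdot 2^{-k}$):

```python
def expected_max(n, K=200):
    """Truncated series for E[max of n geometric(0.5) variables];
    the neglected tail is at most n * 2**(-K)."""
    return sum(1 - (1 - 0.5 ** k) ** n for k in range(K + 1))

for n in (1, 2, 10, 100, 1000):
    print(f"n = {n:5d}: E[max] ≈ {expected_max(n):.4f}")
```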
I feel like if I find an $m$ such that $E(M(\{X'_1, \dots X'_m\})) - E(M(\{X_1, \dots X_n\})) \ge 1$, then I should have proven my point - right?
I chose $m = 3n$. Subtracting the two series termwise (the constant $1$s cancel, and the sign flips), the condition becomes:
$$\sum_{k \ge 0}\left( \left(1 - \frac{1}{2^k} \right)^n - \left(1 - \frac{1}{2^k} \right)^{3n} \right) \ge 1$$
However, at this point I seem not to have paid attention in my calculus classes, or I just don't see how to bound this series. Does anybody see a way of showing the above? I would be happy with any $m = cn$ for a constant $c$. I plotted the left-hand side for $m = 3n$ (truncating the sum at $0 \le k \le 1000$), and it is well above $1$ for all $n \geq 1$.
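For reference, here is a sketch of that numerical check in Python (same truncation at $k \le 1000$):

```python
def gap(n, c=3, K=1000):
    """Truncated value of sum_k ((1 - 2^-k)^n - (1 - 2^-k)^(c*n))."""
    return sum((1 - 0.5 ** k) ** n - (1 - 0.5 ** k) ** (c * n)
               for k in range(K + 1))

for n in (1, 2, 5, 10, 100, 1000):
    print(f"n = {n:5d}: gap = {gap(n):.4f}")
```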
Even if I could show that: is my approach ("if I choose $m$ such that the expected maxima differ by at least $1$, then I have found the upper bound I'm looking for") valid?
Thanks a lot!