
Two candidates contest a close election. Each of the $n$ voters votes independently with probability $\frac12$ each way. Fix $\alpha \in (0,1)$. Show that, for large $n$, the probability that the candidate leading after $\alpha n$ votes have been counted is the eventual winner is approximately

$$\frac{1}{2} + \frac{\sin^{-1}(\sqrt{\alpha})}{\pi}\;.$$

Hint: let $S_m$ be the difference between the vote totals of the two candidates when $m$ votes have been counted. What is the approximate distribution of $S_{\alpha n}$ (when appropriately rescaled)? What is the approximate distribution of $S_n - S_{\alpha n}$ (when appropriately rescaled)? What about their joint distribution? Finally, notice that $\displaystyle\sin^{-1}(\sqrt{\alpha}) = \tan^{-1}\left(\sqrt{\frac{\alpha}{1 - \alpha}}\right)$.

Chill2Macht
  • 20,920
vishmay
  • 263
  • Please use a more descriptive title. This title could be the title of any question on this site. – joriki Jun 23 '16 at 20:53
  • Corrected thanks! – vishmay Jun 23 '16 at 20:54
  • @joriki do you know how to go on with this question? – vishmay Jun 23 '16 at 20:59
  • I don't, but I'm thinking about it; it's an interesting question. – joriki Jun 23 '16 at 21:00
  • @joriki Actually I just found that there are a couple of hints given: find the approximate distribution of $S_{\alpha n}$, the difference between the vote totals of the two candidates after $\alpha n$ votes have been counted; then find the approximate distribution of $S_{n} - S_{\alpha n}$; and finally their joint distribution. But I don't see how these distributions help to tackle the problem. – vishmay Jun 23 '16 at 21:04

1 Answer


Define random variables $X_1, X_2,\ldots,X_n$ where $X_i$ equals $1$ if voter $i$ votes for Candidate A, and $-1$ otherwise; define $S_k=X_1 + \cdots + X_k$. Then $S_n>0$ means that Candidate A is the winner, and $S_{\alpha n}>0$ means Candidate A is leading after $\alpha n$ votes. Calculate for any $\alpha\in(0,1)$ that $S_{\alpha n}$ has mean $0$ and variance $\alpha n$. The covariance between $S_n$ and $S_{\alpha n}$ is $ \alpha n$: $$ \operatorname{Cov}(S_n, S_{\alpha n}) = \operatorname{Cov} (S_{\alpha n} + (S_n - S_{\alpha n}), S_{\alpha n}) = \operatorname{Cov}(S_{\alpha n}, S_{\alpha n}) + 0 $$ by independence of $S_{\alpha n}$ and $S_n - S_{\alpha n}$. Hence the correlation between $S_n$ and $S_{\alpha n}$ is $\sqrt \alpha$.
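Spelled out, and assuming $\alpha n$ is an integer so that $\operatorname{Var}(S_n)=n$ and $\operatorname{Var}(S_{\alpha n})=\alpha n$, the correlation works out as $$ \rho = \frac{\operatorname{Cov}(S_n, S_{\alpha n})}{\sqrt{\operatorname{Var}(S_n)\,\operatorname{Var}(S_{\alpha n})}} = \frac{\alpha n}{\sqrt{n\cdot \alpha n}} = \sqrt{\alpha}. $$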

By symmetry between the two candidates (and ignoring ties, which have vanishing probability as $n$ grows), the aim is to calculate $$P(S_n>0\mid S_{\alpha n}>0).$$ If $n$ is large, the joint distribution of $S_n$ and $S_{\alpha n}$ is approximately bivariate normal. (Reason: Apply the CLT to both $S_{\alpha n}$ and $S_n- S_{\alpha n}$ to deduce that each of them is approximately normally distributed. Since these two are independent, the pair $(S_{\alpha n}, S_n - S_{\alpha n})$ is approximately bivariate normal, and so is its linear transform $(S_n, S_{\alpha n})$, using $S_n = S_{\alpha n} + (S_n-S_{\alpha n})$.)
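As a rough sanity check (not part of the argument), this probability can be estimated by simulation and compared with the value $\frac12 + \frac{\sin^{-1}(\sqrt{\alpha})}{\pi}$ stated in the question. The sketch below uses only the Python standard library; the function names, the choice of $n$, the number of trials and the decision to discard tied counts are all assumptions made for illustration.

```python
import math
import random


def limit_value(alpha):
    """Large-n value claimed in the question: 1/2 + arcsin(sqrt(alpha)) / pi."""
    return 0.5 + math.asin(math.sqrt(alpha)) / math.pi


def signed_sum(rng, k):
    """Sum of k independent +1/-1 votes, sampled via the number of 1-bits in k random bits."""
    ones = bin(rng.getrandbits(k)).count("1")
    return 2 * ones - k


def leader_wins_estimate(n, alpha, trials, seed=0):
    """Monte Carlo estimate of P(leader after alpha*n votes is the eventual winner).

    Trials with a tie at either count are discarded (an illustrative choice).
    """
    rng = random.Random(seed)
    m = int(alpha * n)  # number of votes counted at the intermediate stage
    agree = 0
    kept = 0
    for _ in range(trials):
        first = signed_sum(rng, m)               # S_{alpha n}
        total = first + signed_sum(rng, n - m)   # S_n = S_{alpha n} + (S_n - S_{alpha n})
        if first == 0 or total == 0:
            continue
        kept += 1
        if (first > 0) == (total > 0):
            agree += 1
    return agree / kept


if __name__ == "__main__":
    alpha = 0.25
    print("simulated:", leader_wins_estimate(10_000, alpha, 20_000))
    print("limit    :", limit_value(alpha))
```

For $\alpha = \frac14$ the limiting value is $\frac12 + \frac{\pi/6}{\pi} = \frac23$, and the simulated frequency should be close to it for large $n$, up to Monte Carlo noise and a small finite-$n$ effect from the discarded ties.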

The desired result follows from a fact(*) about bivariate normal variables:

If $(A,B)$ are bivariate normal with means $\mu_A$ and $\mu_B$ respectively, and correlation $\rho$, then $$P(B>\mu_B\mid A>\mu_A)=\frac12 + \frac{\arcsin\rho}\pi.$$

(*) The fact can be deduced from this result.
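A sketch of the deduction, assuming the result referred to is the standard quadrant probability for a bivariate normal pair with correlation $\rho$, namely $P(A>\mu_A,\ B>\mu_B)=\frac14+\frac{\arcsin\rho}{2\pi}$: since $P(A>\mu_A)=\frac12$, $$ P(B>\mu_B\mid A>\mu_A) = \frac{P(A>\mu_A,\ B>\mu_B)}{P(A>\mu_A)} = \frac{\frac14+\frac{\arcsin\rho}{2\pi}}{\frac12} = \frac12+\frac{\arcsin\rho}{\pi}. $$ With $\mu_A=\mu_B=0$ and $\rho=\sqrt{\alpha}$, as computed above for $S_n$ and $S_{\alpha n}$, this gives $\frac12+\frac{\sin^{-1}(\sqrt{\alpha})}{\pi}$.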

grand_chat
  • 38,951
  • How do you see that $S_{\alpha n}$ and $S_n - S_{\alpha n}$ are independent? Also, is the final result a known standard result about bivariate normals? Can you also give more details as to how the result follows from the 'fact'? – vishmay Jun 23 '16 at 21:23
  • @Prag1 $S_{\alpha n}$ is based on the first $\alpha n$ voters, and $S_n- S_{\alpha n}$ is based on the remaining voters, and we assume voters are voting independently. – grand_chat Jun 23 '16 at 21:28
  • @Prag1 The 'fact' is a known standard result about the bivariate normal. The cited result calculates $P(B>\mu_B , A>\mu_A)$. If you divide this by $P(A>\mu_A)$ (which equals $\frac12$) you obtain the 'fact'. Put $A:=S_n$ and $B:= S_{\alpha n}$ to solve your problem. We know $\mu_A=\mu_B=0$. – grand_chat Jun 23 '16 at 21:29
  • Ok great thanks. One last question. The central limit theorem talks of sums of iid random variables; is there a version that applies to joint distributions?... Also if you look at the 'hint' in my comment above, it asks to calculate approximate distributions of $S_{n}, S_{n} - S_{\alpha n}$ then consider joint distributions. I don't quite see how your line of thought corresponds to that of the hint. Any suggestions most welcome – vishmay Jun 23 '16 at 21:38
  • I have added the hint above, I also wanted to follow the hint to arrive at the answer. In particular what does 'appropriately rescale' mean? – vishmay Jun 23 '16 at 21:49
  • @Prag1 I've edited my answer to explain the joint distribution. I assume 'appropriately rescaled' means you subtract the mean and divide by the SD, obtaining standard normal variables. But this is not necessary, since the 'fact' works for any bivariate normal distribution – grand_chat Jun 23 '16 at 21:54
  • Ok great thanks. I guess I can also apply the CLT to both $S_{\alpha n}$ and $S_n$ and, since they are correlated, conclude that they are bivariate normal? – vishmay Jun 23 '16 at 21:57