Probability that sum of digits of random numbers matches.

Question

Given two random 9-digit numbers (with the standard assumptions), what is the probability that the sums of their digits match?

I saw this, and it seemed extremely relevant.

Also, how does the answer change if we work in different number systems, or allow for "digits" which range from 35 to 54 (e.g., 38 is allowed to be a digit)?

Edit for context: My motivation is strictly intellectual curiosity. My first attempt to solve this problem didn't get very far; I thought the answer might be solved by simple counting and elementary probability. The fact that we are working with sums, and not just the digits themselves though, seemed to complicate things for me. I linked a thread because the general topic of my question (about the sums of digits of random numbers) is directly addressed in that thread (but my exact question is not), showing that I did research and am trying to in good faith spur discussion. I think my question is interesting enough to warrant a thread despite me being clueless (for the most part) as to how to proceed.

Similar approach to here: https://math.stackexchange.com/questions/4215159/on-expected-value-of-x-y-for-independent-and-equally-likely-random-variabl — Annika, Aug 03 '21 at 00:38
Generating functions suggest the answer is about $3.246$ percent. — Brian Tung, Aug 05 '21 at 22:50

Confused Soul · Answer 1 · 2021-08-03T12:50:52.600

I can't provide an exact answer (heck, I barely understood some of the answers in the question you linked), but I can give you a few insights that I noticed:

1. The broader question you are asking is: given a discrete random variable $X$ and a discrete random variable $Y$, find the probability that they are equal. In this case $X$ and $Y$ are independent and share the same distribution, namely the distribution of the sum of the digits of a number. If you know that distribution, your answer is simply $\sum_i P(i)^2$, where $P(i)$ is the probability that the digit sum equals $i$: the chance that both sums equal $i$ is $P(i)^2$, and these are added over all $i$.
2. This is a fun, rough, non-rigorous, potentially incorrect partial solution I came up with. I present it to encourage debate on this question. Since each digit is an independent variable, the mean of the digits tends to be normally distributed by the CLT. Therefore the sum is also (almost; it's discrete) normally distributed, being the product of the number of digits and the mean. So now, again very roughly, we are drawing 2 samples from a single 'normal distribution' and seeing whether they are equal. Note this is not an actual normal distribution. Extrapolating from (1), the sum tends to $\int \text{normal}^2$, where you can envision the integral as a sum over bars of ever-shrinking width. You then get F1 $\frac{1}{2\sigma\sqrt{\pi}}$ as the answer, where $\sigma$ is the standard deviation of your distribution, which I found from "Standard Deviation for sums of fair dice given the number of dice, and the number of sides on each die" to be F2 $$\sigma=\sqrt{n\cdot\left(\frac{(x+1)\cdot(2x+1)}{6}-\left(\frac{x+1}{2}\right)^2\right)}$$ where $n$ is the number of dice (digits) and $x$ is the number of sides (possible digit values).
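For reference, F1 can be recovered by integrating the square of a normal density. This is only a sketch of the continuous approximation and ignores the discreteness of the digit sum; the mean $\mu$ and the integration variable $t$ are symbols introduced here, not part of the original formulas:

$$\int_{-\infty}^{\infty}\left(\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(t-\mu)^2}{2\sigma^2}}\right)^2 dt=\frac{1}{2\pi\sigma^2}\int_{-\infty}^{\infty}e^{-\frac{(t-\mu)^2}{\sigma^2}}\,dt=\frac{1}{2\pi\sigma^2}\cdot\sigma\sqrt{\pi}=\frac{1}{2\sigma\sqrt{\pi}}$$

The bracket inside F2 also simplifies, which makes it quicker to evaluate by hand:

$$\frac{(x+1)(2x+1)}{6}-\left(\frac{x+1}{2}\right)^2=\frac{(x+1)\bigl((4x+2)-(3x+3)\bigr)}{12}=\frac{x^2-1}{12}$$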

Let's now give it a spin on 2-digit numbers with digits from 1 to 6 (i.e., two dice rolls) and see if it works. Using the summation in (1) directly, we get $146/1296 \approx 0.113$. Now let's see if our formulae work. The standard deviation of the digit sum, by F2, is about 2.42. Plugging it into F1 gives about 0.117, which is in the neighborhood of the exact answer. Asymptotically, I believe the accuracy should improve as the number of rolls grows, because the distribution of the sum more closely resembles a normal distribution.
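The two-dice check above can be reproduced numerically. The following is a minimal sketch, not a definitive implementation: it assumes fair dice with faces 1 to `sides`, and the function names (`sum_distribution`, `exact_match_probability`, `normal_match_probability`) are made up for illustration. It builds the exact distribution of the sum by repeated convolution, applies the summation from (1), and evaluates F1 with the $\sigma$ from F2 for comparison.

```python
import math


def sum_distribution(n, sides):
    """Exact distribution of the sum of n independent fair dice with faces 1..sides."""
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for total, p in dist.items():
            for face in range(1, sides + 1):
                # Each face has probability 1/sides; add its share to the new total.
                new[total + face] = new.get(total + face, 0.0) + p / sides
        dist = new
    return dist


def exact_match_probability(n, sides):
    """Summation from (1): sum of P(i)^2 over all attainable sums i."""
    return sum(p * p for p in sum_distribution(n, sides).values())


def normal_match_probability(n, sides):
    """F1 evaluated with the standard deviation given by F2."""
    variance_one_die = (sides + 1) * (2 * sides + 1) / 6 - ((sides + 1) / 2) ** 2
    sigma = math.sqrt(n * variance_one_die)
    return 1 / (2 * sigma * math.sqrt(math.pi))


# Compare the exact value with the normal approximation for a few roll counts.
for n in (2, 5, 9):
    print(n, exact_match_probability(n, 6), normal_match_probability(n, 6))
```

Since shifting every face by a constant does not change whether two sums agree, calling the functions with `sides=10` and `n=9` also covers nine decimal digits 0 to 9, under the assumption that leading zeros are allowed.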

Base 10 digits are arbitrary, and the approach does not depend on the base you are using. Only the number of possible digit values changes, which enters F2 through $x$.
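How the digit set enters the result can be checked numerically. The sketch below is an illustration under stated assumptions: every digit position is uniform and independent over the allowed values (so leading zeros are permitted), and `match_probability` is a hypothetical helper name. It counts the ways to reach each digit sum exactly with integer arithmetic, so the base-10 case, another base, and the shifted range 35 to 54 from the question can be compared directly. Note that adding a constant to every digit shifts both sums by the same amount and so cannot affect whether they match.

```python
from collections import Counter
from fractions import Fraction


def match_probability(digit_values, n_digits):
    """Exact P(two independent digit sums agree), digits uniform on digit_values."""
    digit_values = list(digit_values)
    counts = Counter({0: 1})  # counts[s] = number of digit strings with sum s
    for _ in range(n_digits):
        nxt = Counter()
        for total, ways in counts.items():
            for d in digit_values:
                nxt[total + d] += ways
        counts = nxt
    total_ways = len(digit_values) ** n_digits
    # Pairs of strings with equal sums, divided by all pairs of strings.
    return Fraction(sum(w * w for w in counts.values()), total_ways ** 2)


base10 = match_probability(range(10), 9)       # digits 0..9
base2 = match_probability(range(2), 9)         # digits 0..1
base20 = match_probability(range(20), 9)       # digits 0..19
shifted = match_probability(range(35, 55), 9)  # "digits" 35..54 (20 values)

print(float(base10), float(base2), float(base20), float(shifted))
```

Returning a `Fraction` keeps the comparison between the shifted range and the unshifted 20-value range exact rather than subject to floating-point rounding.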
