Sum of set bits in every element for a natural numbers

Question

I was thinking of a mathematical puzzle with binary representation of numbers, but could not find a convincing answer myself.

Here is the puzzle: Say for some number N, I want to find the sum of the set bits of every number from 1 to N.

For example, for 5 The answer would be: 7 by the following procedure

1 - 1 set bit
2 - 1 set bit
3 - 2 set bits
4 - 1 set bit
5 - 2 set bits

So answer is 1 + 1 + 2 + 1 + 2 = 7

I found that it's easy to just go one by one and add, like I did. I also found that for a number having x bits, they form the pascal triangle with set ones, if number of occurances are counted with same number of set bits, irrespective of value. For example,

when x = 1, we have {1} - 1 set bit occurs once, hence 1.
when x = 2, we have {10, 11} - 1 set bit occurs once, 2 set bits occurs once, hence 1 1
when x = 3, we have {100, 101, 110, 111} - 1 set bit occurs once, 2 set bits occur twice, and 3 set bits occur once, hence 1 2 1

This series is continued. However, summing these up will give me a range, inside of which the answer lies. (Example ans is in [8, 15])

My first solution is the naive approach. Second one is a little mathematical, but not very fruitful.

I was wondering if we could derive a formula any N?

Have a look at https://math.stackexchange.com/q/2415630/42969. — Martin R, Jan 05 '19 at 21:05
One easily sees that F(2^k-1) = k 2^(k-1) . My guess is that a general formula, however, is almost certainly guaranteed to be either recursive, or messy. — Ron Kaminsky, Jan 05 '19 at 21:09
@MartinR thanks for pointing out the mistake with 5 and tagging this answer. Will definitely check, thanks a lot! :D — metamemelord, Jan 05 '19 at 21:09
@RonKaminsky amazing stuff, sir. I verified for a couple of values, it seems to fit. However, how exactly did you find pattern? :p — metamemelord, Jan 05 '19 at 21:17

Ron Kaminsky · Accepted Answer · 2019-01-16T16:47:42.443

1

$F(0) = 0.$

If $2^k \le n \lt 2^{k+1}$, then $F(n) = F(n - 2^k) + F(2^k - 1) + n - 2^k + 1$.

Since $F(2^k -1) = k\,2^{k-1}$, we have $F(n) = F(n-2^k) + k\,2^{k-1} + n - 2^k + 1$.

The recursion works because the numbers between $2^k$ and $n$ all have their highest bit set (those bits give the $n - 2^k + 1$ part of the sum), and the sum of the other bits of those numbers is $F(n - 2^k)$, and the remaining numbers are taken care of by the $F(2^k-1)$ term.

The formula for $F(2^k-1)$ works because each of the $k$ bits of the numbers in ${0, 1,\dots, 2^k - 1}$ is $1$ exactly half of the time.

Edit: Based on Ross Millikan's comment, it is possible to express this as a sum over the bits which are $1$ in $n$, if they are ordered correctly. If ${a_1, a_2,\dots,a_m}$ are the exponents corresponding to the bits which are $1$ in $n$, ordered in increasing order, then $$F(n) = \sum_{i=1}^m a_i\,2^{a_i-1}-i\,2^{a_i}+n+1 = m(n+1) + \sum_{i=1}^m a_i\,2^{a_i-1}-i\,2^{a_i}$$

edited Jan 16 '19 at 16:47

answered Jan 05 '19 at 21:22

Ron Kaminsky

726

Now let the next highest set bit in $n$ be in the $2^m$ position. We can use your recurrence to get $F(n)=F(n-2^k-2^m)+k2^{k-1}+m2^{m-1}+(n-2^k+1)+(n-2^k-2^m+1)$. A bit in the $2^b$ position contributes $b2^{b-1}$ from the first part and $2^b$ times the number of higher bits in the number from the second part. Finally the $+1$s give the total number of bits in the number. For the example of $5=101_2$ we get $2\cdot2^{2-1}+1\cdot 2^0+2=7$ – Ross Millikan Jan 05 '19 at 22:12
@RossMillikan For a little while I thought that there would be an elegant summation form over the bits which are set in $n$, but as you point out in your comment, the $(n-2^k-2^m+1)$ does depend on both $k$ and $m$ and not just $m$. – Ron Kaminsky Jan 05 '19 at 22:37

Fabius Wiesner · Answer 2 · 2019-08-18T06:34:12.930

Not a clean formula, but using Legendre's formula (see Alternate form) and this it can be shown that:

$$F(n) = \frac{(n+1)n}{2}+\sum_{k=1}^{\lfloor{n/2}\rfloor}\{(2k-1)[(g(n, k)-1)2^{g(n, k)+1}+2]-(n+1)\frac{(g(n, k)+1)g(n, k)}{2}\}$$ where: $$g(n, k) = \lfloor\log_2{(n/(2k-1))}\rfloor$$ The following identity has been used:

$$\nu_2(n!) = n-s_2(n)$$

where $\nu_2(n!)$ is the 2-adic valuation of $n!$ and $s_2(n)$ is the sum of ones in the binary representation of $n$. From there one can write:

$$s_2(n)=n-\sum_{k=2}^n{\nu_2(k)}$$ $$\sum_{k=1}^n{s_2(k)}=\frac{n(n+1)}{2}-\sum_{k=2}^{n}{(n-k+1)\nu_2(k)}=\frac{n(n+1)}{2}-(n-2+1)-(n-4+1)2-(n-8+1)3+\ldots-(n-2^{\lfloor{log_2n}\rfloor}+1)\lfloor{log_2n}\rfloor+\ldots=\frac{n(n+1)}{2}-(n+1)\frac{\lfloor{log_2n}\rfloor(\lfloor{log_2n}\rfloor+1)}{2}+\sum_{k=1}^{\lfloor{log_2n}\rfloor}k2^k+\ldots$$

and go on from there doing what has been done for $n$ to $n/(2m-1)$.

Wow, that's quite nice. It wouldn't be quite so messy if you made a shortened definition for $\lfloor\log_2{(n/(2k-1))}\rfloor$. — Ron Kaminsky, Jan 16 '19 at 15:54
@RonKaminsky I have edited the answer as per your suggestion. — Fabius Wiesner, Jan 17 '19 at 08:59

Sum of set bits in every element for a natural numbers

2 Answers2

Linked