
For example, it takes 7 symbols to write the natural number $n=9999999$ but we can also write it with 5 symbols as $n=10^7-1$. (Of course, with even larger exponents we can save even more symbols.)

Another example: $13841304697 = 7^{12}+8*3^7$. Here we have 11 symbols vs. 8 symbols.
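Both examples can be checked mechanically. The snippet below is only a sketch: it assumes, as the counts above suggest, that each digit and each of `+ - * /` is one symbol and that the exponent is written as a superscript, so no symbol is charged for exponentiation itself. The `^` notation and the helper names `symbol_count` and `value` are made up for this illustration.

```python
def symbol_count(expr: str) -> int:
    """Count digits and the operators + - * /; '^' is not counted (assumption)."""
    return sum(ch.isdigit() or ch in "+-*/" for ch in expr)


def value(expr: str) -> int:
    """Evaluate a trusted expression string; '^' stands for a superscript exponent."""
    return eval(expr.replace("^", "**"))


for expr in ["10^7-1", "7^12+8*3^7"]:
    n = value(expr)
    print(expr, "=", n, "|", symbol_count(expr), "symbols vs", len(str(n)), "digits")
```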

Let's denote by $r(n)$ the minimal number of symbols needed to represent the natural number $n$ by an expression of this sort, using exponentiation (to a natural-number power), $+$, $-$, $*$ and $/$.

There are all sorts of interesting questions that arise:

  • How can we find a minimal representation? Would some sort of greedy algorithm work, one that subtracts from $n$ a power $a^b$ chosen so that $r(n-a^b)$ is minimal?

  • What type of numbers $n$ have large values for $r(n)$ relative to the number of digits of $n$?

  • How much space does writing numbers minimally like this save? To put it formally, what can we say about

$$\frac{\sum_{n=1}^N r(n)}{ \sum_{n=1}^N (\lfloor \log_{10}(n)\rfloor + 1) }?$$

(The sums start at $n=1$, since $\log_{10}(0)$ is undefined.)
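For small $n$ these questions can at least be explored by brute force. The following is a rough sketch under assumptions that go beyond the question: a decimal literal costs its number of digits, a power $a^b$ with literal $a$ and $b$ costs the digits of $a$ plus the digits of $b$ (no towers), each binary operator costs one symbol, parentheses are free, division must be exact, and every intermediate value is a positive integer no larger than `cap`. Because of these restrictions the result is best read as an upper bound on $r(n)$. The names `min_symbols`, `max_cost` and `cap` are hypothetical.

```python
def min_symbols(max_cost: int, cap: int) -> dict:
    """Map each reachable value in 1..cap to the fewest symbols found for it."""
    best = {}
    layers = {}  # layers[k]: values whose cheapest known expression has k symbols
    for k in range(1, max_cost + 1):
        found = set()
        # Plain decimal literals with exactly k digits.
        found.update(range(10 ** (k - 1), min(10 ** k, cap + 1)))
        # Powers a^b with literal base and exponent, digits(a) + digits(b) == k.
        for da in range(1, k):
            db = k - da
            for a in range(max(2, 10 ** (da - 1)), 10 ** da):
                for b in range(max(2, 10 ** (db - 1)), 10 ** db):
                    v = a ** b
                    if v > cap:
                        break  # larger exponents only make the power bigger
                    found.add(v)
        # Binary operations: cost(x) + cost(y) + 1 symbol for the operator.
        for i in range(1, k - 1):
            j = k - 1 - i
            for x in layers[i]:
                for y in layers[j]:
                    found.add(x + y)
                    found.add(x * y)
                    if x > y:
                        found.add(x - y)
                    if x % y == 0:
                        found.add(x // y)
        # Keep only values in range that were not reached more cheaply before.
        layers[k] = {v for v in found if 1 <= v <= cap and v not in best}
        for v in layers[k]:
            best[v] = k
    return best


best = min_symbols(max_cost=5, cap=10 ** 7)
print(best[9999999])  # found via 10^7 - 1

# Ratio from the question for N = 99999, under the assumptions above.
N = 99999
ratio = sum(best[n] for n in range(1, N + 1)) / sum(len(str(n)) for n in range(1, N + 1))
print(ratio)
```

Processing the costs in increasing order means each value is recorded at the cheapest cost at which it first appears, so combining only cheapest-known subexpressions is enough. Raising `max_cost` quickly becomes expensive because all literals of that length are enumerated.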

ploosu2
  • Can we do something like $3^{2^3}$? – Szeto Aug 25 '18 at 12:03
  • @Szeto Let's not allow that, but it could be another question. – ploosu2 Aug 25 '18 at 12:25
  • Isn't it the case that $r(n)-1\le r(n+1)\le r(n)+1$? After all, if $m(n)$ is a minimal expression for $n$ then $m(n)+1$ is an expression for $n+1$ with one more character, and $m(n+1)-1$ is an expression for $n$ with one more character. – lulu Aug 25 '18 at 12:33
  • @lulu Shouldn't it be $r(n)-2 \leq r(n+1) \leq r(n)+2$ then? We count $+$ and $-$ as symbols as well as far as I understood. – MSDG Aug 25 '18 at 14:15
  • @Sobi you may be right, that's why I asked. In either case there is a very tight band of values for $r(n+1)$ given $r(n)$. Even using your system we have $r(n+1)-r(n)\in \{-2,-1,0,1,2\}$. I'd think the best thing would be to look at the difference sequence...see if there is anything that can be proven. – lulu Aug 25 '18 at 14:31
  • @lulu Yes, the operator symbols are also counted. Does that inequality give anything other than that $r$ grows at most linearly? But we already know that $r(n)$ is $O(\log_{10} n)$, since it's at most the number of digits. – ploosu2 Aug 25 '18 at 14:52
  • Oh, I'm not claiming that this settles discussion of the function. I am just pointing out that the difference sequence is so very constrained that it may make sense to study that sequence instead. As I say, each entry in that sequence can only be one of $5$ numbers...what is the pattern of values? Are they (roughly) uniformly distributed? Is there a bias of some sort or other? I'd guess that $+1,+2$ dominate the list but that is really just a guess. Of course, it may well be hopeless to try to prove anything along these lines. – lulu Aug 25 '18 at 17:50
  • @lulu That's a good remark. Yes, the positive values must dominate, since $r\to \infty$ (since a short sequence of symbols can only produce so large a number). But otherwise, I don't know how to study the behavior of $r$. I don't even know how to calculate its value for particular numbers other than by brute-forcing every possible expression. – ploosu2 Aug 25 '18 at 18:47
  • So, do that. I'd suggest working in binary to keep the character list small. You need to be clearer about what an acceptable character string is, but that should be possible. Having defined a clear rule set, have a machine list all the acceptable strings up to some (modest) length...and then evaluate them. That should give you $r(n)$ for many $n$. – lulu Aug 25 '18 at 18:55
  • May be worth remarking: absent simple rules which restrict the allowed character strings, you can run into paradoxes of the form "let $n$ be the least natural number which can't be specified by a string of characters as short as this one." I don't think that arises for rules of the sort you have in mind but it may be worth keeping in mind. – lulu Aug 25 '18 at 18:59
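
The bound on consecutive values discussed in the comments can be spelled out as follows. This assumes, as in the question, that each digit and each operator counts as one symbol, and writes $m(n)$ for a minimal expression for $n$. Appending $+1$ to $m(n)$ gives an expression for $n+1$ with two extra symbols, and appending $-1$ to $m(n+1)$ gives one for $n$, so

$$r(n+1) \le r(n) + 2, \qquad r(n) \le r(n+1) + 2, \qquad\text{hence}\qquad |r(n+1)-r(n)| \le 2.$$

Summing these inequalities only yields the linear bound $r(n) \le r(1) + 2(n-1)$, which is much weaker than the bound $r(n) \le \lfloor \log_{10}(n)\rfloor + 1$ coming from the decimal digits.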

1 Answer


Do you intend to restrict this to base 10? It may be more natural to look at base 2, or at what happens in a general base $b$.

Note that for any fixed base $b$, one isn't going to do much better on average than writing in base $b$, as a counting argument on the number of possible arrangements of symbols shows. In general, exponentiation and concatenation are tough to work with, and even simpler versions of this problem are not well understood. See my comment on the question "How many fours are needed to represent numbers up to $N$?" as well as the paper referenced there.
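
One way the counting argument might be made explicit is sketched here. It rests on an assumption not stated in the question: every expression with $k$ symbols can be encoded as a string of length $k$ over some fixed alphabet of $A \ge 2$ characters (the digits, the operators, and whatever is needed to mark superscripts; $A$ is a hypothetical constant). Distinct numbers need distinct strings, so

$$\#\{n : r(n)\le k\} \;\le\; \sum_{j=1}^{k} A^{j} \;<\; A^{k+1}.$$

For $N$ large, taking $k=\lfloor\log_A N\rfloor-2$ gives $A^{k+1}\le N/A$, so at least $(1-1/A)N$ of the numbers $n \le N$ satisfy

$$r(n)\;\ge\;\lfloor \log_A N\rfloor - 1\;>\;\frac{\log_{10}N}{\log_{10}A}-2.$$

Under this assumption, most numbers up to $N$ therefore need at least a constant fraction, roughly $1/\log_{10}A$, of their decimal length, so the average saving is at most a constant factor.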

JoshuaZ