"Undoing" an integer wraparound

Question

I ran into an interesting theoretical problem a number of years ago. I never found a solution, and it continues to haunt me when I sleep.

Suppose you have a (C#) application that holds some number in an int, called x. (The value of x is not fixed). When the program is run, x is multiplied by 33 and then written to a file.

Basic source code looks like this:

int x = getSomeInt();
x = x * 33;
file.WriteLine(x); // Writes x to the file in decimal format

Some years later, you discover that you need the original values of X back. Some cases are simple: just divide the number in the file by 33. In other cases, however, X was large enough that the multiplication caused an integer overflow. According to the docs, C# discards the high-order bits that do not fit in the 32-bit int. Is it possible, in this case, to either:

Recover X itself or
Recover a list of possible values for X?

It seems to me (though my logic could certainly be flawed) that one or both should be possible, since the simpler case of addition works (Essentially if you add 10 to X and it wraps, you can subtract 10 and wind up with X again) and multiplication is simply repeated addition. Also helping (I believe) is the fact that X is multiplied by the same value in all cases - a constant 33.

This has been dancing around my skull at odd moments for years. It'll occur to me, I'll spend some time trying to think through it, and then I'll forget about it for a few months. I'm tired of chasing this problem! Can anyone offer insight?

(Side note: I really don't know how to tag this one. Suggestions welcome.)

Edit: Let me clarify that if I can get a list of possible values for X, there are other tests I could do to help me narrow it down to the original value.

Something along the line of http://en.wikipedia.org/wiki/Modular_multiplicative_inverse — rwong, Jul 23 '14 at 06:05
Yup, and Euler's method seems particularly effective since the factorization of m is just 2^32 or 2^64, plus the exponentiation of a modulo m is straightforward (just ignore overflow there) — MSalters, Jul 23 '14 at 07:54
I think the particular problem is in fact Rational Reconstruction — MSalters, Jul 23 '14 at 07:57
@kevincline Do you mind telling me why my answer is not just sub-optimal (i agree with that) but is wrong? — v010dya, Jul 23 '14 at 08:06
@MSalters: No, that's where you have r*s^-1 mod m and you need to find both r and s. Here, we have r*s mod m and we know everything but r. — user2357112, Jul 23 '14 at 11:17
Why did you multiply these numbers by 33 if you were going to get an overflow, anyway? Was the overflow an accident? When I imagine cases where you'd do it deliberately, most of them already depend on knowledge of modular arithmetic. — user2357112, Jul 23 '14 at 11:27
@user2357112 it could have been a logic bug, changing requirements, or simply someone else's code (and who knows why other people do things!) In my case, it was the last one. — Xcelled, Jul 23 '14 at 14:44
@Xcelled194: It might have been (learned from) part of the DJB hash function. It might have been better than nothing back when a dedicated multiplication unit was not a standard feature on CPUs. http://stackoverflow.com/questions/1579721/why-are-5381-and-33-so-important-in-the-djb2-algorithm — rwong, Jul 23 '14 at 16:41
@rwong wow... You learn something every day! I went back and looked at the old code, and sure enough it was DJB... — Xcelled, Jul 23 '14 at 19:30

user2357112 · Accepted Answer · 2014-07-23T08:16:00.197

50

Multiply by 1041204193.

When the result of a multiplication doesn't fit in an int, you won't get the exact result, but you will get a number equivalent to the exact result modulo 2**32. That means that if the number you multiplied by was coprime to 2**32 (which just means it has to be odd), you can multiply by its multiplicative inverse to get your number back. Wolfram Alpha or the extended Euclidean algorithm can tell us 33's multiplicative inverse modulo 2**32 is 1041204193. So, multiply by 1041204193, and you have the original x back.
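As a rough illustration of the paragraph above, here is a minimal Python sketch that simulates the 32-bit wraparound and then undoes it. Python integers do not overflow, so the wrap is emulated with a mask; the helper names `to_int32` and `inverse_mod` are made up for this example and are not from the original post.

```python
MASK = 0xFFFFFFFF  # keep the low 32 bits, as a wrapping 32-bit multiply would


def to_int32(n):
    """Interpret the low 32 bits of n as a signed 32-bit integer."""
    n &= MASK
    return n - (1 << 32) if n & 0x80000000 else n


def inverse_mod(a, m):
    """Extended Euclidean algorithm: return a^-1 mod m, assuming gcd(a, m) == 1."""
    old_r, r = a, m
    old_s, s = 1, 0
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError("a is not invertible modulo m")
    return old_s % m


INV33 = inverse_mod(33, 1 << 32)  # 1041204193, the constant quoted above

for x in (42, 1234567890, 591751049, -7):
    stored = to_int32(x * 33)             # what the original program wrote
    recovered = to_int32(stored * INV33)  # multiply by the inverse, wrap again
    print(x, stored, recovered)
```

The first value does not overflow, the second and third do, and the last one is negative; in every case the recovered value should equal the original x.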

If we had, say, 60 instead of 33, we wouldn't be able to recover the original number, but we would be able to narrow it down to a few possibilities. By factoring 60 into 4*15, computing the inverse of 15 mod 2**32, and multiplying by that, we can recover 4 times the original number, leaving only 2 high-order bits of the number to brute-force. Wolfram Alpha gives us 4008636143 for the inverse, which doesn't fit in an int, but that's okay. We just find a number equivalent to 4008636143 mod 2**32, or force it into an int anyway to have the compiler do that for us, and the result will also be an inverse of 15 mod 2**32. (We get -286331153.)
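The even-multiplier case can be sketched the same way. The following Python snippet is only an illustration of the steps described above for 60; the function name `candidates_for_60` is hypothetical, and candidates are returned as unsigned 32-bit bit patterns.

```python
MASK = 0xFFFFFFFF
INV15 = 4008636143  # inverse of 15 modulo 2**32, as quoted above


def candidates_for_60(stored):
    """Return every 32-bit x with (x * 60) mod 2**32 == stored (unsigned patterns)."""
    stored &= MASK
    four_x = (stored * INV15) & MASK  # equals 4*x modulo 2**32
    if four_x % 4:
        return []                     # not a multiple of 4: no x can produce it
    low30 = four_x >> 2               # the low 30 bits of x are determined
    # Only the top 2 bits of x are unknown, so try all four settings.
    return [low30 | (top << 30) for top in range(4)]


x = 3_000_000_001
stored = (x * 60) & MASK
print(candidates_for_60(stored))
```

Each stored value therefore has either no preimage or exactly four, and the original x is among the four.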

edited Jul 23 '14 at 08:16

answered Jul 23 '14 at 07:04

user2357112

Oh boy. So all the work that my computer has done building the map was already done by Euclid. – v010dya Jul 23 '14 at 07:53
I like the matter-of-fact-ness in your first sentence. "Oh, it's 1041204193, of course. Don't you have that memorized?" :-P – Doorknob Jul 23 '14 at 08:30
Finally, a solid and to-the-point answer. Reading the other answers you'd think modular arithmetic is a lost art.. +1 – Thomas Jul 23 '14 at 11:35
It would be helpful to show an example of this working for a couple numbers, such as one where x*33 didn't overflow and one where it did. – Rob Watts Jul 23 '14 at 15:33
Mind blown. Wow. – Michael Gazonda Jul 23 '14 at 16:47
You need neither Euclid nor WolframAlpha (certainly!) to find the inverse of 33 modulo $2^{32}$. Since $x=32=2^5$ is nilpotent (of order $7$) modulo $2^{32}$, you can just apply the geometric series identity $(1+x)^{-1}=1-x+x^2-x^3+\cdots+x^6$ (after which the series breaks off) to find the number $33^{-1}=1-2^5+2^{10}-2^{15}+\cdots+2^{30}$ which is $111110000011111000001111100001_2=1041204193_{10}$. – Marc van Leeuwen Jul 23 '14 at 16:59
1041204193! How could I forget??! No seriously, this is an amazing answer. I'll accept it, but what would really cap it is an example, like @RobWatts said. Consider editing for the benefit of future readers! :) – Xcelled Jul 23 '14 at 18:42
I saw this question in the sidebar and was going to give to answer .. so take yet an other upvote – Jul 23 '14 at 21:05
@RobWatts: I'm not sure how to make an example properly illustrative. If I say "1234567890 * 33 gives 40740740370, which truncates to 2086034706, and 2086034706 * 1041204193 gives 2171988082630722258, which truncates to 1234567890", the numbers seem like magic. If I try smaller numbers, like 4-bit integers, it's easier to follow along with the computations, but the reason why any of it works still seems like magic. – user2357112 Jul 24 '14 at 00:12
@Xcelled194: See reply to Rob Watts. Do you have any suggestions for examples? – user2357112 Jul 24 '14 at 00:18
@user2357112 your idea of 4 bit numbers seems good. Maybe if you "promoted" it to an 8 bit int to show the intermediate result before the truncation? – Xcelled Jul 24 '14 at 02:49
You can't "brute-force" the problem for an even number like 60. There will be either zero or four solutions to 60x = y mod 2^32. – kevin cline Jul 24 '14 at 16:44
@kevincline: The OP says that if he can narrow the possibilities down, there are other tests he can do to tell which one is correct. He would use those tests, not modular arithmetic, to brute-force search for the correct answer out of the narrowed-down list. – user2357112 Jul 24 '14 at 16:47
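
The geometric-series identity mentioned in the comments is easy to check numerically. This is a small Python sketch of that computation, not part of any of the original comments; the variable names are chosen for the example.

```python
M = 1 << 32
x = 32  # 33 = 1 + x, and x**7 = 2**35 is 0 modulo 2**32

# (1 + x)^-1 = 1 - x + x^2 - ... + x^6, because every later term vanishes mod 2**32
inverse = sum((-x) ** k for k in range(7)) % M

print(inverse, bin(inverse))
print((33 * inverse) % M)  # a true inverse gives 1 here
```

The sum has only seven terms, so the inverse can be worked out by hand with shifts and additions.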

v010dya · Answer 2 · 2014-07-23T06:00:46.160

6

This may be better suited as a question for Math (sic) SE. You are basically dealing with modular arithmetic, since dropping the left-most bits is the same thing as reducing modulo a power of two.

I am not as good at maths as the people on Math (sic) SE, but I will try to answer.

What we have here is that the number is being multiplied by 33 (3*11), and its only common divisor with your modulus is 1. That is because the bits in a computer represent powers of two, and thus your modulus is some power of two.

You will be able to construct a table where, for every previous value, you calculate the following value. The question then becomes whether each following number corresponds to only one previous one.

If it were not 33, but a prime or some power of a prime, I believe that the answer would be yes, but in this case… ask on Math.SE!

Programmatic test

This is in C++ because I don't know C#, but the concept still holds. This seems to show that you can:

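#include <iostream>
#include <map>

int main(void)
{
    unsigned short count = 0;
    unsigned short x = 0;
    // Maps each product (x*33, truncated to unsigned short) to the x that produced it.
    // A stored 0 means "not seen yet". The comments assume a 16-bit unsigned short.
    std::map<unsigned short, unsigned short> nextprev;

    // Pre-fill every key with 0. ++x wraps to 0 after 65535, which ends the loop.
    nextprev[0] = 0;
    while(++x) nextprev[x] = 0;

    unsigned short nextX;
    // Visit x = 1..65535; x = 0 is skipped because 0 doubles as the "not seen" marker.
    while(++x)
    {
        nextX = x*33; // converting back to unsigned short truncates modulo 2^16
        if(nextprev[nextX])
        {
            // Two different x values produced the same product.
            std::cout << nextprev[nextX] << "*33==" << nextX << " && " << x << "*33==" << nextX << std::endl;
            ++count;
        }
        else
        {
            nextprev[nextX] = x;
            //std::cout << x << "*33==" << nextX << std::endl;
        }
    }

    std::cout << count << " collisions found" << std::endl;

    return 0;
}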

After populating such a map, you will always be able to get the previous X if you know the next one. Each next value has exactly one previous value.
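
For comparison, the same lookup-table experiment can be sketched in a few lines of Python. This is an illustrative port under the assumption of a 16-bit width, not the answerer's code; `prev_of` plays the role of the `nextprev` map.

```python
BITS = 16
M = 1 << BITS

# prev_of[y] = the x with (x * 33) mod 2**16 == y
prev_of = {}
collisions = 0
for x in range(M):
    y = (x * 33) % M
    if y in prev_of:
        collisions += 1  # a second x produced an already-seen product
    else:
        prev_of[y] = x

print(collisions, "collisions;", len(prev_of), "distinct results")
```

With no collisions, every 16-bit result has exactly one preimage, so the table doubles as an inverse lookup. A full 32-bit table would need about four billion entries, which is why the multiplicative inverse is the more practical route.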

edited Jul 23 '14 at 06:00

answered Jul 23 '14 at 04:37

v010dya

Why would working with a non-negative datatype be easier? Aren't signed and unsigned handled the same way in the computer, only their human output format differs? – Xcelled Jul 23 '14 at 05:10
@Xcelled194 Well, it's easier for me to think about these numbers. – v010dya Jul 23 '14 at 05:15
Fair enough xD The human factor~ – Xcelled Jul 23 '14 at 05:16
I have removed that statement about non-negative to make it more obvious. – v010dya Jul 23 '14 at 05:17
@Xcelled194: Unsigned datatypes follow the usual rules of modular arithmetic; signed types do not. In particular, maxval+1 is 0 only for unsigned types. – MSalters Jul 23 '14 at 07:45
@MSalters but if you look at what is happening on the level of bits, then everything is the same. Maximum for bits in one byte is 11111111, when you add 1 you get 0, and it doesn't matter if you write that as 2^8-1 or as -1. – v010dya Jul 23 '14 at 07:51
@Volodya: For addition and subtraction. But -1/5 is 0 and 255/5 is 51. – MSalters Jul 23 '14 at 08:12
@MSalters Actually multiplication, and modular division work as well. Let's try: 11111111*101=(100)11111011, the result can be either 1275 truncated if you use unsigned or -5 untruncated if you are using signed. In the future please test your statements before giving a comment. – v010dya Jul 23 '14 at 20:10

score 2 · Answer 3 · answered Jul 23 '14 at 04:49

One way to get it is to use brute force. Sorry I don't know C# but the following is c-like pseudo code to illustrate the solution:

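/* Sketch: needs <stdint.h> and <stdio.h>; test_value is the stored number as a uint32_t. */
uint32_t x = 0;
do {
    /* Unsigned multiplication wraps modulo 2^32, which is well defined in C. */
    if ((uint32_t)(x * 33u) == test_value) {
        printf("%d\n", (int)(int32_t)x);
    }
} while (++x != 0); /* stops once x wraps from UINT32_MAX back to 0 */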

Technically, what you need is (x*33) mod 2^32 == test_value, but fixed-width integer overflow will do the mod operation for you unless your language uses arbitrary-precision integers (bigint).

What this gives you is a series of numbers that may have been the original number. The first number printed would be the number that would generate one round of overflow. The second number would be the number that would generate two rounds of overflow. And so on..

So, if you know your data better you can make a better guess. For example, common clock maths (overflow every 12 o'clock) tends to make the first number more likely since most people are interested in things that happened today.

answered Jul 23 '14 at 04:49

slebetman

C# behaves like C with basic types - ie int is a 4 byte signed integer that wraps, so your answer is still good, though brute forcing wouldn't be the best way to go if you have a lot of inputs! :) – Xcelled Jul 23 '14 at 05:07
Yeah, I tried doing it on paper with modulo algebra rules from here: http://math.stackexchange.com/questions/346271/rules-for-algebra-equations-involving-modulo-operations. But I got stuck trying to figure it out and ended up with a brute-force solution :) – slebetman Jul 23 '14 at 05:10
Interesting article, though I'll have to study it a bit more in depth for it to click, I think. – Xcelled Jul 23 '14 at 05:12
@slebetman Look at my code. It seems that there is only a single answer when it comes to multiplying by 33. – v010dya Jul 23 '14 at 05:19
@Volodya: Yeah. I just tested it on my machine. It does look like there's only a single answer to 33. I noticed that most odd numbers only give single answers and even numbers give multiple answers. I wonder what are the rules for multiple/single answers :) – slebetman Jul 23 '14 at 05:36
Correction: C int is not guaranteed to wrap around (see your compiler's docs). It's true for unsigned types though. – Thomas Eding Jul 23 '14 at 05:41
@slebetman Yes, it's the modular arithmetic, i did mention it in my answer. I think it has to do with common denominators. So if we were to use ternary rather than binary machines, then 33 would not work, but 46 would. – v010dya Jul 23 '14 at 05:57
@ThomasEding It's also not guaranteed to be 4 bytes wide. – Pharap Jul 23 '14 at 15:49

usr · Answer 4 · 2014-07-23T11:38:27.893

You could use the SMT solver Z3 and ask it for a satisfying assignment for the formula x * 33 = valueFromFile. It will invert that equation for you and give you all possible values of x. Z3 supports exact bitvector arithmetic including multiplication.

    public static void InvertMultiplication()
    {
        int multiplicationResult = new Random().Next();
        int knownFactor = 33;

        using (var context = new Context(new Dictionary<string, string>() { { "MODEL", "true" } }))
        {
            uint bitvectorSize = 32;
            var xExpr = context.MkBVConst("x", bitvectorSize);
            var yExpr = context.MkBVConst("y", bitvectorSize);
            var mulExpr = context.MkBVMul(xExpr, yExpr);
            var eqResultExpr = context.MkEq(mulExpr, context.MkBV(multiplicationResult, bitvectorSize));
            var eqXExpr = context.MkEq(xExpr, context.MkBV(knownFactor, bitvectorSize));

            var solver = context.MkSimpleSolver();
            solver.Assert(eqResultExpr);
            solver.Assert(eqXExpr);

            var status = solver.Check();
            Console.WriteLine(status);
            if (status == Status.SATISFIABLE)
            {
                Console.WriteLine(solver.Model);
                Console.WriteLine("{0} * {1} = {2}", solver.Model.Eval(xExpr), solver.Model.Eval(yExpr), solver.Model.Eval(mulExpr));
            }
        }
    }

Output looks like this:

SATISFIABLE
(define-fun y () (_ BitVec 32)
  #xa33fec22)
(define-fun x () (_ BitVec 32)
  #x00000021)
33 * 2738875426 = 188575842

score 0 · Answer 5 · answered Jul 23 '14 at 05:35

Undoing that result will give you a non-zero, finite number of candidates (normally infinite, but int is a finite subset of ℤ). If this is acceptable, just generate the numbers (see other answers).

Otherwise you need to maintain a history (of finite or infinite length) of the variable's values.

score 0 · Answer 6 · answered Jul 25 '14 at 22:01

As always, there is a solution from a scientist and a solution from an engineer.

Above you will find a very good solution from a scientist, which always works, but requires you to calculate a “multiplicative inverse”.

Here is a quick solution from an engineer, which will not force you to try all possible integers.

val multiplier = 33 //used with 0x23456789
val problemAsLong = (-1947051863).toLong & 0xFFFFFFFFL

val overflowBit = 0x100000000L
for(test <- 0 until multiplier) {
  if((problemAsLong + overflowBit * test) % multiplier == 0) {
    val originalLong = (problemAsLong + overflowBit * test) / multiplier
    val original = originalLong.toInt
    println(s"$original (test = $test)")
  }
}

What are the ideas?

1. We got overflow, so let’s use larger types to recover (Int -> Long).
2. We probably lost some bits due to overflow, so let’s recover them.
3. The overflow was not more than Int.MaxValue * multiplier.

Full executable code is located on http://ideone.com/zVMbGV

Details:

val problemAsLong = (-1947051863).toLong & 0xFFFFFFFFL
Here we convert our stored number to Long, but since Int and Long are signed, we have to do it correctly.
So we limit the number using bitwise AND with bits of Int.
val overflowBit = 0x100000000L
This bit, or a multiple of it, could have been lost in the initial multiplication.
It's a first bit outside of Int range.
for(test <- 0 until multiplier)
According to the 3rd idea, the maximal overflow is limited by the multiplier, so don’t try more than we really need.
if((problemAsLong + overflowBit * test) % multiplier == 0)
Check if by adding possibly lost overflow we come to a solution
val original = originalLong.toInt
The original problem was in Int range, so let’s return to it. Otherwise we could incorrectly recover numbers that were negative.
println(s"$original (test = $test)")
Don’t break after the first solution, because there could be other possible solutions.

PS: The 3rd idea is not strictly correct, but I left it that way to keep it understandable.
Int.MaxValue is 0x7FFFFFFF, but maximal overflow is 0xFFFFFFFF * multiplier.
So the correct text would be “The overflow was not more than -1 * multiplier”.
This is correct, but not everybody will understand it.
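
For readers who want to compare the two approaches, here is a rough Python sketch of the same search. It is an illustrative port, not the code behind the ideone link; the function name `recover_by_search` is made up for this example.

```python
def recover_by_search(stored, multiplier=33):
    """Try each possible count of lost 2**32 multiples; return signed 32-bit candidates."""
    as_unsigned = stored & 0xFFFFFFFF  # reinterpret the stored Int as unsigned
    results = []
    for lost in range(multiplier):
        total = as_unsigned + (lost << 32)  # add back the bits that may have been dropped
        if total % multiplier == 0:
            x = total // multiplier
            # Map the unsigned result back into the signed 32-bit range.
            results.append(x - (1 << 32) if x >= 1 << 31 else x)
    return results


print(recover_by_search(-1947051863))
# Cross-check against multiplying by the inverse of 33 modulo 2**32:
print((-1947051863 * 1041204193) & 0xFFFFFFFF)
```

Both lines should report the same original value, 0x23456789. The search needs at most 33 iterations per stored value, while the inverse needs a single multiplication.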