Semantic Code Comparison

Question

Consider two pieces of code that do the same thing, with the same order of time and memory usage, but not in exactly the same way. Is there any way for a program to recognize these two pieces of code as the same?

For Example:

Change Vars 1

a = a + b;
b = a - b;
a = b - a;

Change Vars 2

int temp = a;
a = b;
b = temp;

Or consider two bubble sort implementations that sort in ascending order, one starting from the end of the array and one from the beginning.

One idea is to execute both pieces of code and then check whether the memory state is the same afterwards. But that is not very comprehensive.

Edit: My focus is on Turing complete languages mostly.

I know the term "functional equivalence", although you'd usually just say "equivalence". Note that with side effects, things get tougher. — Raphael, Aug 06 '14 at 07:23
Quick answer: if the syntax grammar of the code is powerful enough that it is Turing complete, then you cannot solve this problem (if you could, you could solve the halting problem; it takes just a bit of thought to see why). If the syntax grammar is restricted enough, you can check for equivalence. They should have taught you this in some third or fourth year CS undergrad program under the title theory of computation. — InformedA, Aug 06 '14 at 07:30
There is no way to programmatically check for equivalence between two snippets of code, but you can do an approximate check of equivalence by using test cases, and if all possible input values are bounded you could check all input values within those bounds. Otherwise you have to prove the equivalence mathematically on a case-by-case basis. — Fors, Aug 06 '14 at 10:15
Thank You All, I am interested only in Turing complete languages & algorithms. Is there any proof that it cannot be done? I can perceive the reason that @randomA brought up, But I need to present solid proof. — Makan, Aug 06 '14 at 12:06
Equivalence is already undecidable for context-free grammars, so it's also undecidable for Turing machines. — G. Bach, Aug 06 '14 at 12:21
Take a look at bisimulation. I know you are not comparing state machines; however, this should lead you to other references that can help you understand what you seek. — Guy Coder, Aug 06 '14 at 13:28
Much confusion in these comments. Undecidability results only state there is no general method to prove equivalence of two pieces of code. But techniques can be developed that are decidable for large subsets, and it can be very decidable whether your pieces belong to such subsets. @Raphael has a good collection of pointers to the handling of undecidable problems. Testing can only show they do different things, not that they do not. The only way is proof, which may first require formal semantics. — babou, Aug 06 '14 at 13:30
@MakanTayebi What is your question, then? After a word for this property, or after algorithms deciding it? — Raphael, Aug 06 '14 at 13:33
@GuyCoder Maybe means the reference questions and material linked from them? — Raphael, Aug 06 '14 at 14:15
@GuyCoder Yes I mean the reference questions, in particular http://cs.stackexchange.com/questions/1477 . But I confused intractability and undecidability ... though there could be a similar one for undecidability. But I saw recently a comment by Raphael with other pointers. — babou, Aug 06 '14 at 14:34
@babou Makan explicitly asked for a program that decides equivalence of arbitrary programs, I don't think saying "that's impossible" is confusing (or misrepresenting) the issue. — G. Bach, Aug 06 '14 at 19:53
@G.Bach Makan did not specify arbitrary. He gave two examples and asked whether it can be done. It sometimes can. He does not explicitly ask whether it can be done uniformly and universally. Answering that there is no universal solution is quite correct, but the question does leave the door open for solutions in specific types of programs. Saying as in third comment "there is no way to programmatically check for equivalence ..." is just wrong. There are ways, but they are not universal, as stated at the end of the comment. Undecidability is not the end of the world, or of research. — babou, Aug 06 '14 at 22:04
@babou I did say this: "If the syntax grammar is restricted enough, you can check for equivalence". and G. Bach did mention context-free-grammar. I think we are aware of the possibility for some subsets of programs whose equivalence is decidable. Saying our comments have much confusion is quite harsh. — InformedA, Aug 07 '14 at 03:57
@randomA I agree that you did. My statement was not an assessment of the knowledge of anyone, but of the information passed to readers, which I think is what matters. G.Bach, and any who answered, certainly know it can be done in specific cases, but the gist of his answer is discouraging. It should be stated as a limitation, not as an impossibility. Your own answer was much better in that respect, except for your reference to syntax: this is a semantics problem. I overreacted in my answer by ignoring the issue of decidability. I corrected this (with no hope for a perfect answer :). Thanks — babou, Aug 07 '14 at 08:41
@Raphael considering that there is no absolute answer to this, I was after a proof that shows this is impossible. In order to defend my work against questions about my project later. — Makan, Aug 07 '14 at 08:52
What is your work, and what kind of questions do you fear? Did you try using Rice's theorem as I suggested? — babou, Aug 07 '14 at 09:45
@MakanTayebi Your question is so vague that "it's impossible" is as wrong an answer as any. You need to be clearer about what you want, what your hypothesis is and how you justify it. — Raphael, Aug 07 '14 at 12:21
@MakanTayebi I would not want to sound insistent, but I took the trouble to write you a fairly detailed answer, without any reaction on your part. I also asked you explicitly for more details on what your actual problem is, so as to possibly adapt the answer, but you do not respond. I find that, let's say, surprising. We do not ask much for providing information, but recognition of our attempts to help seems a minimum. It is not as if there were tens of answers as sometimes happens on SE. — babou, Aug 15 '14 at 07:43
I am truly thankful. Since I have not been taught these topics in class, or don't remember much about them, I am a slow learner at the moment. Your answer is of course a comprehensive one. — Makan, Aug 15 '14 at 11:52
No problem. I was only wondering whether it had been useful, as you said you have to defend your work. Basically, considerable work has been done on this or similar topics, beyond my own knowledge. — babou, Aug 15 '14 at 13:44

babou · Accepted Answer · 2014-08-07T08:35:46.243

I have two pieces of news for you. I will start with the bad one.

The dark side of the question

Computation theory tells us that checking whether two programs, or program fragments, are equivalent is not decidable.

What that means is only that there is no single technique that can check the equivalence of any pair of programs. This remains true if you consider a single programming language, as long as it is Turing complete. (Note: I do not understand what you mean by Turing complete algorithms in your comment. By the way, clarifications should be edited into the question rather than left in comments.)

No single technique also means no finite set of techniques, since a finite set could be applied simultaneously and would thus amount to a single technique. It also means no infinite set of techniques that is finitely describable, etc.

This can be formally proved for Turing machines with Rice's theorem (which is a bit subtle to use). The proof can be tediously transposed to any other Turing complete formalisation of computation. But invoking the Church-Turing thesis is usually considered enough.

To summarize, there is no way you can produce a system that will take two arbitrary program fragments of a Turing complete language and tell you whether they are semantically equivalent, i.e. whether their computation results are the same.
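Since a solid proof was asked for in the comments, here is a minimal sketch of the standard reduction from the halting problem, written as Python. The names `build_pair`, `halts` and `equivalent` are hypothetical and introduced only for illustration. The point is that `equivalent` cannot exist as a total, always-correct function: if it did, `halts` would decide the halting problem.

```python
from typing import Callable, Tuple


def build_pair(program: Callable, argument) -> Tuple[Callable, Callable]:
    """Build two programs that are equivalent exactly when
    program(argument) halts."""

    def p1(x):
        program(argument)  # never returns if program loops on argument
        return 0

    def p2(x):
        return 0  # always halts and returns 0

    return p1, p2


def halts(program: Callable, argument, equivalent: Callable) -> bool:
    """Would decide the halting problem, given an equivalence decider.

    `equivalent` is the hypothetical decider. Because the halting
    problem is undecidable, no such decider can exist.
    """
    p1, p2 = build_pair(program, argument)
    return equivalent(p1, p2)
```

If `program` halts on `argument`, then `p1` and `p2` both return 0 on every input, so they are equivalent. If it does not halt, `p1` never returns while `p2` does, so they differ. The sketch ignores exceptions and side effects, which a careful proof would have to rule out.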

But do not despair, there is hope.

The bright side of the question

While the above is true when you make the question so general, it does not mean that this can never be done. Actually, this problem, or problems very close to it, is the object of considerable research. The undecidability statement should only be seen as a limitation on what is to be expected, but there is considerable room within that limitation.

So, there are many situations in which it is possible to apply a procedure that will decide whether two (fragments of) programs are equivalent. The applicability of such a procedure can be defined by a language (that is not Turing complete), or by some limitation on the computational power such fragments can express (so that they do not have to be in the same language).

Much of the research related to type theory also concerns provability of program properties, and can lead to answers to your question. But that is well outside my competence.

Many other techniques have been developed for your purpose.

About your examples

Your idea of running both codes and comparing results is a good one, at least in simple cases, like your example.

But you have to run the code symbolically, and then use a symbolic computation system to check that the answers are indeed the same.

So, assume that initially a==$a$ and b==$b$, where I use italics for symbolic expressions, i.e. unevaluated formulae. The symbols $a$ and $b$ just stand for themselves, and have no associated value.

Running the first code:

a = a + b; -- so a==$a+b$
b = a - b; -- so b==$(a+b)-b$
a = a - b; -- so a==$(a+b)-((a+b)-b)$

Recall, again, that what is in italics is just symbolic expressions, trees if you prefer. There is nothing to be computed.

Running the second code:

int temp = a; -- so temp==$a$
a = b; -- so a==$b$
b = temp; -- so b==$a$

Now you give these results to a symbolic calculator that checks that the values of variables a and b are the same at the end. It must be able to simplify expressions such as $(a+b)-b$ and $(a+b)-((a+b)-b)$ to $a$ and $b$ respectively, which requires using known algebraic properties of the operators $+$ and $-$.
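The symbolic run above can be sketched in a few lines of Python. This is only an illustration under a strong assumption: expressions are restricted to sums and differences of symbols, so each one can be normalized to a table of integer coefficients, and `+` and `-` behave like ideal integer arithmetic (overflow is ignored). The helper names `sym`, `add` and `sub` are made up for this sketch.

```python
def sym(name):
    """A symbolic variable, stored as the linear form 1*name."""
    return {name: 1}


def add(x, y):
    """Symbolic x + y, normalized by dropping zero coefficients."""
    out = dict(x)
    for symbol, coeff in y.items():
        out[symbol] = out.get(symbol, 0) + coeff
    return {s: c for s, c in out.items() if c != 0}


def sub(x, y):
    """Symbolic x - y."""
    return add(x, {s: -c for s, c in y.items()})


def swap_arith(a, b):
    """The corrected first code: a = a + b; b = a - b; a = a - b."""
    a = add(a, b)
    b = sub(a, b)
    a = sub(a, b)
    return a, b


def swap_arith_question(a, b):
    """The first code as posted in the question (last line a = b - a)."""
    a = add(a, b)
    b = sub(a, b)
    a = sub(b, a)
    return a, b


def swap_temp(a, b):
    """The second code: swap through a temporary."""
    temp = a
    a = b
    b = temp
    return a, b


A, B = sym("a"), sym("b")
print(swap_arith(A, B) == swap_temp(A, B))           # True
print(swap_arith_question(A, B) == swap_temp(A, B))  # False
```

Because every expression is reduced to the same normal form, comparing the final symbolic values is just dictionary equality. The version from the question leaves a==$-b$, which is why it is not equivalent to the swap with a temporary.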

It is actually a good technique (when applied properly; I goofed on my first try), as it allowed me to notice that your two codes were not equivalent, and I corrected the first one.

Running the code symbolically is an example of a general paradigm called abstract interpretation. The "casting out nines" test is a very elementary example of these techniques.

Symbolic evaluation and abstract interpretation are well-studied ways of proving things about programs.

It is pretty much what is done by type checkers.

But it is far from the whole story about proving things about programs. Large systems are being developed to prove properties of programs by different means.

In other cases (probably for your sorting examples), it may be better to have a specification of what the program is supposed to do and to prove separately that each of the two pieces of code conforms to that specification. This avoids having to consider the specifics of two algorithms simultaneously.
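To make the idea of a shared specification concrete, here is a small Python sketch for the two bubble sorts. The specification is "the output is in ascending order and is a permutation of the input". Note the important caveat: the code only tests the specification exhaustively on short arrays over a few values, which is a bounded check and not a proof. A real proof would establish loop invariants, for instance in a program verifier. The function names and the bounds are assumptions chosen for the example.

```python
from collections import Counter
from itertools import product


def bubble_from_start(xs):
    """Bubble sort that pushes the largest element toward the end."""
    xs = list(xs)
    n = len(xs)
    for i in range(n):
        for j in range(0, n - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs


def bubble_from_end(xs):
    """Bubble sort that pulls the smallest element toward the front."""
    xs = list(xs)
    n = len(xs)
    for i in range(n):
        for j in range(n - 1, i, -1):
            if xs[j - 1] > xs[j]:
                xs[j - 1], xs[j] = xs[j], xs[j - 1]
    return xs


def meets_spec(inp, out):
    """Specification: out is ascending and is a permutation of inp."""
    ordered = all(out[k] <= out[k + 1] for k in range(len(out) - 1))
    return ordered and Counter(inp) == Counter(out)


def check_against_spec(sort_fn, max_len=4, values=(0, 1, 2)):
    """Return the first input violating the spec, or None if none is found
    among all arrays of length <= max_len over the given values."""
    for n in range(max_len + 1):
        for inp in product(values, repeat=n):
            if not meets_spec(inp, sort_fn(inp)):
                return inp
    return None


print(check_against_spec(bubble_from_start))  # None
print(check_against_spec(bubble_from_end))    # None
```

Each sort is checked against the specification on its own, so the two loops never have to be compared with each other. Since the specification determines the output uniquely for sequences of integers, two programs that both satisfy it are equivalent on those inputs.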

P.S. It is said that undecidability was invented by scientists to make sure their jobs remain forever. We will never have a universal solution, but there will always be ways to work out new techniques that give solutions where we did not have one before. The only risk is that computers may become better at doing it than we are. — babou, Aug 07 '14 at 12:14
ad bright side: there are also procedures without restrictions on the input; their limitation is that they may not terminate. That's typically how model checkers work ("keep looking for a counter example"). — Raphael, Aug 07 '14 at 12:23
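The "keep looking for a counter example" idea from the comment above can be sketched as a search that enumerates inputs in growing bounds. This is only a toy illustration and not how an actual model checker is implemented. The function `find_counterexample` and its `max_bound` parameter are assumptions for this sketch. With `max_bound=None` the search stops only if the two functions differ somewhere, which is exactly the non-termination limitation mentioned.

```python
from itertools import count


def find_counterexample(f, g, max_bound=None):
    """Search integer pairs (a, b) in growing squares and return the first
    pair on which f and g disagree. With max_bound=None the search runs
    forever when f and g agree on all integer pairs."""
    bounds = count(0) if max_bound is None else range(max_bound + 1)
    for n in bounds:
        for a in range(-n, n + 1):
            for b in range(-n, n + 1):
                if max(abs(a), abs(b)) != n:
                    continue  # already covered by a smaller bound
                if f(a, b) != g(a, b):
                    return (a, b)
    return None


def swap_question(a, b):
    """First code as posted in the question (last line a = b - a)."""
    a = a + b
    b = a - b
    a = b - a
    return a, b


def swap_temp(a, b):
    """Second code: swap through a temporary."""
    temp = a
    a = b
    b = temp
    return a, b


witness = find_counterexample(swap_question, swap_temp)
print(witness)  # some pair on which the two codes give different results
print(find_counterexample(swap_temp, swap_temp, max_bound=3))  # None
```

Finding a witness proves the two codes are different. Finding none within a bound proves nothing, in line with the earlier comment that testing can only show that programs differ.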
