Floating point in RE intermediate languages like Vine IL, BAP IL, and Google/zynamics REIL

Question

Are there any technical hurdles to implementing floating point support in RE-oriented intermediate languages? I ask because none seem to support it, yet few explain why. The only comment on the topic I've seen is from Sebastian Porst, who in 2010 said only:

REIL is primarily made to find security-relevant bugs in code. FPU code pretty much never plays a role in such code.

score 2 · Accepted Answer · answered Aug 01 '14 at 22:55

Floating point support is possible. I think there are two reasons why it's not common:

1. Most applications of binary ILs don't work with floating point. For example, most SMT solvers only have support for integer arithmetic operations. Modeling behavior is not very useful if one cannot reason about it.
2. There are not many mature libraries for arbitrary precision floating point code that can be readily pulled into these projects.
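To make "possible" a bit more concrete, here is a minimal sketch of how one floating point operation (a less-than comparison on IEEE 754 binary32 values) can be expressed with nothing but integer and bit operations, which is the kind of lowering an integer-only IL would need. The helper names (`f32_bits`, `f32_fields`, `f32_is_nan`, `f32_lt`) are made up for this illustration and do not come from any of the ILs mentioned; Python's `struct` module is used only to obtain the bit patterns.

```python
import struct

def f32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned 32-bit int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def f32_fields(bits: int):
    """Split a binary32 bit pattern into (sign, biased exponent, fraction)."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

def f32_is_nan(bits: int) -> bool:
    """NaN: exponent field all ones and a non-zero fraction."""
    _, exponent, fraction = f32_fields(bits)
    return exponent == 0xFF and fraction != 0

def f32_lt(a_bits: int, b_bits: int) -> bool:
    """Compute a < b on binary32 bit patterns using integer operations only."""
    # Any comparison involving NaN is false.
    if f32_is_nan(a_bits) or f32_is_nan(b_bits):
        return False
    # +0.0 and -0.0 compare equal.
    if ((a_bits | b_bits) & 0x7FFFFFFF) == 0:
        return False
    a_sign = a_bits >> 31
    b_sign = b_bits >> 31
    if a_sign != b_sign:
        return a_sign == 1          # negative < positive
    if a_sign == 0:
        return a_bits < b_bits      # both positive: order follows the bits
    return a_bits > b_bits          # both negative: larger magnitude is smaller

print(f32_fields(f32_bits(1.0)))             # (0, 127, 0)
print(f32_lt(f32_bits(-2.5), f32_bits(1.0)))  # True
```

Even this simple comparison needs special cases for NaN and signed zero; arithmetic such as addition or multiplication adds rounding, subnormals, and overflow handling on top, which gives a feel for why IL authors tend to skip it.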

perror · Answer 2 · 2014-08-02T08:05:00.627

There is excellent recent work on translating floating point instructions to LLVM bitcode: the project is called McSema and is run by people at Trail of Bits.

One of the developers promised to open-source it once the code is in good shape.

EDIT: I just saw the answer from Ed McMan. I totally agree with him that the lack of tools handling this kind of problem makes it quite hard to integrate into a binary program analysis framework. But this is already a consequence of the problem, not a cause.

In fact, in my humble opinion, what makes this problem extremely tedious is its very nature. You have to deal with a continuous problem (logic on floating point numbers) and transform it into a discrete one (propositional logic).

Mixing these two models makes the problem very difficult to handle, because a small difference in the input may end up producing a drastically different output (the bit-vector size may also have a big impact on the output). This kind of behavior is quite close to what you encounter in cryptographic hash functions, where a small modification of the input results in a complete change of the output.

This high variability in the output makes it hard for tools to capture all the behaviors in a meaningful logical formula that could be expressed in propositional logic alongside the other constraints.
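A small illustration of both effects, as a sketch rather than anything tied to a particular IL or solver: the first part shows a one-unit change in an input flipping the result between 0.0 and 2.0 in binary64, and the second shows the same addition giving different answers at binary64 and binary32 width. The helper names (`to_f32`, `add_then_subtract`) are made up for this example.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def add_then_subtract(big: float, small: float) -> float:
    """Mathematically this is just `small`; in floating point it may not be."""
    return (big + small) - big

# Sensitivity to the input: near 1e16 adjacent doubles are 2 apart,
# so adding 1.0 is rounded away while adding 2.0 is kept exactly.
r1 = add_then_subtract(1e16, 1.0)
r2 = add_then_subtract(1e16, 2.0)
print(r1, r2)  # 0.0 2.0

# Sensitivity to the width: 2**24 + 1 is representable in binary64
# but not in binary32, where it rounds back to 2**24.
wide = 16777216.0 + 1.0
narrow = to_f32(to_f32(16777216.0) + 1.0)
print(wide, narrow)  # 16777217.0 16777216.0
```

A formula that models these operations has to encode the rounding step exactly, at the right width, for every operation; treating the values as real numbers would give 1.0 and 2.0 in the first case and lose the behavior the binary actually has.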

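There may be some hope if SMT solvers start mixing the usual QF_AUFBV logic (often used for program simulation) with logics for non-integer arithmetic. Strictly speaking, QF_LRA and QF_NRA cover linear and nonlinear real arithmetic rather than floating point; SMT-LIB also defines a dedicated floating-point theory, used in logics such as QF_FP.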

Yeah, I had forgotten about McSema even though I had seen the slides before. — broadway, Aug 02 '14 at 00:45
Apparently mcsema was opensourced today. https://github.com/trailofbits/mcsema — broadway, Aug 07 '14 at 15:42
PANDA also supports lifting FPU operations to LLVM; it does so by using CLANG to compile the QEMU softfloat helper functions to LLVM bitcode, then linking that bitcode into execution at runtime. — Brendan Dolan-Gavitt, Aug 11 '14 at 18:33
