Does anyone know of any tools or scripts that can help with corpus distillation? I know of Peach Minset, but nothing else. I'd appreciate it if anyone could share.
2 Answers
3
Some time ago I wrote minblox for that exact purpose. It relies on DynamoRIO, whereas minset uses a Pin tool; otherwise there isn't much of a difference, though I think the actual set minimization part runs faster than in minset.
minblox consists of two parts:
- A DynamoRIO instrumentation part (libbbcoverage) tasked with recording all basic blocks executed while the application runs.
- minblox.py - a Python script that runs the DynamoRIO instrumentation and analyzes the log files to minimize the sample set.
Do bear in mind, though, that I've only tested this for the specific case I needed it for, so your mileage may vary.
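To illustrate the minimization step in general terms, here is a minimal sketch of a greedy set cover over per-sample basic block sets. This is not the actual minblox.py code; the function name `minimize_corpus` and the input layout (a dict mapping sample name to a set of block identifiers) are assumptions made for the example.

```python
def minimize_corpus(coverage):
    """Greedy set cover over basic block coverage.

    coverage: dict mapping a sample name to the set of basic blocks it hit
              (hypothetical layout, chosen for this sketch).
    Returns a list of sample names that together cover every observed block.
    """
    # Every block seen by at least one sample still needs to be covered.
    remaining = set().union(*coverage.values()) if coverage else set()
    selected = []
    while remaining:
        # Pick the sample that covers the most still-uncovered blocks.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda name: len(coverage[name] & remaining))
        selected.append(best)
        remaining -= coverage[best]
    return selected


if __name__ == "__main__":
    demo = {
        "a.pdf": {1, 2, 3},
        "b.pdf": {2, 3},
        "c.pdf": {3, 4},
    }
    print(minimize_corpus(demo))
```

Greedy selection does not guarantee the smallest possible set, but it is simple and usually good enough for trimming a fuzzing corpus.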

0xea
- Why didn't you use the DrCov tool and then parse the (text format) log files? Just out of curiosity. – joxeankoret Jun 09 '14 at 19:18
- Sweet, thanks 0xea! Will definitely have a look at that :) – d123 Jun 09 '14 at 19:21
- What about using Valgrind with Callgrind to look at basic blocks? Have either of you tried this method? – d123 Jun 09 '14 at 19:26
- @joxeankoret No particular reason, I wrote it as a test... – 0xea Jun 09 '14 at 20:36
1
As far as I know, there is nothing like this publicly available. However, you can use DrCov to do the same thing Minset does yourself, and even more powerful things. In fact, I commonly use this DynamoRIO tool, DrCov, for exactly this and for other tasks.
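As a rough sketch of the parsing side, the snippet below collects basic blocks from the lines of a text-format coverage log. It assumes each block entry looks like `module[  4]: 0x0000000000001090,  13` (module id, start offset, size); that line shape is an assumption here, so check it against the output of your DynamoRIO version. The helper name `parse_bb_lines` is hypothetical.

```python
import re

# Assumed shape of one basic block entry in the text log:
#   module[  4]: 0x0000000000001090,  13
BB_LINE = re.compile(r"module\[\s*(\d+)\]:\s*0x([0-9a-fA-F]+),\s*(\d+)")


def parse_bb_lines(lines):
    """Return the set of (module_id, start_offset) pairs found in the log lines."""
    blocks = set()
    for line in lines:
        match = BB_LINE.match(line.strip())
        if match:  # header and module table lines simply do not match
            module_id = int(match.group(1))
            offset = int(match.group(2), 16)
            blocks.add((module_id, offset))
    return blocks


if __name__ == "__main__":
    sample = [
        "BB Table: 2 bbs",
        "module[  4]: 0x0000000000001090,  13",
        "module[  0]: 0x0000000000000010,   5",
    ]
    print(sorted(parse_bb_lines(sample)))
```

Module ids may differ between runs, so when comparing samples it is safer to translate each id to its module path using the module table from the same log before merging coverage sets.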

joxeankoret