I've started using BinDiff recently and am struggling to understand its matching algorithm. I've read several sources, including:
T. Dullien and R. Rolles, "Graph-based comparison of executable objects"; "BinSlayer: Accurate Comparison of Binary Executables"; and the BinDiff manual: https://www.zynamics.com/bindiff/manual/#chapUnderstanding
Each of these sources explains the matching algorithm differently.
What I can't understand is the difference between function signatures and function attributes.
From the manual:
The signature consists of:
- Number of codeblocks
- Number of edges between codeblocks
- Number of calls to subfunctions
Once the two sets of signatures (for the two executables) have been generated, initial matches are created. A match is created if a signature occurs once (and only once) in both examined subsets of signatures.
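To make sure I'm reading the quoted rule correctly, here is a minimal Python sketch of how I understand it. It is not BinDiff's code: the function names, the tuple layout (codeblocks, edges, calls), and the sample values are all made up for illustration.

```python
from collections import defaultdict


def unique_signature_matches(primary, secondary):
    """Pair functions whose signature occurs exactly once in each binary.

    primary / secondary: dict mapping a function name to a hypothetical
    signature tuple (codeblocks, edges, calls).
    """
    def by_signature(funcs):
        index = defaultdict(list)
        for name, sig in funcs.items():
            index[sig].append(name)
        return index

    idx_a = by_signature(primary)
    idx_b = by_signature(secondary)

    matches = {}
    for sig, names_a in idx_a.items():
        names_b = idx_b.get(sig, [])
        # "once (and only once) in both examined subsets"
        if len(names_a) == 1 and len(names_b) == 1:
            matches[names_a[0]] = names_b[0]
    return matches


# Made-up example data.
primary = {
    "sub_401000": (5, 6, 2),
    "sub_401100": (3, 2, 1),
    "sub_401200": (3, 2, 1),  # same signature as sub_401100, so ambiguous
}
secondary = {
    "sub_402000": (5, 6, 2),
    "sub_402100": (3, 2, 1),
    "sub_402200": (7, 9, 4),
}
initial = unique_signature_matches(primary, secondary)
```

Under this reading only `sub_401000` gets an initial match, because `(3, 2, 1)` appears twice in the first binary and `(7, 9, 4)` appears only in the second.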
So function signatures are used to construct the initial matches. But after that, the manual goes on to describe function attributes.
Attributes:
BinDiff has a list of function attributes (hash matching, name matching, etc.) suitable for generating matches. It starts on a global level, considering all functions of the binary and calculates the first attribute for every function. After the initial global matching step the parents (callers) and children (callees) of each new match are considered
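Here is one possible reading of that paragraph as a Python sketch. This is my own guess at the control flow, not BinDiff's implementation: the attribute keys (`"hash"`, `"name"`), the call-graph dictionaries, and the choice to try every attribute on the callees are assumptions, and callers would be handled symmetrically (omitted for brevity).

```python
def match_unique(cands_a, cands_b, attrs_a, attrs_b, key):
    """Pair candidates whose value for attribute `key` is unique on both sides."""
    index_a, index_b = {}, {}
    for f in cands_a:
        index_a.setdefault(attrs_a[f][key], []).append(f)
    for f in cands_b:
        index_b.setdefault(attrs_b[f][key], []).append(f)
    return {
        fa[0]: index_b[value][0]
        for value, fa in index_a.items()
        if len(fa) == 1 and len(index_b.get(value, [])) == 1
    }


def cascade(attrs_a, attrs_b, calls_a, calls_b, keys):
    """Assumed flow: one global pass, then local passes over callees of new matches."""
    matches = {}
    # Global step: first attribute, computed for every function.
    new = match_unique(list(attrs_a), list(attrs_b), attrs_a, attrs_b, keys[0])
    matches.update(new)
    worklist = list(new.items())
    # Local step: only the still-unmatched callees of a fresh match compete.
    while worklist:
        fa, fb = worklist.pop()
        for key in keys:
            ca = [f for f in calls_a.get(fa, []) if f not in matches]
            cb = [f for f in calls_b.get(fb, []) if f not in matches.values()]
            new = match_unique(ca, cb, attrs_a, attrs_b, key)
            matches.update(new)
            worklist.extend(new.items())
    return matches


# Made-up example data.
attrs_a = {
    "main": {"hash": "h1", "name": "main"},
    "f": {"hash": "h2", "name": "sub_1"},
    "g": {"hash": "h3", "name": "sub_2"},
}
attrs_b = {
    "main2": {"hash": "h1", "name": "main"},
    "f2": {"hash": "hX", "name": "sub_1"},
    "g2": {"hash": "hY", "name": "sub_9"},
}
calls_a = {"main": ["f", "g"]}
calls_b = {"main2": ["f2", "g2"]}

result = cascade(attrs_a, attrs_b, calls_a, calls_b, ["hash", "name"])
```

In this toy run the global hash pass matches only `main` with `main2`; the local pass over their callees then matches `f` with `f2` by name, and `g` stays unmatched.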
So how do signatures differ from attributes? Both appear to be used to construct the initial matches. Which strategy is applied first to construct an initial match: signature matching or function byte-hash matching?