-6

What is data interleaving? Is this something I can use to obfuscate collections of variables?

perror
dyasta
  • 3
    if you do (ask and answer yourself), it should at least be a question that others are likely to ask too - I've never heard of Data Interleaving until now. otherwise we can't expect anyone else to answer, which is not why 'answering your own question' is encouraged. – Ange Apr 14 '13 at 19:53
  • 1
    Can you provide an example of a protection that uses such a technique? – Rolf Rolles Apr 14 '13 at 20:42
  • 1
    I may be wrong, but shouldn't something you just thought and may have applications in the field belong in a blog post, or in certain cases a technical paper rather than a Q&A here ? – asheeshr Apr 15 '13 at 04:03
  • @AshRJ: I just threw it out there. I figured you guys would tell me how dumb it was.

    If we are pedantic about what gets posted around here, I think you'll see this community wither and die. It's not a large enough subject to fill a SE site, so I believe we should encourage anything related.

    – dyasta Apr 15 '13 at 06:17
  • @Rolf: Nobody has used it to my knowledge. You can imagine how it could be useful though, particularly in obfuscation of stored data structures, and/or in-memory data structures. I have the base code ready, may use it someday if I ever have anything worth protecting. – dyasta Apr 15 '13 at 06:18
  • @ange: Yea, but do we really need to be that strict? Is this really that inappropriate? Oh well. – dyasta Apr 15 '13 at 06:19
  • 1
    I see your point, but if it's a new idea not seen anywhere else before, it's 'too early' for it for QA (which is the format here), and @RolfRolles is right, if there is not even a PoC somewhere, there's hardly no experience about it.. for this, I'd write a blog post somewhere or present in conference, submit to reddit, let people learn first about it. then individual problems related to that idea could be discussed here. SE is not for any discussion, but more for problem solving (I know, the interwebs can't make things simply). – Ange Apr 15 '13 at 06:25
  • 1
    I liked your other post, the question and its solution. But if you're describing something that is not used anywhere in the wild to your knowledge, then this could not possibly correspond to a question that somebody would have about reverse engineering. I'd write a blog about this and submit it to the reverse engineering reddit rather than post original research on StackExchange. – Rolf Rolles Apr 15 '13 at 06:28
  • I agree guys. I threw in on a blog, and renamed it Binary Interleaving. I will likely leave it at that, doubt it will go any further. It's not exactly a huge revelation, though people do patent even more obvious ideas every day :o – dyasta Apr 15 '13 at 06:34
  • Actually, "data interleaving" is a term that might be found in some obfuscation techniques. It can be used either to describe the interleaving of data and code, or to describe the fact to mix (at a-bit level) the data of two (or more) variables and perform operations on it. It can be found in white-box crypto techniques. I think, I can write an answer that might be interesting. – perror Apr 15 '13 at 08:43
  • I'd be interested to read your answer perror! – dyasta Apr 15 '13 at 15:23
  • See previous question with answer. http://reverseengineering.stackexchange.com/questions/77/is-there-any-way-to-decompile-a-net-assembly-or-program – APerson Apr 16 '13 at 13:41
  • 2
    You guys really need to vote this down 6 times? Is it that offensive? I think I could post some pr0n and get less votes down – dyasta Apr 16 '13 at 20:05
  • @90h Posting spam gets a much bigger penalty than just a few downvotes – asheeshr May 04 '13 at 02:11

1 Answer

1

Data Interleaving is a term I made up for an idea I've been contemplating lately: interleaving the bits of a set of variables into a single binary blob. Any number and type of variables can be interleaved together. Access to variables in the interleaved blob can even be on demand, with a controller class encoding or decoding variables on the fly.

What in the World?

Data Interleaving is the process of translating any number of variables to a single binary blob by interleaving the bits of the variables. This obfuscates the variables in memory or external storage. The entire blob need not be decoded to access member variables, though it can be for improved performance.

Why?

This complicates reverse engineering of code. In particular, it makes data types and variable boundaries harder to identify. Plaintext is also obscured, since a string's bits end up scattered among the bits of the other members.

Interleave Map

The variables to be encoded could be defined by an array of byte sizes of those variables and, optionally, pointers to a location in memory to retrieve or store their reconstituted form. In the case of on-demand access to an interleaved blob, individual variables can be decoded and re-encoded on the fly, so buffers for reconstituted storage are optional (though they may be temporarily reconstituted by the controller class as members are modified).

The members of the bitwise interleave can be referenced in the source code via their indices. For instance, index 0 may be MY_VARIABLE_INSTANCE. By passing the variable index to an interleave blob controller class, it knows the size and, optionally, a pointer for constituted storage.

Member data types can be anything. They need not be similar. When one variable ends, it is simply ended. See a few paragraphs below for what happens when a single variable is longer than the others.

/* Member information: */
/*  an optional pointer to its normal, constituted storage location */
/*  (for use in encoding and decoding the member), */
/*  and the size of the member in bytes. */
/* A struct, so the map can be brace-initialized. */
struct CInterleaveMember
{
  void *pvConstitutedStore;
  unsigned long nMemberByteSize;
};

/* INTERLEAVE MAP */
CInterleaveMember aInterleaveMap[] =
{
  { szSomeString, sizeof(szSomeString) },
  { &nIntegerMan, sizeof(nIntegerMan) },
  { &cMyClass, sizeof(cMyClass) }
};

void *pBLOB;  /* interleaved data stored in a dynamically allocated blob */

The total size of the blob need not be stored, as it is the sum of all member sizes in the interleave map. The interleave map provides everything we need to know.
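As a minimal sketch of that point, the blob size can be recomputed from the map whenever it is needed. The names `InterleaveMember` and `BlobByteSize` are hypothetical stand-ins for illustration, not part of any existing library:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical mirror of CInterleaveMember, written as a plain struct.
struct InterleaveMember {
    void*       constitutedStore;  // optional; may be nullptr
    std::size_t byteSize;          // size of the member in bytes
};

// Blob size in bytes: the sum of all member sizes in the map.
std::size_t BlobByteSize(const std::vector<InterleaveMember>& map) {
    std::size_t total = 0;
    for (const InterleaveMember& m : map) total += m.byteSize;
    return total;
}
```

This only holds while no padding or redundant data is added to the blob; a scheme that pads short members would need a different size rule.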

The Process

In case it is not clear, the process for the interleave would go something like this: The array of members is 'walked', putting or getting the current bit index from each member variable, advancing to the next bit index after the entire array has been walked. When a member variable is full of bits (exhausted), it is skipped in subsequent interleave iterations (more on long vars later).

For simplicity, let me define a few variables in bits only (not matching above):

szSomeString 0 1 1 1 0 0 1 0 
nIntegerMan  1 1 1 0 0 0 1 1 1 0 0 1 0 0 0 1
cMyClass     0 0 0 1

For the interleave, a bit is taken from each variable in succession.

First iteration of the interleave, get first bit from each ...
 0 1 0
Next iteration(s), get the next bit from each ...
 0 1 0 1 1 0
 0 1 0 1 1 0 1 1 0
 ...
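The walk described above could be sketched roughly as follows. This is one possible reading, not a finished implementation: it assumes whole-byte members, numbers bits from the most significant bit of the first byte (to match the left-to-right listing above), and uses hypothetical names (`Bytes`, `Walk`, `Interleave`, `Deinterleave`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Read bit `i` of `b`, counting from the most significant bit of byte 0.
inline bool GetBit(const Bytes& b, std::size_t i) {
    return ((b[i / 8] >> (7 - i % 8)) & 1u) != 0;
}

// Write bit `i` of `b`, using the same numbering.
inline void SetBit(Bytes& b, std::size_t i, bool v) {
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << (7 - i % 8));
    if (v) b[i / 8] |= mask;
    else   b[i / 8] &= static_cast<std::uint8_t>(~mask);
}

// Walk the members round-robin, one bit each per pass, skipping members
// that are exhausted. encode == true copies members -> blob, else blob -> members.
void Walk(std::vector<Bytes>& members, Bytes& blob, bool encode) {
    std::size_t longestBits = 0;
    for (const Bytes& m : members)
        if (m.size() * 8 > longestBits) longestBits = m.size() * 8;

    std::size_t out = 0;  // current bit position in the blob
    for (std::size_t bit = 0; bit < longestBits; ++bit) {
        for (Bytes& m : members) {
            if (bit >= m.size() * 8) continue;  // member exhausted
            if (encode) SetBit(blob, out, GetBit(m, bit));
            else        SetBit(m, bit, GetBit(blob, out));
            ++out;
        }
    }
}

Bytes Interleave(std::vector<Bytes> members) {
    std::size_t total = 0;
    for (const Bytes& m : members) total += m.size();
    Bytes blob(total, 0);
    Walk(members, blob, true);
    return blob;
}

// `sizes` plays the role of the interleave map: one byte size per member.
std::vector<Bytes> Deinterleave(Bytes blob, const std::vector<std::size_t>& sizes) {
    std::size_t total = 0;
    std::vector<Bytes> members;
    for (std::size_t s : sizes) { members.push_back(Bytes(s, 0)); total += s; }
    assert(total == blob.size());  // the map must describe the whole blob
    Walk(members, blob, false);
    return members;
}
```

Calling `Deinterleave(Interleave(members), sizes)` with matching sizes returns the original members. Sub-byte members, such as the 4-bit cMyClass in the listing above, would need bit sizes in the map instead of byte sizes.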

When a Member is Longer than the Others

In the case where one variable is much longer than the others, thus having no pair to encode with, one could use a simple XOR, and/or toss in redundant, unused data from the prior members. Any number of strategies are possible to prevent plaintext storage in the case of an abnormally long variable not having an interleave partner for its ending bits.
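One such strategy, sketched under assumptions: before interleaving, XOR the unpaired tail of the long member with bytes of a shorter member, reused cyclically. The function name `MaskTail` and the choice of the partner's bytes as the mask are illustrative only, and this is light obfuscation, not encryption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// XOR every byte of `longMember` from index `pairedBytes` onward with the
// bytes of `partner`, reused cyclically. Applying it twice with the same
// arguments restores the original bytes.
void MaskTail(Bytes& longMember, std::size_t pairedBytes, const Bytes& partner) {
    if (partner.empty()) return;  // nothing to mask with
    for (std::size_t i = pairedBytes; i < longMember.size(); ++i)
        longMember[i] ^= partner[(i - pairedBytes) % partner.size()];
}
```

On decode, the shorter member is recovered in full first, then `MaskTail` is applied again to the long member to undo the mask. The blob size stays equal to the sum of the member sizes, which padding with redundant data would not preserve.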

Sample Code

For example, the following pseudo-code sketches how such a controller class might be declared and used:

/* PROTECTED VARIABLES */
/* These get stored in a bitwise interleave in the binary blob */
char szSomeString[] = "Is there anybody out there?";
unsigned long nIntegerMan = 0x9090; 
MyClass cMyClass("whoopie");

/* A struct, so the map can be brace-initialized */
struct CInterleaveMember
{
  void *pvConstitutedStore;
  unsigned long nMemberByteSize;
};

/* INTERLEAVE MAP */
CInterleaveMember aInterleaveMap[] =
{
  { szSomeString, sizeof(szSomeString) },
  { &nIntegerMan, sizeof(nIntegerMan) },
  { &cMyClass, sizeof(cMyClass) }
};

/* NOTE: Total size of the resultant bitwise interleave is the sum of the member sizes in the interleave map */

/* INTERLEAVE REFS */
typedef enum 
{
  _szSomeString=0,
  _nIntegerMan,
  _cMyClass,
} InterleavedVariables;

void *pBinaryBlob;  /* dynamically allocated blob storage */

/* Fictional class constructor, passing the interleave map to it */
/* From the interleave map, it can calculate the total blob size, */
/* then dynamically allocate storage for the blob. */
CBitInterleaver cBitInterleave(aInterleaveMap);

/* If the blob is loaded from or saved to external storage, we */
/* may need access to the blob buffer. Fictional example: */
/* We know the blob size from the map! The input size is for safety. */
cBitInterleave.SetBlob(pIncomingBlob, nSrcBufferSize);  

/* Or we can get the blob */
nBlobSize=cBitInterleave.GetBlob(ppOutgoingBlob);

/* Example to encode or decode the entire blob to constituted */
/* storage. We already provided the map, and it decodes or encodes */
/* to the listed pointers. */
cBitInterleave.EncodeBlob();
cBitInterleave.DecodeBlob();

/* Example call to decode a member of the array */
/* We pass it the INDEX into the MAP, and dest buffer */
/* From the Index of _nIntegerMan, we ALREADY know the size */
/* The out size is for safety. */
cBitInterleave.GetVariable(_nIntegerMan, &nIntegerMan, sizeof(nIntegerMan));

/* OR we can use the default storage address in the interleave map */      
cBitInterleave.GetVariable(_nIntegerMan);

/* Example call to encode a member of the array */
/* We pass it the INDEX into the MAP, and input reference */
cBitInterleave.SetVariable(_szSomeString, &szSomeString, sizeof(szSomeString));

/* And so on... I'm literally coding this in this answer, like a fool */
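For the on-demand case, a call like GetVariable does not have to decode the whole blob: the position of any member bit can be computed from the member sizes alone. The sketch below assumes the round-robin walk that skips exhausted members, whole-byte members, and most-significant-bit-first numbering. `BlobBitIndex` and `GetMember` are hypothetical names, not the fictional class's real interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Position in the blob of bit `k` of member `idx`.
// `sizes` holds each member's size in bytes, in map order.
std::size_t BlobBitIndex(const std::vector<std::size_t>& sizes,
                         std::size_t idx, std::size_t k) {
    std::size_t pos = 0;
    for (std::size_t j = 0; j < sizes.size(); ++j) {
        pos += std::min(sizes[j] * 8, k);        // bits emitted in earlier passes
        if (j < idx && sizes[j] * 8 > k) ++pos;  // earlier members in this pass
    }
    return pos;
}

// Decode a single member straight out of the blob.
Bytes GetMember(const Bytes& blob, const std::vector<std::size_t>& sizes,
                std::size_t idx) {
    Bytes out(sizes[idx], 0);
    for (std::size_t k = 0; k < sizes[idx] * 8; ++k) {
        const std::size_t p = BlobBitIndex(sizes, idx, k);
        const bool bit = ((blob[p / 8] >> (7 - p % 8)) & 1u) != 0;
        if (bit) out[k / 8] |= static_cast<std::uint8_t>(1u << (7 - k % 8));
    }
    return out;
}
```

Each bit lookup costs a pass over the map, so decoding the whole blob at once is cheaper when most members are needed. That is the performance trade-off mentioned at the top of this answer.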
dyasta
  • And why vote the answer down? You should have been accountants if you enjoy classification, lol. It discourages anything to be posted, off topic or not. – dyasta Apr 16 '13 at 20:12