
I have binary data representing a table.

Here's the data when I print it with Python's repr(): \xff\xff\x05\x04test\x02A\x05test1@\x04\x03@@\x04\x05@0\x00\x00@\x05\x05test2\x03\x05\x05test1\x06@0\x00\x01@\x00

Here's what the table looks like in the proprietary software.

        test1        
        test1test1
test          test1
test1                
test1test2        
                        
                        
test1                
test1                
test1                
        test1        
        test1        
                        
                        
test1                
test1                

I was able to guess some of it:

  • The data is stored column by column, then cell by cell, starting at the top-left cell.
  • The \x04 in \x04test seems to be the length (in bytes, I guess) of the following word.
  • @ seems to mean "repeat the last value".
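
To illustrate the length-prefix guess, here is a minimal Python 3 sketch (the exploration script further down is Python 2). The helper name read_length_prefixed is made up for illustration, and it assumes a single unsigned length byte followed by ASCII text, which is only a guess about the format.

```python
# The sample data from this question as a Python 3 bytes literal.
DATA = (b"\xff\xff\x05\x04test\x02A\x05test1@\x04\x03@@\x04\x05@0\x00\x00@"
        b"\x05\x05test2\x03\x05\x05test1\x06@0\x00\x01@\x00")


def read_length_prefixed(buf, pos):
    """Read one length-prefixed string from buf, starting at offset pos.

    Assumption: one unsigned length byte, then that many ASCII bytes.
    Returns (text, offset_of_the_byte_after_the_string).
    """
    length = buf[pos]          # indexing bytes gives an int in Python 3
    start = pos + 1
    end = start + length
    if end > len(buf):
        raise ValueError("length byte points past the end of the data")
    return buf[start:end].decode("ascii"), end


# Offset 3 holds \x04, followed by the four bytes of "test".
word, next_pos = read_length_prefixed(DATA, 3)
print(word, next_pos)
```

The same call at offset 10 reads the \x05 that precedes "test1", which is consistent with the guess but does not prove it.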

Does anyone know whether the data follows a standard, or have any tips on how to decode it?

Thanks!

Here's an example in Python:

from struct import unpack


def DecodeData(position):
    # Print the byte at `position` and return its value as an unsigned integer.
    print "position", position
    firstChar = data[position:position + 1]
    size_in_bytes = unpack('B', firstChar)[0]
    print "firstChar: {0}. size_in_bytes: {1}".format(repr(firstChar), size_in_bytes)
    return size_in_bytes


def ReadWord(position, size_in_bytes):
    # Print the word of `size_in_bytes` bytes that starts at `position`.
    word = unpack('%ds' % size_in_bytes, data[position:position + size_in_bytes])[0]
    print "word:", word

data = "\xff\xff\x05\x04test\x02A\x05test1@\x04\x03@@\x04\x05@0\x00\x00@\x05\x05test2\x03\x05\x05test1\x06@0\x00\x01@\x00"

position = 0

print ""
DecodeData(position)
print "\\xff - ?"

print ""
position += 1
DecodeData(position)
print "\\xff - ?"

print ""
position += 1
DecodeData(position)
print "\\x05 - ?"

print ""
position += 1
size_in_bytes = DecodeData(position)
position += 1
ReadWord(position, size_in_bytes)


print ""
position += size_in_bytes
DecodeData(position)
position += 1
DecodeData(position)
print """'2A' : could be to say that "test" has 2 empty cells before it"""

print ""
position += 1
size_in_bytes = DecodeData(position)
position += 1
word = unpack('%ds' % size_in_bytes, data[position:][:size_in_bytes])[0]
print "word:", word

position += size_in_bytes

DecodeData(position)
print """@: mean that there's another "test1" cell"""

print ""
position += 1
DecodeData(position)
position += 1
DecodeData(position)
print "\\x04\\x03 - Could be that the next value is 3 cells down"

print ""
position += 1
DecodeData(position)
position += 1
DecodeData(position)
print "@@ - Seems to mean 3 repetitions"

print ""
position += 1
DecodeData(position)
position += 1
DecodeData(position)
print "\\x04\\x05 - Could be that the next value is 5 cells down"

print ""
position += 1
DecodeData(position)
print "@ - repetition"

print ""
position += 1
DecodeData(position)

print ""
position += 1
DecodeData(position)
position += 1
DecodeData(position)
print "\\x00\\x00 - That could mean to move to the first cell on the next column"

print ""
position += 1
DecodeData(position)
print "@ - repetition"

print ""
position += 1
DecodeData(position)
print "\\x05 - ?"

print ""
position += 1
size_in_bytes = DecodeData(position)
position += 1
word = unpack('%ds' % size_in_bytes, data[position:][:size_in_bytes])[0]
print "word:", word
position += size_in_bytes

print ""
DecodeData(position)
print "\\x03 - Could be to tell that the previous word 'test2' is 3 cells down"

print ""
position += 1
DecodeData(position)
print "\\x05 - ?"

print ""
position += 1
size_in_bytes = DecodeData(position)
position += 1
word = unpack('%ds' % size_in_bytes, data[position:][:size_in_bytes])[0]
print "word:", word
position += size_in_bytes

print ""
DecodeData(position)
print "\\x06 - Could be to tell that the previous word 'test1' is 6 cells down"

print ""
position += 1
DecodeData(position)
print "@ - repetition"

print ""
position += 1
DecodeData(position)
print "0 (0x30) - ?"

print ""
position += 1
DecodeData(position)
position += 1
DecodeData(position)
print "\\x00\\x01 - Seems to mean, next column second cell"

print ""
position += 1
DecodeData(position)
print "@ - repetition"

print ""
position += 1
DecodeData(position)
print "\\x00 - end of data or column"
bbigras
  • Do you have the module itself? It would be almost trivial to disassemble the repr function, assuming this is registered the normal way it is in C extension modules. – 0xC0000022L Apr 30 '13 at 20:51
  • I'm not sure I understand what you mean by "module". But here's the data in HEX 0xFFFF050474657374024105746573743140040340400405403000004005057465737432030505746573743106403000014000. I'm using repr() only to get rid of the 'Decode error - output not utf-8' message in python so you can ignore that. – bbigras Apr 30 '13 at 21:05
  • The data belongs to some kind of object and that usually belongs to a module, such as the ones you import in Python ;) – 0xC0000022L Apr 30 '13 at 21:07
  • The data is from a varbinary field in a MSSQL database which is used by a proprietary and uncooperative software. – bbigras Apr 30 '13 at 21:10
  • It would make sense to have the software that processes the data. See here. Basically too little info to help you. – 0xC0000022L Apr 30 '13 at 21:12
  • I don't have the code of the proprietary software but I added some python code that I use to try to figure out the structure. – bbigras May 01 '13 at 18:02
  • It would be extremely helpful to this newbie (me) if a few different examples of the data could be provided. – Sam Axe May 02 '13 at 18:41

1 Answer


Here's an explanation of what I think the individual symbols mean. I'm basing this on the assumption that a little selector goes through the cells one by one.

  • \xFF = Null cell.
  • \x05 = A string follows, with \xNumber coming after the string to define how far to displace the string from the selector's current position, if at all.
  • \xNumber string = A string of length Number.
  • \x02 A (the bytes 0x02 0x41) = Could be a marker that says not to displace the current string, and also that the next piece of data defines a string to be placed in the next cell. Questionable meaning.
  • \x04 \xNumber = Move the selector ahead \xNumber cells and place the previous string there.
  • 0 \x00 \xNumber = New column: move the selector to row \xNumber and place the previous string there.
  • @ = Place the previously used string in the cell following the current one.
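
The symbol list above can be turned into a rough tokenizer. This is only a Python 3 sketch of that interpretation, not a confirmed description of the format: the token names and helper functions are invented for illustration, and it assumes that the two bytes 0x02 0x41 after "test" (printed as \x02A in the repr) act as the "do not displace" marker.

```python
DATA = (b"\xff\xff\x05\x04test\x02A\x05test1@\x04\x03@@\x04\x05@0\x00\x00@"
        b"\x05\x05test2\x03\x05\x05test1\x06@0\x00\x01@\x00")


def read_string(buf, pos):
    """Read a one-byte length followed by that many ASCII bytes."""
    length = buf[pos]
    end = pos + 1 + length
    return buf[pos + 1:end].decode("ascii"), end


def tokenize(buf):
    """Split buf into (name, value) tokens, following the symbol list."""
    tokens = []
    pos = 0
    while pos < len(buf):
        byte = buf[pos]
        if byte == 0xFF:                        # null cell
            tokens.append(("null", None))
            pos += 1
        elif byte == 0x05:                      # string record
            text, pos = read_string(buf, pos + 1)
            if buf[pos:pos + 2] == b"\x02A":    # assumed "no displacement" marker
                tokens.append(("string", (text, 1)))
                # Under this reading, a second string follows the marker directly.
                text, pos = read_string(buf, pos + 2)
                tokens.append(("string", (text, 1)))
            else:                               # one displacement byte
                tokens.append(("string", (text, buf[pos])))
                pos += 1
        elif byte == 0x04:                      # move ahead N cells, repeat string
            tokens.append(("skip_repeat", buf[pos + 1]))
            pos += 2
        elif byte == 0x40:                      # '@': repeat string in next cell
            tokens.append(("repeat", None))
            pos += 1
        elif byte == 0x30:                      # '0', 0x00, row: new column
            tokens.append(("new_column", buf[pos + 2]))
            pos += 3
        elif byte == 0x00:                      # end of data
            tokens.append(("end", None))
            break
        else:
            raise ValueError("unknown byte 0x%02x at offset %d" % (byte, pos))
    return tokens


tokens = tokenize(DATA)
for name, value in tokens:
    print(name, value)
```

Every byte of the sample is consumed without hitting the "unknown byte" branch, which shows the reading is at least self-consistent for this one sample. More samples would be needed to trust it.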

So here's my interpretation of the data you're giving us:

  • \xFF\xFF = two null cells
  • \x05 = A single cell with a string, placed right after the null cells because of the \x02 A following the string.
  • \x04 test = The string.
  • \x02 A \x05 test1 = Another string placed into the following cell. No number is needed, since \x02 A implies that it's placed right after "test".
  • @ = Place "test1" into the cell after the one where "test1" was first placed.
  • \x04 \x03 = Move the selector ahead three cells and place test1 where it lands.
  • @@ = Place it into the two following cells as well.
  • \x04 \x05 @ = Skip four cells, then place it into two cells.
  • 0 = New column.
  • \x00 \x00 @ = Using the last defined string (test1), place it into the first two cells of the column.
  • \x05 \x05 test2 \x03 = Place test2 three cells afterwards.
  • \x05 \x05 test1 \x06 = Place test1 six cells after test2.
  • @ = Place test1 again in the next cell.
  • 0 = Move to the next column.
  • \x00 \x01 = Place the previous string at location 01.
  • @ = And also at location 02.
  • \x00 = Done
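
Replaying this walkthrough in code gives a way to check it against the table. The Python 3 sketch below is hypothetical: the function names are invented, the selector is assumed to start just before the first cell of the first column, and the bytes 0x02 0x41 after "test" (printed as \x02A in the repr) are assumed to mean "place in the next cell, and another string follows".

```python
DATA = (b"\xff\xff\x05\x04test\x02A\x05test1@\x04\x03@@\x04\x05@0\x00\x00@"
        b"\x05\x05test2\x03\x05\x05test1\x06@0\x00\x01@\x00")


def read_string(buf, pos):
    """Read a one-byte length followed by that many ASCII bytes."""
    length = buf[pos]
    end = pos + 1 + length
    return buf[pos + 1:end].decode("ascii"), end


def decode(buf):
    """Return {(column, row): text} by replaying the walkthrough."""
    cells = {}
    col, row = 0, -1       # assumed start: just before the first cell
    last = None            # most recently defined string
    pos = 0
    while pos < len(buf):
        byte = buf[pos]
        if byte == 0xFF:                        # null cell: advance only
            row += 1
            pos += 1
        elif byte == 0x05:                      # string record
            last, pos = read_string(buf, pos + 1)
            if buf[pos:pos + 2] == b"\x02A":    # assumed marker
                row += 1
                cells[(col, row)] = last
                last, pos = read_string(buf, pos + 2)
                row += 1
            else:                               # displacement byte
                row += buf[pos]
                pos += 1
            cells[(col, row)] = last
        elif byte == 0x04:                      # move ahead N cells and repeat
            row += buf[pos + 1]
            cells[(col, row)] = last
            pos += 2
        elif byte == 0x40:                      # '@': repeat in the next cell
            row += 1
            cells[(col, row)] = last
            pos += 1
        elif byte == 0x30:                      # '0', 0x00, row: new column
            col += 1
            row = buf[pos + 2]
            cells[(col, row)] = last
            pos += 3
        elif byte == 0x00:                      # end of data
            break
        else:
            raise ValueError("unknown byte 0x%02x at offset %d" % (byte, pos))
    return cells


cells = decode(DATA)
for r in range(16):
    print(" | ".join(cells.get((c, r), "").ljust(5) for c in range(3)))
```

As far as the whitespace in the question's listing can be trusted, the decoded cells land in the same rows and columns as in the proprietary software. The dictionary does not distinguish null cells from empty ones, since the walkthrough gives no way to tell them apart beyond the leading \xFF bytes.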

Explanation: My method was to look for a pattern, check whether it withstood further scrutiny (the first pattern I checked seemed to), and clear up any minor issues I had with it. It seems to have worked.

0xC0000022L
user336462