10

While reading an answer to another question, it was mentioned that "78 9C" was a well-known pattern for Zlib compressed data. Intrigued, I decided to search up the signature on the file signature database to see if there were any related numbers. It wasn't on there. So I checked on Gary Kessler's magic number list to see that it wasn't there either.

I even ended up creating a binary file with the signature at the beginning and ran "file" on it as a sort of "I-doubt-it-will-work-but-maybe" attempt (Since that works with "50 4b" because that is a valid ZIP file header and is commonly in the middle of other files.) But none of these attempts revealed that I was looking at a Zlib signature.

It would appear as though most magic number databases only contain file-format magic numbers rather than numbers to differentiate data in the middle of a file. So, my question is:

Are there any places one could find a list of binary signatures of certain types of data streams that are not file formats themselves? Data that is not a file itself, but rather inside a file.

Thanks in advance.

Archenoth
  • 1,475
  • 13
  • 17
  • 1
    Only FYI: the sequence 78 9C in itself is not magic -- if it was, it would be a fixed signature. The first two bytes of a ZLib compressed file contain flags whose settings are needed for a correct decompression; and certain configurations are more common than others. See http://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like for 3 of the most common, and RFC1950 for their meaning. I'd have to re-read the RFC but I think these 2 bytes can have just about any value, and still be a valid ZLib header. – Jongware Nov 02 '13 at 13:09
  • 1
    (Add.) Consider a 'directory' kind of file, where each first long word indicate the length of the next raw chunk. Easy to spot for a human, but hard for a computer (unless specifically told to). – Jongware Nov 02 '13 at 13:11
  • Aye... I've taken a look at the specification and there appears to be very few invalid values for the following bytes. And indeed, I don't really mean to refer to the values as magic numbers since those reference file formats, which is the reason I called them "binary signatures". (Though I am not %100 sure that is correct either.) Some signatures can be magic numbers though, which was why I used magic number databases for my initial checking.

    Also that directory-style file is actually a really good example. Kudos..!

    – Archenoth Nov 02 '13 at 13:13
  • 3
    Full set of possible zlib stream headers: https://groups.google.com/d/msg/comp.compression/_y2Wwn_Vq_E/EymIVcQ52cEJ – Igor Skochinsky Nov 02 '13 at 15:49
  • Thanks, Igor, for that link -- always nice to see an answer from a Definitive Authority. Note his comment "..You would follow this with an attempted decompression.." , in other words: "the proof is in the pudding". If one finds any of these magic pairs, how many bytes would one need to decode to be more than a bit sure? (That is, apart from "all of them". ;-) – Jongware Nov 02 '13 at 18:35
  • That's all kinds of nifty. Will need to read up on it s bit. (And also subscribe to a new news group by the looks of things.) – Archenoth Nov 02 '13 at 21:07

1 Answers1

6

Are you perhaps looking for binwalk? Especially the magic folder of its source code.

jvoisin
  • 2,516
  • 16
  • 23
asdf
  • 84
  • 1