I am trying to work out the structure of encrypted Office files (I would like to implement something like a carving tool that finds certain encrypted file types). I have tried to read the Office Document Cryptography Structure Specification, but it is very hard to follow, probably because it is so Microsoft-specific. What I have found so far is that when a DOCX file is encrypted, the file header changes from 50 4B 03 04 (the typical DOCX header) to D0 CF 11 E0 (the typical DOC header). Inside the encrypted file I found an XML document describing the algorithm used, the key length, the MACs, etc. (XmlEncryptionDescriptor). Does anybody know the general structure of such a file and how it changes when the document is encrypted?
I hope this question is not too specific.
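
To make the header observation concrete, here is a minimal Python sketch of the check I have in mind. It only uses the two four-byte signatures quoted above; the function names `classify_header` and `find_candidates` and the labels they return are hypothetical, made up for this illustration, and not taken from the specification.

```python
# Signatures as observed above (first four bytes of the file only).
ZIP_MAGIC = bytes.fromhex("504B0304")  # unencrypted DOCX
OLE_MAGIC = bytes.fromhex("D0CF11E0")  # seen after encryption, same as legacy DOC


def classify_header(data: bytes) -> str:
    """Return a rough label for a buffer based on its first four bytes."""
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(OLE_MAGIC):
        return "ole"
    return "unknown"


def find_candidates(image: bytes, magic: bytes = OLE_MAGIC) -> list:
    """Return every offset in a raw image where the signature occurs."""
    offsets = []
    pos = image.find(magic)
    while pos != -1:
        offsets.append(pos)
        pos = image.find(magic, pos + 1)
    return offsets
```

This only tells me where a candidate file might start. Since an encrypted DOCX and an ordinary DOC share the same leading bytes, the signature alone cannot tell them apart, which is why I need to understand the structure that follows it.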