9

How do you reliably extract appended data in a Portable Executable?

Ange
  • 6,694
  • 3
  • 28
  • 62

3 Answers

10

Just to clarify: the appended data, also called the overlay, is the part of a PE file that is not covered by the header, that is, by neither the headers themselves nor any section they describe.


Because of some tricky corner cases in the PE file format, the start of the appended data can be difficult to determine in extreme cases, so it is better to rely on a robust library such as pefile.

Here is a simple Python script that relies on pefile to extract the appended data:

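import sys

import pefile

filename = sys.argv[1]

pe = pefile.PE(filename)
# pefile returns None here when the file has no appended data.
offset = pe.get_overlay_data_start_offset()
if offset is None:
    sys.exit("no appended data found in " + filename)

# Read the whole file and keep everything from the overlay offset onward.
with open(filename, "rb") as s:
    r = s.read()

with open(filename + ".app", "wb") as t:
    t.write(r[offset:])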

Run it with your file name as the argument; the appended data is written to a file with the same name plus ".app".


Personally, I usually do it with Hiew - as Hiew is faster to start than anything else, and also PE-robust:

  1. go to the start of the appended data
    1. in Hex or ASM mode, press F8 to view PE header information
    2. Alt+F2 to go to the start of the appended data
  2. select until the end
    1. Keypad-* to start selection
    2. Ctrl+End to go to the bottom of the file
    3. Keypad-* again to finish selection
  3. F2 to write selection to file
Ange
6

It is not that hard to do manually.

  1. Find the PE header from the MZ header and determine the location of the section table.
  2. Walk the section table and determine the maximum value of PointerToRawData + SizeOfRawData. (Note: these values are subject to alignment using the FileAlignment member of IMAGE_OPTIONAL_HEADER.)

  3. Use the determined maximum as the file offset to the overlay data.
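
The three steps above can be sketched in Python roughly as follows. This is only a minimal illustration: `overlay_offset` is a hypothetical helper name, the field values are used as stored (no FileAlignment rounding is applied), and the result is simply clamped to the file length, so it will not handle the malformed files that a robust parser has to deal with.

```python
import struct


def overlay_offset(data: bytes) -> int:
    """Naive guess at the file offset where appended data starts."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # Step 1: e_lfanew (offset 0x3C in the MZ header) points to "PE\0\0".
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF file header: NumberOfSections at +2, SizeOfOptionalHeader at +16.
    coff = pe_off + 4
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_size  # section table follows the optional header
    # Assumption: the headers and section table always count as covered.
    end = table + 40 * num_sections
    # Step 2: maximum of PointerToRawData + SizeOfRawData over all sections.
    for i in range(num_sections):
        # Each header is 40 bytes; SizeOfRawData is at +16, PointerToRawData at +20.
        size, ptr = struct.unpack_from("<II", data, table + 40 * i + 16)
        if size:
            end = max(end, ptr + size)
    # Step 3: use that maximum, never past the end of the file.
    return min(end, len(data))
```

With the file contents read into `data`, the appended data would then be `data[overlay_offset(data):]`; an empty result means nothing was found after the last section.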

Note that some installers/file formats do not actually use this calculation; instead, they have a small trailer at the end of the file which points to the beginning of the payload. For example, the ZIP file format works like that, which is why a self-extracting ZIP can be extracted regardless of whether the unpacker stub is PE, DOS MZ, ELF, Mach-O or anything else.
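
To illustrate the trailer approach for ZIP, here is a rough Python sketch that locates the End of Central Directory record and works backwards to where the archive begins. `zip_start_offset` is a hypothetical name, and the sketch assumes a plain (non-ZIP64) archive that was simply concatenated after the stub, with the offsets inside the archive left unadjusted.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"  # End of Central Directory record
EOCD_MIN_SIZE = 22              # fixed part, without the archive comment


def zip_start_offset(data: bytes) -> int:
    """Guess where an appended ZIP archive begins inside `data`."""
    # The trailer sits at the end of the file, so search backwards.
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + EOCD_MIN_SIZE > len(data):
        raise ValueError("no ZIP end-of-central-directory record found")
    # Central directory size is at +12; its offset, stored relative to
    # the start of the archive, is at +16.
    cd_size, cd_offset = struct.unpack_from("<II", data, pos + 12)
    # The central directory ends where the trailer starts, so the bytes
    # in front of the archive are whatever is left over.
    start = pos - cd_size - cd_offset
    if start < 0:
        raise ValueError("inconsistent central directory fields")
    return start
```

Nothing here looks at the stub itself, which is the point made above: the same calculation applies whatever executable format precedes the archive.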

Igor Skochinsky
  • 36,553
  • 7
  • 65
  • 115
  • no, this is not sufficient. SizeOfRawData can have any value, even gigabytes large in a small file. It's bound by VirtualSize at that point. Accurately determining the end of the image is not trivial. – peter ferrie May 06 '13 at 19:13
  • @peterferrie What would be a good algorithm? I am actually trying to implement one. pefile does it roughly as suggested here. – Karsten Hahn May 13 '14 at 08:12
  • @Veitch, my answer to your other question now has sample code and a description, which should be what you need. – peter ferrie May 13 '14 at 22:35
0

Another option is the mewrev/pe Go package:

package main

import (
	"io/ioutil"
	"log"
	"os"

	"github.com/mewrev/pe"
)

func main() {
	file, err := pe.Open("vs_BuildTools.exe")
	if err != nil {
		log.Fatal(err)
	}
	// Overlay returns the data appended after the PE image.
	overlay, err := file.Overlay()
	if err != nil {
		log.Fatal(err)
	}
	err = ioutil.WriteFile("vs_BuildTools.overlay", overlay, os.ModePerm)
	if err != nil {
		log.Fatal(err)
	}
}

Zombo
  • 1
  • 2