1

I built some binaries from different languages and tried to understand them in x64dbg. The binaries produced from C with gcc were fairly easy to follow. Then I built a simple hello world program in Python using PyInstaller. The resulting exe was larger and messier, and I could not make sense of any of its code in x64dbg. Can anyone help me understand it, or point me to some resources? I am learning reverse engineering by compiling my own code and reversing it.

Praveen

2 Answers

2

On macOS at least, the Python modules are appended to the executable as zlib streams. I recommend trying pyinstxtractor, run with a recent version of Python, to decompress the zlib streams into .pyc files, and then a bytecode decompiler such as pycdc.
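To see why a debugger view looks like noise, here is a rough sketch of how embedded zlib streams can be located in a blob of bytes. It is not how pyinstxtractor works internally (that tool parses the archive's table of contents). The function name `find_zlib_streams` and the sample blob are made up for illustration.

```python
import zlib

def find_zlib_streams(blob, min_size=16):
    """Yield (offset, decompressed_bytes) for each complete zlib stream in blob."""
    pos = 0
    while pos < len(blob) - 1:
        # A zlib stream starts with a 2-byte header: CMF is 0x78 for deflate
        # with a 32 KiB window, and (CMF * 256 + FLG) must be divisible by 31.
        if blob[pos] == 0x78 and ((blob[pos] << 8) | blob[pos + 1]) % 31 == 0:
            d = zlib.decompressobj()
            try:
                data = d.decompress(blob[pos:])
            except zlib.error:
                pos += 1  # false positive, keep scanning
                continue
            if d.eof and len(data) >= min_size:
                yield pos, data
                # Jump to the first byte after the stream we just decoded.
                pos = len(blob) - len(d.unused_data)
                continue
        pos += 1

if __name__ == "__main__":
    # Hypothetical stand-in for an executable with one compressed module inside.
    payload = b"print('hello world')\n" * 4
    blob = b"\x00BOOTLOADER\x00" + zlib.compress(payload) + b"TRAILER"
    for offset, data in find_zlib_streams(blob):
        print(offset, len(data))
```

On a real executable this kind of scan will also hit false positives and unrelated compressed data, so treat it as a way to explore the file rather than as an extractor.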

Ninja Inc
2

PyInstaller binaries are basically self-extracting archives that contain the compiled Python code for the program and its dependencies.

The extraction code and also some of these libraries may be native binary files.

However, pure Python code does not compile into native assembly but into an intermediate representation (bytecode) that the Python runtime, which is included in the package, can run. As mentioned, the compiled files have a .pyc extension.
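As a small illustration of that point, the sketch below compiles a hello world source string and lists the names of the resulting bytecode instructions. The helper name `bytecode_opnames` and the filename `<hello>` are arbitrary choices for this example, and the exact opcodes depend on the Python version.

```python
import dis

SOURCE = "print('hello world')\n"

def bytecode_opnames(source):
    """Compile source to a code object and return it with its instruction names."""
    code = compile(source, "<hello>", "exec")
    return code, [ins.opname for ins in dis.get_instructions(code)]

if __name__ == "__main__":
    code, opnames = bytecode_opnames(SOURCE)
    print(opnames)              # opcode names differ between Python versions
    print(code.co_code.hex())   # raw bytecode bytes, not x86 machine code
```

None of these bytes are x86 instructions, which is why x64dbg cannot show anything meaningful for them.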

This is marshalled code, in Python terms, and it can be unmarshalled back into a code object (its IR representation) using the built-in marshal module.
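A minimal sketch of the unmarshalling step, assuming a .pyc written by CPython 3.7 or later (16-byte header) and by the same Python version that runs the script, since the marshal format is version-specific. The names `load_code_from_pyc` and `build_sample_pyc` are made up for this example.

```python
import dis
import importlib.util
import marshal
import os
import py_compile
import tempfile

def load_code_from_pyc(path):
    """Return the code object stored in a .pyc file (CPython 3.7+ layout assumed)."""
    with open(path, "rb") as f:
        header = f.read(16)  # magic (4) + flags (4) + mtime/size or hash (8)
        if header[:4] != importlib.util.MAGIC_NUMBER:
            raise ValueError("pyc was built by a different Python version")
        return marshal.load(f)  # the rest of the file is one marshalled code object

def build_sample_pyc(directory):
    """Write a hello world script into directory and compile it to a .pyc."""
    src = os.path.join(directory, "hello.py")
    with open(src, "w") as f:
        f.write("print('hello world')\n")
    return py_compile.compile(src, cfile=os.path.join(directory, "hello.pyc"), doraise=True)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        code = load_code_from_pyc(build_sample_pyc(tmp))
        print(code.co_names, code.co_consts)
        dis.dis(code)  # human-readable listing of the bytecode
```

For files from a different Python version, a standalone decompiler such as pycdc is the better route, because `marshal` can only read data written by a compatible interpreter.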

Loading a PyInstaller executable in an RE tool will therefore only show the generic archive-extraction code (the bootloader), which is unrelated to the actual program code.

Yotamz