
Walking a PE32+ File by Hand

Okay, in this part I start on PE files. I am using limon.exe as the sample file and PowerShell's Format-Hex, because I want to see the bytes directly.

This post is mostly my own parsing notes while building Biber.

Main idea: PE parsing is not just reading bytes in order. The headers tell us where the next important structure lives. Many values are RVAs, so we need the section table to convert them into file offsets.

PE Layout in My Head

Before reading real bytes, I want a simple map. This is the order I keep in mind while walking through a PE file.

PE file layout
0x0000
DOS Header
↓
DOS Stub
↓
PE Signature
↓
COFF Header
↓
Optional Header
↓
Data Directories
↓
Section Table
↓
Section Raw Data

The DOS header starts with MZ. The DOS stub can vary in size and content. The PE signature is always 50 45 00 00, meaning PE\0\0.

header sizes and important tables
DOS HEADER
64 bytes, standard size

DOS_STUB
"This program cannot be run in DOS mode"; size can vary

PE_SIGNATURE
50 45 00 00, always 4 bytes

COFF Header — IMAGE_FILE_HEADER
File metadata
20 bytes, standard size

Optional Header — IMAGE_OPTIONAL_HEADER
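Loader data
PE32  → 224 bytes (0xE0)
PE32+ → 240 bytes (0xF0)
These are the usual sizes with all 16 data directories; the real value is SizeOfOptionalHeader in the COFF header.

Data Directories
16 × 8 = 128 bytes, stored as the last part of the Optional Header (already counted in the sizes above)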

Section Table — IMAGE_SECTION_HEADER
40 bytes × NumberOfSections

For now, I care most about COFF Header, Optional Header, Data Directories and the Section Table.

Part              Meaning
DOS Header        Starts with MZ and contains e_lfanew.
COFF Header       File metadata: machine, section count, timestamp, optional header size.
Optional Header   Loader data: entry RVA, image base, alignment, subsystem and directories.
Data Directories  RVA + size pairs for special tables like import, export, reloc, debug.
Section Table     Maps memory RVAs to raw file offsets.

Data Directory names
EXPORT
IMPORT
RESOURCE
EXCEPTION
SECURITY
BASERELOC
DEBUG
ARCHITECTURE
GLOBALPTR
TLS
LOAD_CONFIG
BOUND_IMPORT
IAT
DELAY_IMPORT
COM_DESCRIPTOR
RESERVED
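
To make the "RVA + size pairs" idea concrete, here is a small Python sketch that decodes the 16 slots in the order listed above. The bytes are copied from file offsets 0x100..0x17F of the limon.exe Optional Header dump shown later in this post. The names `DIRECTORY_NAMES`, `RAW_DIRECTORIES` and `parse_data_directories` are hypothetical helpers for illustration, not part of Biber.

```python
import struct

# Data directory names in their fixed index order (0..15).
DIRECTORY_NAMES = [
    "EXPORT", "IMPORT", "RESOURCE", "EXCEPTION", "SECURITY", "BASERELOC",
    "DEBUG", "ARCHITECTURE", "GLOBALPTR", "TLS", "LOAD_CONFIG",
    "BOUND_IMPORT", "IAT", "DELAY_IMPORT", "COM_DESCRIPTOR", "RESERVED",
]

# File bytes 0x100..0x17F of limon.exe, copied from the Optional Header dump.
RAW_DIRECTORIES = bytes.fromhex(
    "00000000 00000000 501D0C00 64000000"
    "00000000 00000000 00E00D00 88230000"
    "00000000 00000000 00200E00 80030000"
    "00600C00 38000000 00000000 00000000"
    "00000000 00000000 F81C0C00 28000000"
    "00000000 00000000 00000000 00000000"
    "50200C00 98020000 00000000 00000000"
    "00000000 00000000 00000000 00000000"
)


def parse_data_directories(raw):
    """Return {name: (rva, size)} for each 8-byte directory entry."""
    entries = {}
    for index, name in enumerate(DIRECTORY_NAMES):
        # Each entry is two little-endian 32-bit values: RVA, then size.
        rva, size = struct.unpack_from("<II", raw, index * 8)
        entries[name] = (rva, size)
    return entries


directories = parse_data_directories(RAW_DIRECTORIES)
for name, (rva, size) in directories.items():
    if rva or size:  # skip empty slots
        print(f"{name:<12} rva=0x{rva:08X} size=0x{size:X}")
```

For this file the non-empty slots should be IMPORT, EXCEPTION, BASERELOC, DEBUG, TLS and IAT. Every RVA printed here still has to go through the section table before it can be used as a file offset.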

PowerShell Helpers

While reading PE files by hand, hex calculations come up constantly. I added two small helpers to my PowerShell profile so I can calculate offsets and check ASCII values without leaving the terminal.

PowerShell profile helpers
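function hex {
    param(
        # Everything after the function name is joined into one expression.
        [Parameter(ValueFromRemainingArguments=$true)]
        $Value,

        # -d prints the result in decimal instead of hex.
        [switch]$d
    )

    $expr = ($Value -join ' ')

    # Nothing to evaluate: return quietly instead of printing 0x0.
    if (-not $expr) {
        return
    }

    # Invoke-Expression runs arbitrary code, so only feed it input you typed yourself.
    $result = Invoke-Expression $expr

    if ($d) {
        [uint64]$result
    }
    else {
        "0x{0:X}" -f ([uint64]$result)
    }
}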

function ascii($hex) {

    $hex = "$hex"

    $hex = $hex -replace '^0x',''
    $hex = $hex -replace '[^0-9A-Fa-f]',''

    if ($hex.Length % 2) {
        throw "Hex length must be even"
    }

    $bytes = for ($i=0; $i -lt $hex.Length; $i += 2) {
        [Convert]::ToByte($hex.Substring($i,2),16)
    }

    [Text.Encoding]::ASCII.GetString($bytes)
}

Set-Alias asc ascii

For example, I can do hex (0x90+0xF0) or asc 4d5a directly while following the file.

DOS Header, MZ and e_lfanew

First I read the beginning of limon.exe. At offset 0x00, I expect MZ.

Format-Hex — MZ
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x00 -Count 16

   Label: C:\Users\Hrasi\Desktop\Biber\limon.exe

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
0000000000000000 4D 5A 78 00 01 00 00 00 04 00 00 00 00 00 00 00 MZx �   �

PS C:\Users\Hrasi\Desktop\Biber> asc 4d5a
MZ

Then I read e_lfanew at offset 0x3C. This gives the file offset of the PE signature.

Format-Hex — e_lfanew and PE signature
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x3C -Count 4

   Label: C:\Users\Hrasi\Desktop\Biber\limon.exe

          Offset Bytes                                           Ascii
                 00 01 02 03
          ------ -----------
000000000000003C 78 00 00 00

PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x78 -Count 4

   Label: C:\Users\Hrasi\Desktop\Biber\limon.exe

          Offset Bytes                                           Ascii
                 00 01 02 03
          ------ -----------
0000000000000078 50 45 00 00                                     PE

Here e_lfanew = 0x78, and at 0x78 we find 50 45 00 00. So the file has the expected PE signature.
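
The same two reads can be written as a short Python sketch. It checks MZ, reads the 4-byte little-endian e_lfanew at 0x3C, then checks the signature at that offset. The function name `find_pe_offset` is hypothetical, and the sample buffer is synthetic: it contains only the bytes read above, with zeros everywhere else.

```python
import struct


def find_pe_offset(data):
    """Return e_lfanew after checking both the MZ and PE signatures."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew is a 32-bit little-endian value stored at file offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    return e_lfanew


# Synthetic buffer: only MZ, e_lfanew and the PE signature are filled in.
sample = bytearray(0x80)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x78)
sample[0x78:0x7C] = b"PE\0\0"

pe_offset = find_pe_offset(bytes(sample))
print(hex(pe_offset))  # 0x78
```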

COFF Header

The PE signature is 4 bytes, so the COFF header starts at 0x78 + 4 = 0x7C. The COFF header is 20 bytes.

Format-Hex — COFF Header
PS C:\Users\Hrasi\Desktop\Biber> hex (0x78+4)
0x7C

PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x7C -Count 20

   Label: C:\Users\Hrasi\Desktop\Biber\limon.exe

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
000000000000007C 64 86 07 00 B5 4E 4F 18 00 00 00 00 00 00 00 00 d�� µNO�
000000000000008C F0 00 22 00                                     ð "

0x7C  64 86        Machine              = 0x8664 (x86-64)
0x7E  07 00        NumberOfSections     = 0x0007
0x80  B5 4E 4F 18  TimeDateStamp        = 0x184F4EB5
0x84  00 00 00 00  PointerToSymbolTable = 0
0x88  00 00 00 00  NumberOfSymbols      = 0
0x8C  F0 00        SizeOfOptionalHeader = 0x00F0
0x8E  22 00        Characteristics      = 0x0022

This is where NumberOfSections and SizeOfOptionalHeader become important. We have 7 sections, and the optional header size is 0xF0, which matches PE32+.

Quick check: NumberOfSections = 7. Every section header is 40 bytes, so the section table will contain 7 × 40 = 280 bytes of section metadata.
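
As a cross-check of the manual decoding, the 20 COFF bytes can be unpacked in one call with Python's `struct` module. This is only a sketch: the `CoffHeader` tuple and its snake_case field names are my own naming, and the input is the 20 bytes from the Format-Hex output above.

```python
import struct
from collections import namedtuple

# Hypothetical container; field order follows IMAGE_FILE_HEADER.
CoffHeader = namedtuple(
    "CoffHeader",
    "machine number_of_sections time_date_stamp pointer_to_symbol_table "
    "number_of_symbols size_of_optional_header characteristics",
)

COFF_FORMAT = "<HHIIIHH"  # little-endian: 2+2+4+4+4+2+2 = 20 bytes

# The 20 bytes at file offset 0x7C of limon.exe.
raw_coff = bytes.fromhex("64860700 B54E4F18 00000000 00000000 F0002200")

coff = CoffHeader(*struct.unpack(COFF_FORMAT, raw_coff))
section_table_bytes = coff.number_of_sections * 40

print(hex(coff.machine))                  # 0x8664
print(coff.number_of_sections)            # 7
print(hex(coff.size_of_optional_header))  # 0xf0
print(section_table_bytes)                # 280
```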

Optional Header

The Optional Header starts right after the COFF header. In this file, that is offset 0x90.

Format-Hex — Optional Header magic
PS C:\Users\Hrasi\Desktop\Biber> Format-Hex .\limon.exe -Offset 0x90 -Count 2

   Label: C:\Users\Hrasi\Desktop\Biber\limon.exe

          Offset Bytes                                           Ascii
                 00 01
          ------ -----
0000000000000090 0B 02

0x020B = PE32+

The magic value is 0x020B, so this is PE32+. This matters because PE32 and PE32+ have different field layouts.

Format-Hex — full PE32+ Optional Header
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x90 -Count 240

0000000000000090 0B 02 0E 00 00 18 07 00 00 2A 06 00 00 00 00 00
00000000000000A0 C0 8F 00 00 00 10 00 00 00 00 00 40 01 00 00 00
00000000000000B0 00 10 00 00 00 02 00 00 06 00 00 00 00 00 00 00
00000000000000C0 06 00 00 00 00 00 00 00 00 30 0E 00 00 04 00 00
00000000000000D0 00 00 00 00 03 00 60 81 00 00 00 01 00 00 00 00
00000000000000E0 00 10 00 00 00 00 00 00 00 00 10 00 00 00 00 00
00000000000000F0 00 10 00 00 00 00 00 00 00 00 00 00 10 00 00 00
0000000000000100 00 00 00 00 00 00 00 00 50 1D 0C 00 64 00 00 00
0000000000000110 00 00 00 00 00 00 00 00 00 E0 0D 00 88 23 00 00
0000000000000120 00 00 00 00 00 00 00 00 00 20 0E 00 80 03 00 00
0000000000000130 00 60 0C 00 38 00 00 00 00 00 00 00 00 00 00 00
0000000000000140 00 00 00 00 00 00 00 00 F8 1C 0C 00 28 00 00 00
0000000000000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000160 50 20 0C 00 98 02 00 00 00 00 00 00 00 00 00 00
0000000000000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

From this block I extracted the fields I needed first:

Optional Header fields
0x90  0B 02        Magic                 = 0x020B → PE32+
0xA0  C0 8F 00 00  AddressOfEntryPoint   = 0x00008FC0
0xA4  00 10 00 00  BaseOfCode            = 0x00001000
0xA8  00 00 00 40 01 00 00 00
                  ImageBase             = 0x0000000140000000

0xB0  00 10 00 00  SectionAlignment      = 0x1000
0xB4  00 02 00 00  FileAlignment         = 0x200

0xC8  00 30 0E 00  SizeOfImage           = 0xE3000
0xCC  00 04 00 00  SizeOfHeaders         = 0x400

0xD4  03 00        Subsystem             = 3 → Windows CUI / Console
0xD6  60 81        DllCharacteristics    = 0x8160

0xFC  10 00 00 00  NumberOfRvaAndSizes   = 16

The most important values for the next step are AddressOfEntryPoint and ImageBase.
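
Here is a hedged Python sketch that pulls the same fields out of the dump by offset. The offsets are relative to the start of the PE32+ Optional Header (0x90 in this file) and follow the IMAGE_OPTIONAL_HEADER64 layout. The `FIELDS` table and `read_fields` helper are hypothetical names, and the input is the first 0x70 bytes of the dump above.

```python
import struct

OPTIONAL_HEADER_START = 0x90  # file offset in limon.exe

# File bytes 0x90..0xFF (everything before the data directories).
RAW_OPTIONAL = bytes.fromhex(
    "0B020E00 00180700 002A0600 00000000"
    "C08F0000 00100000 00000040 01000000"
    "00100000 00020000 06000000 00000000"
    "06000000 00000000 00300E00 00040000"
    "00000000 03006081 00000001 00000000"
    "00100000 00000000 00001000 00000000"
    "00100000 00000000 00000000 10000000"
)

# name -> (offset inside the Optional Header, struct format)
FIELDS = {
    "Magic": (0x00, "<H"),
    "AddressOfEntryPoint": (0x10, "<I"),
    "BaseOfCode": (0x14, "<I"),
    "ImageBase": (0x18, "<Q"),  # 8 bytes in PE32+
    "SectionAlignment": (0x20, "<I"),
    "FileAlignment": (0x24, "<I"),
    "SizeOfImage": (0x38, "<I"),
    "SizeOfHeaders": (0x3C, "<I"),
    "Subsystem": (0x44, "<H"),
    "DllCharacteristics": (0x46, "<H"),
    "NumberOfRvaAndSizes": (0x6C, "<I"),
}


def read_fields(raw):
    """Decode each listed field from the raw Optional Header bytes."""
    return {
        name: struct.unpack_from(fmt, raw, offset)[0]
        for name, (offset, fmt) in FIELDS.items()
    }


optional = read_fields(RAW_OPTIONAL)
for name, value in optional.items():
    file_offset = OPTIONAL_HEADER_START + FIELDS[name][0]
    print(f"0x{file_offset:02X}  {name:<20} = 0x{value:X}")
```

Reading by explicit offset instead of one big struct format keeps the sketch close to how the bytes are followed by hand: one field, one offset, one size.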

RVA, VA and ImageBase

Now I want to read only the entry point and the image base, because the PE entry point is not a file offset. It is an RVA.

Entry RVA and ImageBase bytes
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0xA0 -Count 4

00000000000000A0 C0 8F 00 00

AddressOfEntryPoint = 0x00008FC0

PS C:\Users\Hrasi\Desktop\Biber> hex (0x90+16+4+4)
0xA8

PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0xA8 -Count 8

00000000000000A8 00 00 00 40 01 00 00 00

ImageBase = 0x140000000

This part confused me many times, so I wrote it very directly:

RVA / VA calculation
VA = ImageBase + RVA

ImageBase           = 0x140000000
AddressOfEntryPoint = 0x00008FC0
----------------------------------
VA                  = 0x140008FC0

RVA = VA - ImageBase
VA  = ImageBase + RVA

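So the entry address becomes 0x140008FC0 when the image is loaded at its preferred ImageBase. DllCharacteristics 0x8160 has the DYNAMIC_BASE bit (0x0040) set, so with ASLR the loader may pick a different base; the RVA stays the same either way. But I still cannot use this directly with Format-Hex, because Format-Hex needs a file offset, not a virtual address.

Important: AddressOfEntryPoint is an RVA. It becomes a real virtual address after the loader adds the base the image was actually loaded at (ImageBase is the preferred one). But to read bytes from disk, we must convert that RVA into a raw file offset.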
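
The arithmetic is tiny, but writing it out helps keep the direction straight. In this Python sketch the values come from the headers above, while `HYPOTHETICAL_BASE` is a made-up load address, included only to show that the RVA does not change when the loader picks another base.

```python
IMAGE_BASE = 0x140000000      # preferred base from the Optional Header
ENTRY_RVA = 0x00008FC0        # AddressOfEntryPoint
DLL_CHARACTERISTICS = 0x8160  # from the Optional Header
DYNAMIC_BASE = 0x0040         # IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE


def rva_to_va(rva, base):
    """VA = base + RVA."""
    return base + rva


def va_to_rva(va, base):
    """RVA = VA - base."""
    return va - base


preferred_entry_va = rva_to_va(ENTRY_RVA, IMAGE_BASE)
print(hex(preferred_entry_va))  # 0x140008fc0

# The DYNAMIC_BASE bit tells us the image may be loaded somewhere else.
can_be_rebased = bool(DLL_CHARACTERISTICS & DYNAMIC_BASE)
print(can_be_rebased)  # True

# Made-up base, NOT taken from a real process: same RVA, different VA.
HYPOTHETICAL_BASE = 0x7FF600000000
rebased_entry_va = rva_to_va(ENTRY_RVA, HYPOTHETICAL_BASE)
print(hex(va_to_rva(rebased_entry_va, HYPOTHETICAL_BASE)))  # 0x8fc0
```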

Section Table

To convert RVA to raw file offset, we need the section table.

Section table offset calculation
Optional Header start = 0x90
SizeOfOptionalHeader (from the COFF header) = 0xF0

Section Table offset:
0x90 + 0xF0 = 0x180

Every IMAGE_SECTION_HEADER is 40 bytes:
0x28 bytes
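
The calculation above generalizes to every section header. A minimal Python sketch, using only values already read from limon.exe; the function name `section_header_offsets` is hypothetical.

```python
PE_SIGNATURE_SIZE = 4
COFF_HEADER_SIZE = 20
SECTION_HEADER_SIZE = 40  # 0x28


def section_header_offsets(e_lfanew, size_of_optional_header, number_of_sections):
    """Return the file offset of each IMAGE_SECTION_HEADER."""
    optional_header_start = e_lfanew + PE_SIGNATURE_SIZE + COFF_HEADER_SIZE
    # The section table begins right after the Optional Header.
    table_start = optional_header_start + size_of_optional_header
    return [
        table_start + index * SECTION_HEADER_SIZE
        for index in range(number_of_sections)
    ]


# e_lfanew = 0x78, SizeOfOptionalHeader = 0xF0, NumberOfSections = 7
offsets = section_header_offsets(0x78, 0xF0, 7)
print([hex(offset) for offset in offsets])
# ['0x180', '0x1a8', '0x1d0', '0x1f8', '0x220', '0x248', '0x270']
```

Using SizeOfOptionalHeader from the COFF header, instead of hard-coding 0xF0, keeps the same function usable for PE32 files, where the Optional Header is smaller.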

Each section header has this layout:

IMAGE_SECTION_HEADER fields
Name                  8
VirtualSize           4
VirtualAddress        4
SizeOfRawData         4
PointerToRawData      4
PointerToRelocations  4
PointerToLinenumbers  4
NumberOfRelocations   2
NumberOfLinenumbers   2
Characteristics       4

First section: .text

The first section starts at 0x180. Reading 40 bytes gives the .text header.

Format-Hex — .text section header
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x180 -Count 40

   Label: C:\Users\Hrasi\Desktop\Biber\limon.exe

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
0000000000000180 2E 74 65 78 74 00 00 00 86 17 07 00 00 10 00 00 .text   ���  �
0000000000000190 00 18 07 00 00 04 00 00 00 00 00 00 00 00 00 00  ��  �
00000000000001A0 00 00 00 00 20 00 00 60                                `

.text
VirtualSize      = 0x00071786
VirtualAddress   = 0x00001000
SizeOfRawData    = 0x00071800
PointerToRawData = 0x00000400
Characteristics  = 0x60000020

Second section: .rdata

The second section starts at 0x180 + 0x28 = 0x1A8.

Format-Hex — .rdata section header
PS C:\Users\Hrasi\Desktop\Biber> hex (0x180+40)
0x1A8

PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x1A8 -Count 40

   Label: C:\Users\Hrasi\Desktop\Biber\limon.exe

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
00000000000001A8 2E 72 64 61 74 61 00 00 E8 28 05 00 00 30 07 00 .rdata  è(�  0�
00000000000001B8 00 2A 05 00 00 1C 07 00 00 00 00 00 00 00 00 00  *�  ��
00000000000001C8 00 00 00 00 40 00 00 40                             @  @

.rdata
VirtualSize      = 0x000528E8
VirtualAddress   = 0x00073000
SizeOfRawData    = 0x00052A00
PointerToRawData = 0x00071C00
Characteristics  = 0x40000040

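Everything is as expected: first .text, then .rdata. The raw data is also contiguous on disk, because .text PointerToRawData + SizeOfRawData = 0x400 + 0x71800 = 0x71C00, which is exactly where .rdata starts.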
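
Both headers can be decoded with one struct format that mirrors the 40-byte layout. This is a Python sketch: the `SectionHeader` tuple and `parse_section_header` are hypothetical names, and the input bytes are the two Format-Hex outputs above.

```python
import struct
from collections import namedtuple

# Hypothetical container; field order follows IMAGE_SECTION_HEADER.
SectionHeader = namedtuple(
    "SectionHeader",
    "name virtual_size virtual_address size_of_raw_data pointer_to_raw_data "
    "pointer_to_relocations pointer_to_linenumbers "
    "number_of_relocations number_of_linenumbers characteristics",
)

SECTION_FORMAT = "<8sIIIIIIHHI"  # 8 + 6*4 + 2*2 + 4 = 40 bytes

RAW_TEXT = bytes.fromhex(
    "2E74657874000000 86170700 00100000 00180700 00040000"
    "00000000 00000000 0000 0000 20000060"
)
RAW_RDATA = bytes.fromhex(
    "2E72646174610000 E8280500 00300700 002A0500 001C0700"
    "00000000 00000000 0000 0000 40000040"
)


def parse_section_header(raw):
    """Decode one 40-byte section header."""
    fields = struct.unpack(SECTION_FORMAT, raw)
    # The name is padded with NUL bytes up to 8 characters.
    name = fields[0].rstrip(b"\0").decode("ascii")
    return SectionHeader(name, *fields[1:])


text = parse_section_header(RAW_TEXT)
rdata = parse_section_header(RAW_RDATA)

for section in (text, rdata):
    print(
        f"{section.name:<8} VA=0x{section.virtual_address:X} "
        f"raw=0x{section.pointer_to_raw_data:X} "
        f"rawsize=0x{section.size_of_raw_data:X}"
    )
```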

RVA to Raw Offset

Now we can finally convert the entry RVA into a file offset. The entry RVA 0x8FC0 falls inside .text (0x1000 ≤ 0x8FC0 < 0x1000 + 0x71786), so .text is the section to use.

RVA to raw offset
Entry RVA = 0x8FC0

.text VirtualAddress   = 0x1000
.text PointerToRawData = 0x400

file_offset = PointerToRawData + (RVA - VirtualAddress)

file_offset = 0x400 + (0x8FC0 - 0x1000)
file_offset = 0x83C0

So the entry bytes start at file offset 0x83C0.
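
The conversion can be sketched in Python as a loop over the section list: find the section whose virtual range contains the RVA, then apply the formula. The `Section` tuple and `rva_to_offset` are hypothetical names. Only the two sections decoded in this post are listed, so RVAs in the other five sections are reported as unknown here.

```python
from collections import namedtuple

Section = namedtuple(
    "Section",
    "name virtual_address virtual_size pointer_to_raw_data size_of_raw_data",
)

# Values taken from the two section headers decoded in this post.
SECTIONS = [
    Section(".text", 0x1000, 0x71786, 0x400, 0x71800),
    Section(".rdata", 0x73000, 0x528E8, 0x71C00, 0x52A00),
]


def rva_to_offset(rva, sections):
    """Convert an RVA to a raw file offset using the section table."""
    for section in sections:
        start = section.virtual_address
        if start <= rva < start + section.virtual_size:
            delta = rva - start
            if delta >= section.size_of_raw_data:
                # Part of the section exists only in memory (zero-filled).
                raise ValueError("RVA has no bytes on disk")
            return section.pointer_to_raw_data + delta
    raise ValueError("RVA is not inside any known section")


entry_offset = rva_to_offset(0x8FC0, SECTIONS)
print(hex(entry_offset))  # 0x83c0

# The IMPORT directory RVA (0xC1D50) from the Optional Header lives in .rdata.
import_offset = rva_to_offset(0xC1D50, SECTIONS)
print(hex(import_offset))  # 0xc0950
```

With that offset, `Format-Hex .\limon.exe -Offset 0x83C0 -Count 16` would show the first bytes of the entry code.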

disk vs memory model
Disk

0x0000 MZ
0x0078 PE
0x0400 .text
0x83C0 entry bytes


Memory

0x140000000 ImageBase
0x140001000 .text
0x140008FC0 entry

jump(ImageBase + AddressOfEntryPoint)

This is the mental model I wanted to build: the same code has a raw file position on disk, an RVA inside the image, and a VA at runtime after adding ImageBase.

Short version: a PE file gives us lots of RVAs. To read those bytes from the file, find the section containing that RVA, then use PointerToRawData + (RVA - VirtualAddress).

Summary

1. Read MZ
2. Read e_lfanew at 0x3C
3. Jump to PE\0\0
4. Read COFF Header
5. Read Optional Header
6. Find ImageBase and AddressOfEntryPoint
7. Use Section Table to convert RVA → file offset
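
The seven steps fit into one short function. This is a rough Python sketch rather than how Biber does it: `walk_pe` is a hypothetical name, it handles PE32+ only, and the sample buffer is synthetic. The buffer holds just the header fields quoted in this post, with a single .text section, and zeros everywhere else.

```python
import struct


def walk_pe(data):
    """Follow the seven steps on a PE32+ image held in memory."""
    # 1. Read MZ
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # 2. Read e_lfanew at 0x3C
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # 3. Jump to the PE signature
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # 4. Read the COFF header fields needed later
    coff = e_lfanew + 4
    (number_of_sections,) = struct.unpack_from("<H", data, coff + 2)
    (size_of_optional_header,) = struct.unpack_from("<H", data, coff + 16)
    # 5. Read the Optional Header magic
    optional = coff + 20
    (magic,) = struct.unpack_from("<H", data, optional)
    if magic != 0x020B:
        raise ValueError("this sketch only handles PE32+")
    # 6. Find AddressOfEntryPoint and ImageBase
    (entry_rva,) = struct.unpack_from("<I", data, optional + 16)
    (image_base,) = struct.unpack_from("<Q", data, optional + 24)
    # 7. Use the section table to convert RVA -> file offset
    table = optional + size_of_optional_header
    for index in range(number_of_sections):
        header = table + index * 40
        virtual_size, virtual_address, raw_size, raw_pointer = struct.unpack_from(
            "<IIII", data, header + 8
        )
        delta = entry_rva - virtual_address
        if 0 <= delta < virtual_size and delta < raw_size:
            return {
                "entry_rva": entry_rva,
                "image_base": image_base,
                "entry_va": image_base + entry_rva,
                "entry_file_offset": raw_pointer + delta,
            }
    raise ValueError("entry point is not inside any section")


# Synthetic image: header values from this post, one section, zeros elsewhere.
sample = bytearray(0x400)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x78)           # e_lfanew
sample[0x78:0x7C] = b"PE\0\0"
struct.pack_into("<HH", sample, 0x7C, 0x8664, 1)     # Machine, NumberOfSections
struct.pack_into("<H", sample, 0x8C, 0xF0)           # SizeOfOptionalHeader
struct.pack_into("<H", sample, 0x90, 0x020B)         # Magic
struct.pack_into("<I", sample, 0xA0, 0x8FC0)         # AddressOfEntryPoint
struct.pack_into("<Q", sample, 0xA8, 0x140000000)    # ImageBase
struct.pack_into("<8sIIII", sample, 0x180,
                 b".text", 0x71786, 0x1000, 0x71800, 0x400)

info = walk_pe(bytes(sample))
print({key: hex(value) for key, value in info.items()})
```

For a real file, the same function could be called on the bytes returned by `open(path, "rb").read()`.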

The next step is to continue from these same ideas into imports, data directories, the debug directory and more PE tables inside Biber.


Biber is actually very close to being finished. The blog is going a little slower, but I hope a release is coming soon. After that I can finally continue with my main idea: make a small kernel and see all the fields of the file with Biber :)