Walking a PE32+ File by Hand
Okay, in this part I start with PE files. I am using limon.exe and PowerShell's Format-Hex because I want to see the bytes directly.
This post is mostly my own parsing notes while building Biber.
Main idea: PE parsing is not just reading bytes in order. The headers tell us where the next important structure lives. Many values are RVAs, so we need the section table to convert them into file offsets.
PE Layout in My Head
Before reading real bytes, I want a simple map. This is the order I keep in mind while walking through a PE file.
0x0000  DOS Header
   ↓
DOS Stub
   ↓
PE Signature
   ↓
COFF Header
   ↓
Optional Header
   ↓
Data Directories
   ↓
Section Table
   ↓
Section Raw Data
DOS header starts with MZ. The DOS stub can vary. The PE signature is always 50 45 00 00, meaning PE\0\0.
| Structure | Size | Notes |
|---|---|---|
| DOS Header | 64 bytes | Fixed size. |
| DOS Stub | varies | Usually contains "This program cannot be run in DOS mode". |
| PE Signature | 4 bytes | Always 50 45 00 00. |
| COFF Header (IMAGE_FILE_HEADER) | 20 bytes | File metadata, fixed size. |
| Optional Header (IMAGE_OPTIONAL_HEADER) | PE32: 224 bytes (0xE0), PE32+: 240 bytes (0xF0) | Loader data. These sizes include the data directories. |
| Data Directories | 16 × 8 = 128 bytes | The last part of the Optional Header. |
| Section Table (IMAGE_SECTION_HEADER) | 40 bytes × NumberOfSections | One header per section. |
For now, I care most about COFF Header, Optional Header, Data Directories and the Section Table.
| Part | Meaning |
|---|---|
| DOS Header | Starts with MZ and contains e_lfanew. |
| COFF Header | File metadata: machine, section count, timestamp, optional header size. |
| Optional Header | Loader data: entry RVA, image base, alignment, subsystem and directories. |
| Data Directories | RVA + size pairs for special tables like import, export, reloc, debug. |
| Section Table | Maps memory RVAs to raw file offsets. |
The 16 data directory entries, in index order:
 0 EXPORT
 1 IMPORT
 2 RESOURCE
 3 EXCEPTION
 4 SECURITY
 5 BASERELOC
 6 DEBUG
 7 ARCHITECTURE
 8 GLOBALPTR
 9 TLS
10 LOAD_CONFIG
11 BOUND_IMPORT
12 IAT
13 DELAY_IMPORT
14 COM_DESCRIPTOR
15 RESERVED
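Each directory slot is just two little-endian 32-bit values, an RVA and a size. A minimal Python sketch of decoding all 16 slots follows. The bytes are the 0x100 to 0x17F range of limon.exe, copied from the Optional Header dump later in this post. The function name `parse_data_directories` is mine, not something from Biber.

```python
import struct

# Directory names in index order (0..15).
DIRECTORY_NAMES = [
    "EXPORT", "IMPORT", "RESOURCE", "EXCEPTION", "SECURITY", "BASERELOC",
    "DEBUG", "ARCHITECTURE", "GLOBALPTR", "TLS", "LOAD_CONFIG",
    "BOUND_IMPORT", "IAT", "DELAY_IMPORT", "COM_DESCRIPTOR", "RESERVED",
]

# File offsets 0x100..0x17F of limon.exe, taken from the dump in this post.
DATA_DIRECTORIES = bytes.fromhex(
    "00 00 00 00 00 00 00 00 50 1D 0C 00 64 00 00 00 "
    "00 00 00 00 00 00 00 00 00 E0 0D 00 88 23 00 00 "
    "00 00 00 00 00 00 00 00 00 20 0E 00 80 03 00 00 "
    "00 60 0C 00 38 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 F8 1C 0C 00 28 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "50 20 0C 00 98 02 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
)


def parse_data_directories(raw: bytes) -> dict:
    """Decode consecutive 8-byte entries: uint32 RVA, then uint32 Size."""
    return {
        name: struct.unpack_from("<II", raw, index * 8)
        for index, name in enumerate(DIRECTORY_NAMES)
    }


directories = parse_data_directories(DATA_DIRECTORIES)
for name, (rva, size) in directories.items():
    if rva or size:  # an all-zero slot means the table is absent
        print(f"{name:<12} RVA=0x{rva:08X} Size=0x{size:X}")
```

One caveat worth remembering: the SECURITY slot is the odd one out, because its first field is a file offset rather than an RVA. It is empty in this file, so it does not matter here.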
PowerShell Helpers
While reading PE files by hand, hex calculations come up constantly. I added two small helpers to my PowerShell profile so I can calculate offsets and check ASCII values without leaving the terminal.
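function hex {
    param(
        [Parameter(ValueFromRemainingArguments=$true)]
        $Value,
        [switch]$d   # -d prints decimal instead of hex
    )
    # Join all arguments into one expression, e.g. "0x90+0xF0".
    $expr = ($Value -join ' ')
    if (-not $expr) {
        throw "Usage: hex <expression> [-d]"
    }
    # Invoke-Expression runs arbitrary code, so only feed it my own input.
    $result = Invoke-Expression $expr
    if ($d) {
        [uint64]$result
    }
    else {
        "0x{0:X}" -f ([uint64]$result)
    }
}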
function ascii($hex) {
    # Normalize input: drop a leading 0x and anything that is not a hex digit.
    $hex = "$hex"
    $hex = $hex -replace '^0x',''
    $hex = $hex -replace '[^0-9A-Fa-f]',''
    if ($hex.Length % 2) {
        throw "Hex length must be even"
    }
    # Convert each pair of hex digits into one byte.
    $bytes = for ($i=0; $i -lt $hex.Length; $i += 2) {
        [Convert]::ToByte($hex.Substring($i,2),16)
    }
    [Text.Encoding]::ASCII.GetString($bytes)
}
Set-Alias asc ascii
For example, I can do hex (0x90+0xF0) or asc 4d5a directly while following the file.
DOS Header, MZ and e_lfanew
First I read the beginning of limon.exe. At offset 0x00, I expect MZ.
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x00 -Count 16
Label: C:\Users\Hrasi\Desktop\Biber\limon.exe
Offset Bytes Ascii
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ ----------------------------------------------- -----
0000000000000000 4D 5A 78 00 01 00 00 00 04 00 00 00 00 00 00 00 MZx � �
PS C:\Users\Hrasi\Desktop\Biber> asc 4d5a
MZ
Then I read e_lfanew at offset 0x3C. This gives the file offset of the PE signature.
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x3C -Count 4
Label: C:\Users\Hrasi\Desktop\Biber\limon.exe
Offset Bytes Ascii
00 01 02 03
------ -----------
000000000000003C 78 00 00 00
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x78 -Count 4
Label: C:\Users\Hrasi\Desktop\Biber\limon.exe
Offset Bytes Ascii
00 01 02 03
------ -----------
0000000000000078 50 45 00 00 PE
Here e_lfanew = 0x78, and at 0x78 we find 50 45 00 00. So the file has the expected PE signature.
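The same two checks can be sketched in Python. This is only an illustration: `find_pe_header` is a name I made up, and the buffer is a synthetic stand-in that copies just the three values seen above (MZ at 0x00, e_lfanew = 0x78 at 0x3C, PE\0\0 at 0x78), not the real limon.exe.

```python
import struct


def find_pe_header(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew is a 32-bit little-endian value stored at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    return e_lfanew


# Synthetic stand-in for the first bytes of limon.exe (assumption, not the real file).
image = bytearray(0x80)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x78)
image[0x78:0x7C] = b"PE\0\0"

print(hex(find_pe_header(bytes(image))))
```

For a real file, the same function would be called with `open(path, "rb").read()` instead of the synthetic buffer.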
COFF Header
The PE signature is 4 bytes, so the COFF header starts at 0x78 + 4 = 0x7C. The COFF header is 20 bytes.
PS C:\Users\Hrasi\Desktop\Biber> hex (0x78+4)
0x7C
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x7C -Count 20
Label: C:\Users\Hrasi\Desktop\Biber\limon.exe
Offset Bytes Ascii
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ ----------------------------------------------- -----
000000000000007C 64 86 07 00 B5 4E 4F 18 00 00 00 00 00 00 00 00 d�� µNO�
000000000000008C F0 00 22 00 ð "
0x7C 64 86 Machine = 0x8664 (x86-64)
0x7E 07 00 NumberOfSections = 0x0007
0x80 B5 4E 4F 18 TimeDateStamp = 0x184F4EB5
0x84 00 00 00 00 PointerToSymbolTable = 0
0x88 00 00 00 00 NumberOfSymbols = 0
0x8C F0 00 SizeOfOptionalHeader = 0x00F0
0x8E 22 00 Characteristics = 0x0022
This is where NumberOfSections and SizeOfOptionalHeader become important. We have 7 sections, and the optional header size is 0xF0, which matches PE32+.
Quick check: NumberOfSections = 7. Every section header is 40 bytes, so the section table will contain 7 × 40 = 280 bytes of section metadata.
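The 20 bytes decode cleanly with one `struct` format string. A small Python sketch, using exactly the bytes from the dump above; the `CoffHeader` tuple and its field names are my own naming for illustration.

```python
import struct
from typing import NamedTuple


class CoffHeader(NamedTuple):
    machine: int
    number_of_sections: int
    time_date_stamp: int
    pointer_to_symbol_table: int
    number_of_symbols: int
    size_of_optional_header: int
    characteristics: int


# The 20 bytes at file offset 0x7C of limon.exe, copied from the dump above.
COFF_BYTES = bytes.fromhex(
    "64 86 07 00 B5 4E 4F 18 00 00 00 00 00 00 00 00 F0 00 22 00"
)

# Little-endian: two uint16, three uint32, two uint16 = 20 bytes.
coff = CoffHeader(*struct.unpack("<HHIIIHH", COFF_BYTES))

print(f"Machine              = 0x{coff.machine:04X}")
print(f"NumberOfSections     = {coff.number_of_sections}")
print(f"SizeOfOptionalHeader = 0x{coff.size_of_optional_header:X}")
print(f"Section table bytes  = {coff.number_of_sections * 40}")
```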
Optional Header
The Optional Header starts right after the COFF header. In this file, that is offset 0x7C + 20 = 0x90.
PS C:\Users\Hrasi\Desktop\Biber> Format-Hex .\limon.exe -Offset 0x90 -Count 2
Label: C:\Users\Hrasi\Desktop\Biber\limon.exe
Offset Bytes Ascii
00 01
------ -----
0000000000000090 0B 02
0x020B = PE32+
The magic value is 0x020B, so this is PE32+. This matters because PE32 and PE32+ have different field layouts.
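To make the layout difference concrete: in PE32 there is a 4-byte BaseOfData field and ImageBase is 4 bytes at offset 0x1C of the Optional Header, while in PE32+ BaseOfData is gone and ImageBase is 8 bytes at offset 0x18. A rough Python sketch of branching on the magic value follows. `read_image_base` is a hypothetical helper name, and the PE32 sample buffer is invented purely for comparison.

```python
import struct

PE32_MAGIC = 0x010B
PE32_PLUS_MAGIC = 0x020B


def read_image_base(optional_header: bytes) -> int:
    """Read ImageBase from the start of an Optional Header."""
    (magic,) = struct.unpack_from("<H", optional_header, 0)
    if magic == PE32_PLUS_MAGIC:
        # PE32+: no BaseOfData, ImageBase is a uint64 at offset 0x18.
        return struct.unpack_from("<Q", optional_header, 0x18)[0]
    if magic == PE32_MAGIC:
        # PE32: BaseOfData sits at 0x18, ImageBase is a uint32 at 0x1C.
        return struct.unpack_from("<I", optional_header, 0x1C)[0]
    raise ValueError(f"unexpected optional header magic 0x{magic:04X}")


# First 32 bytes of limon.exe's Optional Header (file offsets 0x90..0xAF).
limon_header = bytes.fromhex(
    "0B 02 0E 00 00 18 07 00 00 2A 06 00 00 00 00 00 "
    "C0 8F 00 00 00 10 00 00 00 00 00 40 01 00 00 00 "
)

# Invented PE32 header for comparison only (assumption, not from any real file).
fake_pe32 = bytearray(0x20)
struct.pack_into("<H", fake_pe32, 0x00, PE32_MAGIC)
struct.pack_into("<I", fake_pe32, 0x1C, 0x00400000)

print(hex(read_image_base(limon_header)))
print(hex(read_image_base(bytes(fake_pe32))))
```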
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x90 -Count 240
0000000000000090 0B 02 0E 00 00 18 07 00 00 2A 06 00 00 00 00 00
00000000000000A0 C0 8F 00 00 00 10 00 00 00 00 00 40 01 00 00 00
00000000000000B0 00 10 00 00 00 02 00 00 06 00 00 00 00 00 00 00
00000000000000C0 06 00 00 00 00 00 00 00 00 30 0E 00 00 04 00 00
00000000000000D0 00 00 00 00 03 00 60 81 00 00 00 01 00 00 00 00
00000000000000E0 00 10 00 00 00 00 00 00 00 00 10 00 00 00 00 00
00000000000000F0 00 10 00 00 00 00 00 00 00 00 00 00 10 00 00 00
0000000000000100 00 00 00 00 00 00 00 00 50 1D 0C 00 64 00 00 00
0000000000000110 00 00 00 00 00 00 00 00 00 E0 0D 00 88 23 00 00
0000000000000120 00 00 00 00 00 00 00 00 00 20 0E 00 80 03 00 00
0000000000000130 00 60 0C 00 38 00 00 00 00 00 00 00 00 00 00 00
0000000000000140 00 00 00 00 00 00 00 00 F8 1C 0C 00 28 00 00 00
0000000000000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000160 50 20 0C 00 98 02 00 00 00 00 00 00 00 00 00 00
0000000000000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
From this block I extracted the fields I needed first:
0x90 0B 02 Magic = 0x020B → PE32+
0xA0 C0 8F 00 00 AddressOfEntryPoint = 0x00008FC0
0xA4 00 10 00 00 BaseOfCode = 0x00001000
0xA8 00 00 00 40 01 00 00 00
ImageBase = 0x0000000140000000
0xB0 00 10 00 00 SectionAlignment = 0x1000
0xB4 00 02 00 00 FileAlignment = 0x200
0xC8 00 30 0E 00 SizeOfImage = 0xE3000
0xCC 00 04 00 00 SizeOfHeaders = 0x400
0xD4 03 00 Subsystem = 3 → Windows CUI / Console
0xD6 60 81 DllCharacteristics = 0x8160
0xFC 10 00 00 00 NumberOfRvaAndSizes = 16
The most important values for the next step are AddressOfEntryPoint and ImageBase.
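The field list can be cross-checked with a small Python sketch that reads each field at its offset relative to the start of the Optional Header. The bytes are the first 0x70 bytes of the dump above (everything before the data directories). The `FIELDS` table and variable names are my own. The relative offsets are the PE32+ ones and would not all hold for PE32.

```python
import struct

OPTIONAL_HEADER_START = 0x90  # file offset in limon.exe

# File offsets 0x90..0xFF of limon.exe, copied from the dump above.
OPTIONAL_HEADER = bytes.fromhex(
    "0B 02 0E 00 00 18 07 00 00 2A 06 00 00 00 00 00 "
    "C0 8F 00 00 00 10 00 00 00 00 00 40 01 00 00 00 "
    "00 10 00 00 00 02 00 00 06 00 00 00 00 00 00 00 "
    "06 00 00 00 00 00 00 00 00 30 0E 00 00 04 00 00 "
    "00 00 00 00 03 00 60 81 00 00 00 01 00 00 00 00 "
    "00 10 00 00 00 00 00 00 00 00 10 00 00 00 00 00 "
    "00 10 00 00 00 00 00 00 00 00 00 00 10 00 00 00 "
)

# name -> (offset relative to the Optional Header, struct format), PE32+ layout.
FIELDS = {
    "Magic": (0x00, "<H"),
    "AddressOfEntryPoint": (0x10, "<I"),
    "BaseOfCode": (0x14, "<I"),
    "ImageBase": (0x18, "<Q"),
    "SectionAlignment": (0x20, "<I"),
    "FileAlignment": (0x24, "<I"),
    "SizeOfImage": (0x38, "<I"),
    "SizeOfHeaders": (0x3C, "<I"),
    "Subsystem": (0x44, "<H"),
    "DllCharacteristics": (0x46, "<H"),
    "NumberOfRvaAndSizes": (0x6C, "<I"),
}

values = {
    name: struct.unpack_from(fmt, OPTIONAL_HEADER, offset)[0]
    for name, (offset, fmt) in FIELDS.items()
}

for name, (offset, _fmt) in FIELDS.items():
    file_offset = OPTIONAL_HEADER_START + offset
    print(f"0x{file_offset:02X}  {name:<20} = 0x{values[name]:X}")
```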
RVA, VA and ImageBase
Now I want to read only the entry point and the image base, because the PE entry point is not a file offset. It is an RVA.
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0xA0 -Count 4
00000000000000A0 C0 8F 00 00
AddressOfEntryPoint = 0x00008FC0
PS C:\Users\Hrasi\Desktop\Biber> hex (0x90+16+4+4)
0xA8
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0xA8 -Count 8
00000000000000A8 00 00 00 40 01 00 00 00
ImageBase = 0x140000000
This part confused me many times, so I wrote it very directly:
VA  = ImageBase + RVA
RVA = VA - ImageBase

ImageBase           = 0x140000000
AddressOfEntryPoint = 0x00008FC0
----------------------------------
VA                  = 0x140008FC0
So the runtime entry address becomes 0x140008FC0. But I still cannot use this directly with Format-Hex, because Format-Hex needs a file offset, not a virtual address.
Important: AddressOfEntryPoint is an RVA. It becomes a real virtual address after the loader adds ImageBase. But to read bytes from disk, we must convert that RVA into a raw file offset.
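The two directions of the conversion, written as a tiny Python sketch with the values from limon.exe. The function names are mine.

```python
IMAGE_BASE = 0x140000000   # preferred ImageBase from the Optional Header
ENTRY_RVA = 0x00008FC0     # AddressOfEntryPoint


def rva_to_va(rva: int, image_base: int) -> int:
    """An RVA is an offset from the image base, so the VA is a plain sum."""
    return image_base + rva


def va_to_rva(va: int, image_base: int) -> int:
    """Inverse direction: subtract the image base again."""
    return va - image_base


entry_va = rva_to_va(ENTRY_RVA, IMAGE_BASE)
print(hex(entry_va))
print(hex(va_to_rva(entry_va, IMAGE_BASE)))
```

One thing to keep in mind: this VA assumes the image is loaded at its preferred ImageBase. DllCharacteristics = 0x8160 has the DYNAMIC_BASE bit (0x0040) set, so with ASLR the loader may choose a different base at runtime. The RVA stays the same either way, which is why the headers store RVAs.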
Section Table
To convert RVA to raw file offset, we need the section table.
Optional Header start      = 0x90
PE32+ Optional Header size = 0xF0
Section Table offset       = 0x90 + 0xF0 = 0x180
Every IMAGE_SECTION_HEADER is 40 (0x28) bytes.
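Instead of hard-coding 0xF0, a parser can take the size from SizeOfOptionalHeader in the COFF header. A short Python sketch of that arithmetic, with helper names that are only illustrative:

```python
PE_SIGNATURE_SIZE = 4
COFF_HEADER_SIZE = 20
SECTION_HEADER_SIZE = 40  # 0x28


def section_table_offset(e_lfanew: int, size_of_optional_header: int) -> int:
    """The section table follows the signature, COFF header and Optional Header."""
    return (e_lfanew + PE_SIGNATURE_SIZE + COFF_HEADER_SIZE
            + size_of_optional_header)


def section_header_offsets(e_lfanew: int, size_of_optional_header: int,
                           number_of_sections: int) -> list:
    """File offset of every IMAGE_SECTION_HEADER, in table order."""
    start = section_table_offset(e_lfanew, size_of_optional_header)
    return [start + i * SECTION_HEADER_SIZE for i in range(number_of_sections)]


# Values read from limon.exe earlier: e_lfanew = 0x78, size = 0xF0, 7 sections.
offsets = section_header_offsets(0x78, 0xF0, 7)
print([hex(offset) for offset in offsets])
```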
Each section header has this layout:
Name                  8
VirtualSize           4
VirtualAddress        4
SizeOfRawData         4
PointerToRawData      4
PointerToRelocations  4
PointerToLinenumbers  4
NumberOfRelocations   2
NumberOfLinenumbers   2
Characteristics       4
First section: .text
The first section starts at 0x180. Reading 40 bytes gives the .text header.
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x180 -Count 40
Label: C:\Users\Hrasi\Desktop\Biber\limon.exe
Offset Bytes Ascii
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ ----------------------------------------------- -----
0000000000000180 2E 74 65 78 74 00 00 00 86 17 07 00 00 10 00 00 .text ��� �
0000000000000190 00 18 07 00 00 04 00 00 00 00 00 00 00 00 00 00 �� �
00000000000001A0 00 00 00 00 20 00 00 60 `
.text
VirtualSize = 0x00071786
VirtualAddress = 0x00001000
SizeOfRawData = 0x00071800
PointerToRawData = 0x00000400
Characteristics = 0x60000020
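The 40-byte header maps directly onto one `struct` format. A Python sketch that decodes the .text bytes from the dump above; the `SectionHeader` tuple only keeps the fields used in this post, and its names are my own.

```python
import struct
from typing import NamedTuple


class SectionHeader(NamedTuple):
    name: str
    virtual_size: int
    virtual_address: int
    size_of_raw_data: int
    pointer_to_raw_data: int
    characteristics: int


# 8-byte name, six uint32, two uint16, one uint32 = 40 bytes, little-endian.
SECTION_FORMAT = "<8sIIIIIIHHI"


def parse_section_header(raw: bytes) -> SectionHeader:
    (name, virtual_size, virtual_address, size_of_raw_data,
     pointer_to_raw_data, _ptr_relocs, _ptr_lines, _num_relocs,
     _num_lines, characteristics) = struct.unpack(SECTION_FORMAT, raw)
    # Names shorter than 8 bytes are padded with NUL bytes.
    text_name = name.rstrip(b"\0").decode("ascii", errors="replace")
    return SectionHeader(text_name, virtual_size, virtual_address,
                         size_of_raw_data, pointer_to_raw_data,
                         characteristics)


# The 40 bytes at file offset 0x180 of limon.exe, copied from the dump above.
TEXT_HEADER = bytes.fromhex(
    "2E 74 65 78 74 00 00 00 86 17 07 00 00 10 00 00 "
    "00 18 07 00 00 04 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 20 00 00 60 "
)

text = parse_section_header(TEXT_HEADER)
print(text.name, hex(text.virtual_address), hex(text.pointer_to_raw_data))
```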
Second section: .rdata
The second section starts at 0x180 + 0x28 = 0x1A8.
PS C:\Users\Hrasi\Desktop\Biber> hex (0x180+40)
0x1A8
PS C:\Users\Hrasi\Desktop\Biber> format-Hex .\limon.exe -Offset 0x1A8 -Count 40
Label: C:\Users\Hrasi\Desktop\Biber\limon.exe
Offset Bytes Ascii
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ ----------------------------------------------- -----
00000000000001A8 2E 72 64 61 74 61 00 00 E8 28 05 00 00 30 07 00 .rdata è(� 0�
00000000000001B8 00 2A 05 00 00 1C 07 00 00 00 00 00 00 00 00 00 *� ��
00000000000001C8 00 00 00 00 40 00 00 40 @ @
.rdata
VirtualSize = 0x000528E8
VirtualAddress = 0x00073000
SizeOfRawData = 0x00052A00
PointerToRawData = 0x00071C00
Characteristics = 0x40000040
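Everything is as expected: first .text, then .rdata. The raw data is also contiguous on disk, because .rdata's PointerToRawData (0x71C00) equals .text's PointerToRawData plus its SizeOfRawData (0x400 + 0x71800).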
RVA to Raw Offset
Now we can finally convert the entry RVA into a file offset. The entry RVA 0x8FC0 falls inside .text (VirtualAddress 0x1000, VirtualSize 0x71786), so that is the section to use.
Entry RVA              = 0x8FC0
.text VirtualAddress   = 0x1000
.text PointerToRawData = 0x400

file_offset = PointerToRawData + (RVA - VirtualAddress)
file_offset = 0x400 + (0x8FC0 - 0x1000)
file_offset = 0x83C0
So the entry point bytes start at file offset 0x83C0.
Disk
0x0000  MZ
0x0078  PE
0x0400  .text
0x83C0  entry bytes

Memory
0x140000000  ImageBase
0x140001000  .text
0x140008FC0  entry, where execution jumps (ImageBase + AddressOfEntryPoint)
This is the mental model I wanted to build: the same code has a raw file position on disk, an RVA inside the image, and a VA at runtime after adding ImageBase.
Short version: PE gives lots of RVAs. To read those bytes from the file, find the section containing that RVA, then use PointerToRawData + (RVA - VirtualAddress).
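The short version as a Python sketch, using the two section headers read above. `Section` and `rva_to_offset` are illustrative names, not Biber's actual code. The sketch assumes an RVA belongs to a section when it lies inside [VirtualAddress, VirtualAddress + VirtualSize), and it rejects RVAs that have no bytes on disk.

```python
from typing import NamedTuple, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int
    virtual_size: int
    pointer_to_raw_data: int
    size_of_raw_data: int


def rva_to_offset(rva: int, sections: Sequence[Section]) -> int:
    """Map an RVA to a raw file offset through the section table."""
    for section in sections:
        delta = rva - section.virtual_address
        if 0 <= delta < section.virtual_size:
            if delta >= section.size_of_raw_data:
                # Inside the section in memory, but not stored in the file
                # (for example zero-filled data).
                raise ValueError(f"RVA 0x{rva:X} has no bytes on disk")
            return section.pointer_to_raw_data + delta
    raise ValueError(f"RVA 0x{rva:X} is not inside any section")


# Only the two sections read in this post; limon.exe has seven in total.
SECTIONS = [
    Section(".text", 0x1000, 0x71786, 0x400, 0x71800),
    Section(".rdata", 0x73000, 0x528E8, 0x71C00, 0x52A00),
]

print(hex(rva_to_offset(0x8FC0, SECTIONS)))   # entry point, inside .text
print(hex(rva_to_offset(0xC1D50, SECTIONS)))  # IMPORT directory, inside .rdata
```

The second call uses the IMPORT directory RVA from the Optional Header dump. It lands in .rdata and maps to 0x71C00 + (0xC1D50 - 0x73000) = 0xC0950.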
Summary
The next step is to continue from these same ideas into imports, data directories, the debug directory and more PE tables inside Biber.
Biber is actually very close to finished. I am going a little slower because of the blog, but I hope a release is coming soon. Then I can finally continue with my main idea: making a small kernel, while being able to see all the fields of a file with Biber :)