x86 Crash Course (WIP)

Registers:
- General-purpose registers: `EAX`, `EBX`, `ECX`, `EDX`, `ESI`, `EDI`, `EBP`, `ESP`
  - `ESI` and `EDI` are used for string operations
  - `EBP` is used as the base pointer
  - `ESP` is used as the stack pointer (top of the stack)
- `EIP` (instruction pointer):
  - Accessed implicitly, not explicitly
  - Modified by `jmp`, `call`, `ret`
  - Value can be read through the stack (saved IP)
The `EFLAGS` register holds the status flags that conditional instructions use to make control-flow decisions.
x86 Fundamental data types:
- Byte: 8 bits
- Word: 2 bytes (16 bits)
- Doubleword: 4 bytes (32 bits)
- Quadword: 8 bytes (64 bits)
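The sizes above also explain the register names used in these notes: `AX` is the low word of `EAX`, and `AL` is its low byte. The following Python sketch models that relationship with bit masks. The names `SIZES`, `max_unsigned`, `write_al`, and `write_ax` are hypothetical helpers for illustration only.

```python
SIZES = {"byte": 1, "word": 2, "doubleword": 4, "quadword": 8}  # sizes in bytes


def max_unsigned(type_name):
    """Largest unsigned value that fits in the given data type."""
    return (1 << (8 * SIZES[type_name])) - 1


def write_al(eax, value):
    """Model `mov al, value`: only the low 8 bits of EAX change."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


def write_ax(eax, value):
    """Model `mov ax, value`: only the low 16 bits of EAX change."""
    return (eax & 0xFFFF0000) | (value & 0xFFFF)


eax = 0x12345678
after_al = write_al(eax, 0xFF)      # 0x123456FF
after_ax = write_ax(eax, 0xBEEF)    # 0x1234BEEF
```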
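Moving the value 0 (an immediate) into register `EAX`:

- `mov eax, 0h` (`0h` means 0 in hexadecimal)

Moving the value 0 into the memory location at address `EBX + 4`:

- `mov [ebx+4h], 0h`

Most assemblers require an explicit operand size for this form (for example `dword`), since neither operand implies one.

A value cannot be moved directly from one memory location to another: it has to pass through the CPU by using a register.

- `mov eax, [ebx]`
- `mov eax, [ebx + 4h]`
- `mov eax, [edx + ebx*4 + 8]`

Basic instructions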
Instruction = opcode + operand
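An operand can be an immediate, a register, or a memory reference in square brackets. As a rough illustration, the Python sketch below models how a memory operand such as `[edx + ebx*4 + 8]` resolves to an address, and the difference between loading the value at that address (`mov`) and taking the address itself (`lea`). Memory is modeled as a `bytearray`; `effective_address` and `load_dword` are hypothetical names introduced for this sketch.

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    """Address named by [base + index*scale + disp], wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only allows scale factors 1, 2, 4, or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF


def load_dword(memory, address):
    """Read the 4-byte little-endian value stored at `address`."""
    return int.from_bytes(memory[address:address + 4], "little")


# A tiny fake memory with one doubleword stored at address 0x20.
memory = bytearray(0x40)
memory[0x20:0x24] = (0xDEADBEEF).to_bytes(4, "little")

edx, ebx = 0x10, 2

# lea eax, [edx + ebx*4 + 8]  -> EAX receives the address
lea_result = effective_address(base=edx, index=ebx, scale=4, disp=8)

# mov eax, [edx + ebx*4 + 8]  -> EAX receives the value stored there
mov_result = load_dword(memory, lea_result)
```

Here `lea_result` is `0x20` and `mov_result` is `0xDEADBEEF`: same operand, but one instruction yields the pointer and the other dereferences it.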
Most important:
- Data transfer: `mov`, `push`, `pop`, `xchg`, `lea`
- Integer arithmetic: `add`, `sub`, `mul`, `imul`, `div`, `idiv`, `inc`, `dec`
- Logical operators: `and`, `or`, `not`, `xor`
- Control transfer: `jmp`, `jne`, `call`, `ret`
- And many more…
- `mov destination, source`: copies the source into the destination.
  - `mov eax, ebx`
  - `mov eax, 0FFFFFFFFh`
  - `mov ax, bx`
  - `mov [eax], ecx`
  - `mov [eax], [ecx]` is NOT POSSIBLE (memory to memory)
  - `mov al, 0FFh`
- `lea destination, source`: stores the pointer to the memory operand (its address), not the value.
- `add destination, source`: `dest <- dest + source`
- `sub destination, source`: `dest <- dest - source`
- `mul source`: one of the operands is implied (it can be `AL`, `AX`, or `EAX`) and the destination can be `AX`, `DX:AX`, or `EDX:EAX` (the result may occupy two registers).
- `div divisor`: the dividend is implied (for a 32-bit divisor it is in `EDX:EAX`).
- `cmp op1, op2`: computes `op1 - op2` and sets the flags; the result itself is discarded.
- `test op1, op2`: computes `op1 & op2` and sets the flags; the result itself is discarded.
- `j<cc> address`: conditional jump. Reference: http://www.unixwiz.net/techtips/x86-jumps.html
- `jmp address`: unconditional jump.
- `nop`: no operation, execution just moves to the next instruction.
- `int value`: software interrupt; `value` is the interrupt number.
- `push immediate` (or register): stores the immediate or register value at the top of the stack, decrementing `ESP` by the operand size.
- `pop destination`: loads the value at the top of the stack into the destination and increases `ESP` by the operand size.
- `call`: pushes the address of the next instruction (the return address, not the address of the called function) onto the stack and moves the address of the first instruction of the callee into `EIP`.
- `ret`: the opposite of `call`. It restores the return address saved by `call` from the top of the stack, which is conceptually equivalent to `pop eip`.
- `leave`: restores the caller's base pointer. It is equivalent to `mov esp, ebp` followed by `pop ebp`, so it effectively "deletes" the function's frame.
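The implied operands of `mul` and `div` in the list above are easy to get wrong, so here is a minimal Python sketch of the 32-bit forms. It assumes the usual behavior that `div` leaves the quotient in `EAX` and the remainder in `EDX`, and that the CPU raises a divide error when the divisor is zero or the quotient does not fit. The function names `mul32` and `div32` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax, source):
    """Model 32-bit `mul source`: EDX:EAX <- EAX * source (unsigned)."""
    product = (eax & MASK32) * (source & MASK32)
    return product >> 32, product & MASK32   # (new EDX, new EAX)


def div32(edx, eax, divisor):
    """Model 32-bit `div divisor`: divide EDX:EAX by the divisor (unsigned)."""
    divisor &= MASK32
    if divisor == 0:
        raise ZeroDivisionError("the CPU would raise a divide error")
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in EAX (divide error)")
    return quotient, remainder               # (new EAX, new EDX)


# The product needs 33 bits, so the top bit spills into EDX.
edx, eax = mul32(0xFFFFFFFF, 2)              # EDX = 1, EAX = 0xFFFFFFFE

# Dividing the 64-bit value back by 2 recovers the original operand.
quotient, remainder = div32(edx, eax, 2)     # 0xFFFFFFFF, 0
```

A practical consequence: before a 32-bit `div`, `EDX` must hold the high half of the dividend (often zero), otherwise the quotient may overflow.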
Endianness
Endianness refers to the order in which bytes of a data word are stored in memory.
Figure: big endian (left) and little endian (right).
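The two byte orders can be compared directly with a short Python sketch. x86 is a little-endian architecture, so the least significant byte of a value is stored at the lowest address. The helper name `dword_layout` is hypothetical and only serves this illustration.

```python
def dword_layout(value, byteorder):
    """Bytes of a doubleword, listed from the lowest address to the highest."""
    return [f"{b:02x}" for b in value.to_bytes(4, byteorder)]


value = 0x12345678

little = dword_layout(value, "little")   # ['78', '56', '34', '12'] (x86 order)
big = dword_layout(value, "big")         # ['12', '34', '56', '78']

# Reading the little-endian bytes back gives the original value.
restored = int.from_bytes(bytes.fromhex("78563412"), "little")
```

This is why a value such as `0x12345678` shows up as `78 56 34 12` when memory is dumped byte by byte in a debugger on x86.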

Program Layout and Functions STACK

- PE (Portable Executable) is used by Microsoft binary executables, while ELF is the common binary format for Unix, Linux, FreeBSD, and others.
- In both cases, the focus is on how the executable is mapped into memory rather than how it is organized on disk.
How is an executable mapped into memory on Linux (ELF)?
| Section | Description |
|---|---|
| .plt | This section holds stubs that are responsible for linking external functions. |
| .text | This section holds the "text," or executable instructions, of a program. |
| .rodata | This section holds read-only data that contributes to the program's memory image. |
| .data | This section holds initialized data that contributes to the program's memory image. |
| .bss | This section holds uninitialized data that contributes to the program's memory image. By definition, the system initializes the data with zeros when the program begins to run. |
| .debug | This section holds information for symbolic debugging. |
| .init | This section holds executable instructions that contribute to the process initialization code. That is, when a program starts to run, the system arranges to execute the code in this section before calling the main program entry point (called main for "C" programs). |
| .got | This section holds the global offset table. |
The stack and the heap are used as usual. The stack pointer is the `ESP` register, and the stack grows towards lower addresses.
`EIP` is the x86 "Extended Instruction Pointer" register: it holds the address of the next instruction to execute. Remember that we can't read or set `EIP` directly.
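The interaction between `ESP` and `EIP` can be sketched with a toy machine in Python. This is a simplified model under stated assumptions: every stack slot is 4 bytes, memory is a dictionary from address to value, and the class name `Machine` and all addresses are made up for the example.

```python
class Machine:
    """Toy model of ESP, EIP, and a downward-growing stack."""

    def __init__(self, stack_top=0x1000):
        self.memory = {}       # address -> 32-bit value
        self.esp = stack_top   # stack pointer
        self.eip = 0           # instruction pointer

    def push(self, value):
        self.esp -= 4                          # stack grows to lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory[self.esp]
        self.esp += 4
        return value

    def call(self, target, next_instruction):
        self.push(next_instruction)            # save the return address
        self.eip = target                      # jump to the callee

    def ret(self):
        self.eip = self.pop()                  # conceptually `pop eip`


m = Machine()
m.eip = 0x401000
m.push(0xAAAA)                                 # ESP: 0x1000 -> 0x0FFC
esp_after_push = m.esp

m.call(target=0x402000, next_instruction=0x401005)
esp_in_callee, eip_in_callee = m.esp, m.eip    # 0x0FF8, 0x402000

m.ret()                                        # EIP is restored from the stack
```

Although `EIP` cannot be written with `mov`, the value that `ret` loads into it lives on the stack as ordinary data, which is what the "saved IP" remark earlier in these notes refers to.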
The concept of a stack frame refers to the stack area allocated to a function: each called function has its own area on the stack dedicated to its local variables.
To refer to these variables we use `EBP`, which is called the "base pointer" because it points to the start of the function's frame.
So `EBP` is used to access local variables easily: they are stored in the stack frame at lower addresses than `EBP` (negative offsets).
Depending on the calling convention, `EBP` may also be used to access function arguments, which are at higher addresses than `EBP` (positive offsets).
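The offsets described above can be traced with a small Python simulation. It assumes a stack-based convention with arguments pushed right to left and the common prologue `push ebp` / `mov ebp, esp` / `sub esp, N`; the prologue is not spelled out in these notes, and all addresses and values are invented for the example.

```python
def push(state, value):
    """Decrement ESP by 4 and store the value at the new top of the stack."""
    state["esp"] -= 4
    state["mem"][state["esp"]] = value


def pop(state):
    """Read the value at the top of the stack and increment ESP by 4."""
    value = state["mem"][state["esp"]]
    state["esp"] += 4
    return value


state = {"esp": 0x1000, "ebp": 0x2000, "mem": {}}

# Caller: push the arguments, then `call` pushes the return address.
push(state, 20)            # second argument
push(state, 10)            # first argument
push(state, 0x401005)      # return address

# Callee prologue: push ebp / mov ebp, esp / sub esp, 8
push(state, state["ebp"])  # save the caller's base pointer
state["ebp"] = state["esp"]
state["esp"] -= 8          # room for local variables
state["mem"][state["ebp"] - 4] = 30   # a local variable at [ebp-4]

# Inside the function, everything is addressed relative to EBP.
saved_ebp = state["mem"][state["ebp"]]            # [ebp]
return_address = state["mem"][state["ebp"] + 4]   # [ebp+4]
first_arg = state["mem"][state["ebp"] + 8]        # [ebp+8]
second_arg = state["mem"][state["ebp"] + 12]      # [ebp+12]
local = state["mem"][state["ebp"] - 4]            # [ebp-4]

# leave: mov esp, ebp / pop ebp
state["esp"] = state["ebp"]
state["ebp"] = pop(state)
```

After the simulated `leave`, `ESP` points at the return address again and `EBP` holds the caller's value, so a following `ret` would resume the caller.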
Calling conventions
- Calling conventions determine the mechanism for passing parameters, either through the stack, registers, or both.
- They also define who is responsible for cleaning up the parameters.
- Additionally, they specify how values are returned from functions.
- Lastly, calling conventions determine which registers are saved by the caller and which ones are saved by the callee.
- Example (`fastcall`): up to two parameters are passed in registers (`ECX` and `EDX`); the others are pushed onto the stack.
- The return value is placed in the `EAX` register.
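As a rough sketch of the parameter-passing rule above, the following Python function splits a list of arguments between registers and the stack. It assumes that stack arguments are pushed right to left, which is the usual behavior for register-plus-stack conventions of this kind but is not stated in these notes. The function name `place_arguments` is hypothetical.

```python
def place_arguments(args, register_names=("ecx", "edx")):
    """Split arguments into register assignments and a stack push order.

    The first arguments go into the given registers; the remaining ones
    are pushed right to left (assumption), so the last argument is pushed
    first.
    """
    registers = dict(zip(register_names, args))
    stack_args = list(args[len(register_names):])
    push_order = list(reversed(stack_args))
    return registers, push_order


# Four arguments: two travel in registers, two on the stack.
registers, push_order = place_arguments([1, 2, 3, 4])
# registers  -> {'ecx': 1, 'edx': 2}
# push_order -> [4, 3]

# With no registers available, every argument goes on the stack.
stack_only = place_arguments([1, 2, 3], register_names=())
```

Passing an empty `register_names` models a purely stack-based convention, which makes the difference between the two schemes easy to compare.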
To debug we use `gdb <name>`, together with pwndbg, a GDB plug-in that makes debugging with GDB suck less, with a focus on features needed by low-level software developers, hardware hackers, reverse engineers, and exploit developers.
GitHub - pwndbg/pwndbg: Exploit Development and Reverse Engineering with GDB Made Easy
We also look at IDA and Ghidra.