Process
First, let’s clarify some definitions:
- A program is a set of computer instructions stored and not currently being executed.
- A process is an instance of a program that is currently being executed and has its own isolated memory address space.
- A task is a single unit of computation; in Linux the term is often used interchangeably with “thread” or “process”. A task is made of:
  - a unique identifier called the PID
  - two stacks (one for user mode and one for kernel mode)
  - a set of processor registers, including the program counter
  - an address space, which is optional for kernel threads
A Task Control Block (TCB), also referred to as a Process Control Block (PCB), is the generic term used in OS theory for the data structure that stores all the information about a process. A TCB typically contains the following information:
- Process Identifier (PID): A unique identifier for the process.
- Process State: The current state of the process (e.g., running, waiting, blocked).
- Program Counter: The address of the next instruction to be executed.
- CPU Registers: Includes general-purpose registers, stack pointers, and program counters.
- Memory Management Information: Information about the memory allocated to the process, including base and limit registers or page tables.
- Accounting Information: Includes process execution times, user and kernel mode times, and other performance metrics.
- I/O Status Information: Information about I/O devices allocated to the process, open file descriptors, etc.
- Scheduling Information: Priority of the process, scheduling queue pointers, and other scheduling-related data.
- Other Information: Security credentials, pointers to the process’s parent process, signal handling information, etc.
In the context of Linux, the PCB is implemented as the task_struct:
task_struct attributes |
---|
State {R (Running or Runnable), S (Interruptible Sleep), D (Uninterruptible Sleep, typically disk I/O), T (Stopped), Z (Zombie)} |
PID (Process ID) |
PPID (Parent Process ID) |
mm (Memory Management) |
fs (Filesystem Context) |
files (Open Files List) |
signal (Signal Handlers) |
thread_struct (Processor Specific Context) |
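
Several of these fields can be observed from user space. The following is a minimal sketch, assuming a Linux system with the /proc filesystem mounted; the helper names `parse_status` and `describe_process` are made up for illustration, and the file only exposes a small, formatted subset of what task_struct holds.

```python
from pathlib import Path


def parse_status(text):
    """Turn the 'Key:<tab>value' lines of a /proc/<pid>/status file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore any line that has no colon
            fields[key.strip()] = value.strip()
    return fields


def describe_process(pid="self"):
    """Return a few task_struct-derived fields for a process (Linux only)."""
    status = parse_status(Path(f"/proc/{pid}/status").read_text())
    wanted = ("Name", "State", "Pid", "PPid", "Threads")
    return {key: status[key] for key in wanted if key in status}


if __name__ == "__main__":
    if Path("/proc/self/status").exists():
        print(describe_process())
    else:
        print("no /proc filesystem here; this sketch is Linux-specific")
```

The State entry printed here uses the same single-letter codes as the table above.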
When a context switch occurs (i.e., the CPU switches from executing one thread to another), the kernel uses the information in the task_struct to save the state of the current thread and restore the state of the next thread to run.
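
The save-and-restore idea can be illustrated with a toy model. This is only a sketch: `ToyTask` and `ToyCPU` are hypothetical classes invented for this example, the "registers" are two plain integers, and the real kernel performs this step in architecture-specific code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTask:
    """Hypothetical, heavily simplified stand-in for a task_struct."""
    pid: int
    state: str = "READY"
    # Saved CPU context for when the task is not running.
    saved_registers: dict = field(default_factory=lambda: {"pc": 0, "sp": 0})


class ToyCPU:
    """A pretend CPU with two registers and a pointer to the running task."""

    def __init__(self):
        self.registers = {"pc": 0, "sp": 0}
        self.current = None

    def context_switch(self, next_task):
        prev = self.current
        if prev is not None:
            # 1. Save the outgoing task's register state into its control block.
            prev.saved_registers = dict(self.registers)
            prev.state = "READY"
        # 2. Restore the incoming task's saved state onto the CPU.
        self.registers = dict(next_task.saved_registers)
        next_task.state = "RUNNING"
        self.current = next_task


cpu = ToyCPU()
a = ToyTask(pid=1, saved_registers={"pc": 100, "sp": 9000})
b = ToyTask(pid=2, saved_registers={"pc": 500, "sp": 7000})

cpu.context_switch(a)
cpu.registers["pc"] += 4      # task 1 executes one pretend instruction
cpu.context_switch(b)         # task 1's pc=104 is saved, task 2 resumes at 500
cpu.context_switch(a)         # task 1 resumes exactly where it stopped
print(cpu.registers["pc"])    # 104
```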
[Figure: state machine of a process]
Linux treats all threads as standard processes. Each thread has a unique task_struct and appears to the kernel as a normal process; threads simply happen to share resources, such as the address space, with other processes.
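
A small experiment makes the sharing visible. This is a sketch assuming a Unix-like system and Python 3.8 or newer (for `threading.get_native_id`); the variable names are illustrative. A thread's write lands in memory shared with the main thread, while a forked child only changes its own copy.

```python
import os
import threading

shared = {"value": 0}
seen = {}


def worker():
    shared["value"] += 1                       # writes memory shared with the main thread
    seen["thread_tid"] = threading.get_native_id()
    seen["thread_pid"] = os.getpid()


t = threading.Thread(target=worker)
t.start()
t.join()

pid = os.fork()
if pid == 0:
    shared["value"] += 100                     # modifies only the child's private copy
    os._exit(0)
os.waitpid(pid, 0)

print(shared["value"])                                   # 1
print(seen["thread_pid"] == os.getpid())                 # True: same process
print(seen["thread_tid"] != threading.get_native_id())   # True: its own kernel task ID
```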
System calls and exception handlers are well-defined interfaces into the kernel. A process can begin executing in kernel space only through one of these interfaces; all access to the kernel goes through them.
fork() is a system call in Linux used to create a new process. It duplicates the current process, known as the parent, to create a child process. The new task_struct of the child process is a copy of the parent’s, with differences in:
- PID (Process ID): Unique identifier for the new process.
- PPID (Parent Process ID): Set to the PID of the parent process.
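- Resources: Some resources are duplicated and others shared. For example, memory is copied lazily (copy-on-write), while the child’s file descriptors refer to the same open files as the parent’s.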
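
The PID and PPID differences can be checked directly. The sketch below uses Python's `os.fork`, a thin wrapper over the system call, and assumes a Unix-like system; `fork_and_report` is a name chosen for this example.

```python
import os


def fork_and_report():
    """Fork once; in the parent, return (child_pid, child_exit_code)."""
    parent_pid = os.getpid()
    child_pid = os.fork()          # returns 0 in the child, the child's PID in the parent

    if child_pid == 0:
        # Child: it has a new PID, and its PPID is the parent's PID.
        ok = os.getpid() != parent_pid and os.getppid() == parent_pid
        os._exit(0 if ok else 1)   # leave immediately, skipping the parent's cleanup code

    # Parent: wait for the child so it does not linger as a zombie.
    _, status = os.waitpid(child_pid, 0)
    return child_pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    pid, code = fork_and_report()
    print(f"child {pid} exited with code {code}")
```

An exit code of 0 means the child observed a fresh PID and a PPID equal to the parent's PID.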
System Initialization with systemd
Systemd is a system and service manager for Linux, operating as PID 1:
- It initializes the system, manages services, mounts filesystems, and handles clean-up.
- It replaces traditional init systems such as SysV init with a more efficient and unified approach.
- Its configuration lives in declarative files called “unit files”.
- Unit files are plain text, INI-style, encoding information about services, sockets, devices, mount points, automount points, etc.
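
To show what the INI-style layout looks like, here is a sketch that reads a hypothetical unit file with Python's `configparser`. The service name, path, and port are invented for illustration, and `configparser` is only an approximation: real unit files allow constructs (such as repeated keys) that it does not handle.

```python
import configparser

# Hypothetical unit file for an imaginary service; the path and names are made up.
UNIT_TEXT = """\
[Unit]
Description=Example demo service
After=network.target

[Service]
ExecStart=/usr/local/bin/demo-server --port 8080
Restart=on-failure

[Install]
WantedBy=multi-user.target
"""

parser = configparser.ConfigParser(interpolation=None)
parser.optionxform = str  # keep key case (ExecStart, not execstart)
parser.read_string(UNIT_TEXT)

for section in parser.sections():
    print(section, dict(parser[section]))
```

Each section groups related settings: [Unit] describes the unit and its ordering, [Service] says how to run it, and [Install] says where it hooks in when enabled.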
systemctl is a command-line tool for querying and controlling the systemd system and service manager:
- Start, stop, and query the status of services.
- Manage system resources and services effectively.
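
Status queries can also be scripted. The following sketch wraps `systemctl is-active --quiet`, which exits with status 0 when the unit is active; the function name `service_is_active` is an assumption for this example, and it returns None on machines where systemctl is not installed.

```python
import shutil
import subprocess


def service_is_active(unit):
    """Return True/False from `systemctl is-active`, or None if systemctl is unavailable."""
    if shutil.which("systemctl") is None:
        return None
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # a non-zero exit simply means "not active"
    )
    return result.returncode == 0


if __name__ == "__main__":
    print(service_is_active("ssh.service"))
```

Starting and stopping follow the same pattern with `systemctl start` and `systemctl stop`, which usually require root privileges.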