Operating System Concepts
Processes, threads, memory, scheduling, deadlocks, file systems & networking fundamentals
Core CS + Interview
A process is a running instance of a program. It has its own memory space, file descriptors, and resources.
Process States
           ┌─────────┐
create ──→ │   New   │
           └────┬────┘
                ↓ admit
           ┌─────────┐     schedule     ┌─────────────┐
           │  Ready  │ ───────────────→ │   Running   │
           └────┬────┘ ←─────────────── └──────┬──────┘
                ↑      preempt / yield         │ I/O request
                ↑                              ↓
                │                         ┌──────────┐
                └───── I/O complete ───── │ Waiting  │
                                          └──────────┘

A process terminates from the Running state:

           Running ── exit ──→ ┌────────────┐
                               │ Terminated │
                               └────────────┘
Process Control Block (PCB)
The OS maintains a PCB for each process, containing:
- PID: Unique process identifier
- State: Current process state (new, ready, running, waiting, terminated)
- Program Counter: Address of next instruction
- CPU Registers: Saved register values during context switch
- Memory info: Page tables, segment tables
- Open files: File descriptor table
- Priority: Scheduling priority
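The fields above can be sketched as a toy record. This is illustrative only; real kernels use a C struct (Linux's is called `task_struct`):

```python
from dataclasses import dataclass, field

@dataclass
class PCB:
    """Toy sketch of the PCB fields listed above (not a real kernel structure)."""
    pid: int
    state: str = "new"                  # new / ready / running / waiting / terminated
    program_counter: int = 0            # address of the next instruction
    registers: dict = field(default_factory=dict)   # saved during a context switch
    page_table: dict = field(default_factory=dict)  # memory-management info
    open_files: list = field(default_factory=list)  # file descriptor table
    priority: int = 0                   # scheduling priority

pcb = PCB(pid=42)
pcb.state = "ready"                     # admitted by the scheduler
```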
Inter-Process Communication (IPC)
| Method | Description | Use Case |
|--------|-------------|----------|
| Pipe | Unidirectional byte stream between parent/child | Shell pipes: `ls \| grep` |
| Named Pipe (FIFO) | Pipe with a name in the filesystem; works between unrelated processes | Simple IPC between processes |
| Message Queue | Structured messages via kernel-managed queue | Decoupled communication |
| Shared Memory | Multiple processes access same memory region | High-speed data sharing (fastest) |
| Socket | Network or local communication endpoint | Client-server, network apps |
| Signal | Async notification (SIGTERM, SIGKILL, etc.) | Process control (kill, stop) |
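The first row can be sketched in a few lines. This is a POSIX-only Python example (it uses `os.fork`, which is unavailable on Windows): the parent reads through an anonymous pipe what its child wrote.

```python
import os

def pipe_demo(message: bytes) -> bytes:
    """Child writes `message` into a pipe; parent reads it back."""
    r, w = os.pipe()                  # r = read end, w = write end
    pid = os.fork()
    if pid == 0:                      # child: keep only the write end
        os.close(r)
        os.write(w, message)
        os.close(w)
        os._exit(0)
    os.close(w)                       # parent: keep only the read end
    data = os.read(r, len(message))
    os.close(r)
    os.waitpid(pid, 0)                # reap the child
    return data
```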
Synchronization
When multiple threads/processes access shared resources, we need synchronization to prevent race conditions.
Critical Section Problem
A critical section is code that accesses shared resources. Requirements for a correct solution:
- Mutual Exclusion: Only one thread in the critical section at a time
- Progress: If no thread is in the CS, the choice of which waiting thread enters next cannot be postponed indefinitely
- Bounded Waiting: There is a limit on how many times other threads can enter the CS before a waiting thread gets its turn (no starvation)
Synchronization Primitives
| Primitive | Description | Use Case |
|-----------|-------------|----------|
| Mutex (Lock) | Binary lock — only the owner can unlock | Protecting shared data (one writer at a time) |
| Semaphore | Counter-based — allows N concurrent accessors | Connection pool, rate limiting |
| Condition Variable | Wait until a condition is true (used with mutex) | Producer-consumer pattern |
| Read-Write Lock | Multiple readers OR one writer | Read-heavy workloads (caches, config) |
| Spinlock | Busy-wait loop instead of sleeping | Very short critical sections (kernel) |
| Atomic operations | Hardware-level indivisible operations (CAS) | Lock-free data structures, counters |
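A minimal sketch of the first row, using Python's `threading.Lock` as the mutex: the read-modify-write `counter += 1` is a critical section, and without the lock, interleaved threads can lose updates.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n: int) -> None:
    global counter
    for _ in range(n):
        with lock:            # critical section: only one thread at a time
            counter += 1      # read, add, write back

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock, all 4 * 100_000 increments survive: counter == 400_000
```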
Classic Problems
- Producer-Consumer: Bounded buffer with semaphores — producer waits if full, consumer waits if empty
- Readers-Writers: Multiple readers can read simultaneously; writers need exclusive access
- Dining Philosophers: 5 philosophers, 5 forks — avoid deadlock with resource ordering or semaphores
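The Producer-Consumer solution above can be sketched with two counting semaphores (tracking free and filled slots) plus a mutex guarding the buffer itself:

```python
import threading
from collections import deque

BUF_SIZE = 4
buffer = deque()
empty = threading.Semaphore(BUF_SIZE)   # free slots; producer waits when 0 (buffer full)
full = threading.Semaphore(0)           # filled slots; consumer waits when 0 (buffer empty)
mutex = threading.Lock()                # protects the deque itself
results = []

def producer(items):
    for item in items:
        empty.acquire()                 # block if buffer is full
        with mutex:
            buffer.append(item)
        full.release()                  # signal: one more item available

def consumer(n):
    for _ in range(n):
        full.acquire()                  # block if buffer is empty
        with mutex:
            results.append(buffer.popleft())
        empty.release()                 # signal: one more free slot

p = threading.Thread(target=producer, args=(list(range(20)),))
c = threading.Thread(target=consumer, args=(20,))
p.start(); c.start()
p.join(); c.join()
```

With a single producer and consumer over a FIFO buffer, items arrive in order.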
Deadlocks
A deadlock occurs when two or more processes are waiting for each other to release resources, and none can proceed.
Four Necessary Conditions (Coffman)
All four must hold simultaneously for deadlock:
- Mutual Exclusion: Resource held exclusively by one process
- Hold and Wait: Process holds resource while waiting for another
- No Preemption: Resources can't be forcibly taken away
- Circular Wait: Circular chain of processes each waiting for the next's resource
Handling Deadlocks
| Strategy | How | Trade-off |
|----------|-----|-----------|
| Prevention | Break one of the four conditions (e.g., ordered resource acquisition) | Restricts system, may reduce concurrency |
| Avoidance | Banker's algorithm — check if granting resource is safe | Requires advance knowledge of max needs |
| Detection & Recovery | Build wait-for graph, detect cycles, kill a process | Overhead of detection; may lose work |
| Ignore (Ostrich) | Assume deadlocks are rare, reboot if it happens | Used in most real systems (Linux, Windows) |
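Prevention by ordered acquisition can be sketched like this: both threads need the same two locks and name them in opposite orders, but each sorts the pair by a global key (object `id` here, an arbitrary choice) before acquiring, which breaks Circular Wait:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
done = []

def worker(first, second, name):
    # Break Circular Wait: always acquire locks in one global order,
    # regardless of the order the caller passed them in.
    lo, hi = sorted((first, second), key=id)
    for _ in range(1_000):
        with lo:
            with hi:
                pass                    # critical section holding both resources
    done.append(name)

# Opposite argument orders would deadlock without the sorting step above.
t1 = threading.Thread(target=worker, args=(lock_a, lock_b, "t1"))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a, "t2"))
t1.start(); t2.start()
t1.join(); t2.join()
```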
Memory Layout of a Process
High Address ────────────────────────────
             │          Stack           │ ← grows downward (local vars, return addr)
             │            ↓             │
             │       (free space)       │
             │            ↑             │
             │           Heap           │ ← grows upward (malloc, new)
             │──────────────────────────│
             │ Uninitialized Data (BSS) │ ← global vars initialized to zero
             │ Initialized Data (.data) │ ← global vars with explicit values
             │       Text (Code)        │ ← compiled machine code (read-only)
Low Address  ────────────────────────────
Memory Allocation
| Type | Where | Lifetime | Speed |
|------|-------|----------|-------|
| Stack | Stack segment | Function scope (auto) | Very fast (pointer bump) |
| Heap | Heap segment | Until explicitly freed | Slower (malloc/free overhead) |
| Static/Global | Data/BSS segment | Entire program lifetime | No runtime cost (allocated at load time) |
Fragmentation
- External fragmentation: Free memory scattered in small blocks — total is enough but no single block fits. Fix: compaction, paging.
- Internal fragmentation: Allocated block is larger than needed — wasted space inside. Fix: smaller allocation units.
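Internal fragmentation is simple arithmetic: round the request up to whole allocation units and subtract what was actually asked for. A quick sketch:

```python
import math

def internal_fragmentation(request: int, block: int) -> int:
    """Bytes wasted when `request` is rounded up to whole `block`-sized units."""
    blocks = math.ceil(request / block)
    return blocks * block - request

# 100-byte request served from 64-byte blocks: two blocks = 128 bytes, 28 wasted
# A 10-byte allocation backed by a whole 4 KB page wastes 4086 bytes
```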
Virtual Memory
Each process gets its own virtual address space mapped to physical memory via page tables.
How It Works
- Virtual address space divided into fixed-size pages (typically 4 KB)
- Physical memory divided into frames (same size as pages)
- Page table maps virtual pages → physical frames
- TLB (Translation Lookaside Buffer): CPU cache for recent page table entries — speeds up translation
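Translation itself is just splitting the address into a page number and an offset. A toy sketch with a dict standing in for the page table (4 KB pages as above):

```python
PAGE_SIZE = 4096                    # 4 KB pages → low 12 bits are the offset

def translate(vaddr: int, page_table: dict) -> int:
    """Map a virtual address to a physical one via {virtual page -> frame}."""
    vpn = vaddr // PAGE_SIZE        # virtual page number (high bits)
    offset = vaddr % PAGE_SIZE      # offset within the page (unchanged by mapping)
    frame = page_table[vpn]         # a miss here is where a real OS takes a page fault
    return frame * PAGE_SIZE + offset

# Virtual page 1 mapped to physical frame 7:
# address 4101 (page 1, offset 5) lands at 7 * 4096 + 5 = 28677
```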
Page Fault
When a process accesses a page not in physical memory:
- CPU raises a page fault trap (an exception, not an external interrupt), transferring control to the OS
- OS checks if the access is valid (segfault if not)
- OS finds a free frame (or evicts a page using replacement algorithm)
- OS reads the page from disk into the frame
- OS updates the page table and restarts the instruction
Benefits
- Isolation: Each process has its own address space — can't corrupt other processes
- Larger than physical: Programs can use more memory than physically available (swap to disk)
- Shared memory: Multiple processes can map the same physical frame (shared libraries)
- Lazy loading: Pages loaded only when first accessed (demand paging)
Page Replacement Algorithms
When physical memory is full and a new page is needed, which page do we evict?
| Algorithm | How It Works | Pros / Cons |
|-----------|--------------|-------------|
| FIFO | Evict the oldest page | Simple. Belady's anomaly (more frames can cause more faults). |
| Optimal (OPT) | Evict page not used for longest time in future | Best possible. Impossible in practice (requires future knowledge). |
| LRU | Evict least recently used page | Close to optimal. Expensive to implement perfectly. |
| Clock (Second Chance) | FIFO with reference bit — give each page a second chance | Good approximation of LRU. Used in real systems. |
| LFU | Evict least frequently used page | Works well for stable access patterns; slow to adapt when patterns shift. |
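FIFO and Belady's anomaly can be checked in a few lines: on the classic reference string 1,2,3,4,1,2,5,1,2,3,4,5, FIFO faults 9 times with 3 frames but 10 times with 4.

```python
from collections import deque

def fifo_faults(refs, num_frames):
    """Count page faults under FIFO replacement for a reference string."""
    frames, order, faults = set(), deque(), 0
    for page in refs:
        if page in frames:
            continue                      # hit: nothing to do
        faults += 1
        if len(frames) == num_frames:     # full: evict the oldest resident page
            frames.discard(order.popleft())
        frames.add(page)
        order.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
# Belady's anomaly: adding a frame makes FIFO worse on this string
# fifo_faults(refs, 3) == 9, fifo_faults(refs, 4) == 10
```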
🔑 Thrashing: When a process spends more time swapping pages than executing. Happens when working set exceeds available frames. Fix: increase memory, reduce multiprogramming, use better page replacement.