I/O Systems and Controllers
Learning Goals
- Explain the role of device controllers and how they offload low-level hardware management from the CPU.
- Differentiate between programmed I/O, interrupt-driven I/O, and Direct Memory Access (DMA).
- Trace the full lifecycle of an I/O request from user application to device hardware and back.
- Understand the layered architecture of device drivers in the Linux/Windows kernel.
I/O Hardware: The CPU's View of Devices
The CPU does not interact directly with devices like keyboards, disks, or printers. Instead, each device is managed by a Device Controller — a specialized electronic circuit (on the motherboard or the device itself) that handles low-level hardware details.
Device Controllers
A device controller is responsible for:
- Converting the byte stream from the OS into the electrical signals required by the device.
- Managing the device's internal buffer (e.g., the 512-byte sector buffer on a hard drive).
- Reporting status (busy, ready, error) via status registers.
- Accepting commands (read, write, format) via command registers.
| Component | Purpose |
|---|---|
| Data Register | Holds data being transferred to/from the device (e.g., a byte for a serial port) |
| Status Register | Contains flags: busy, ready, error, interrupt-enabled |
| Command Register | Tells the controller what operation to perform (e.g., READ_SECTOR, WRITE_SECTOR) |
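The three registers above can be modelled in a few lines of C. This is only a sketch under stated assumptions: the bit positions, the command codes, and the names `controller_regs`, `controller_is_ready`, and `controller_issue` are hypothetical and do not describe any real controller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status-register bit layout (not taken from a real device). */
enum {
    STATUS_BUSY        = 1u << 0,
    STATUS_READY       = 1u << 1,
    STATUS_ERROR       = 1u << 2,
    STATUS_IRQ_ENABLED = 1u << 3
};

/* Hypothetical command codes. */
enum { CMD_READ_SECTOR = 0x20, CMD_WRITE_SECTOR = 0x30 };

/* Software model of one controller's register set. */
typedef struct {
    uint8_t data;     /* Data Register */
    uint8_t status;   /* Status Register */
    uint8_t command;  /* Command Register */
} controller_regs;

/* Returns 1 if the controller is ready and not busy, otherwise 0. */
int controller_is_ready(const controller_regs *regs) {
    assert(regs != NULL);
    return (regs->status & STATUS_READY) != 0 &&
           (regs->status & STATUS_BUSY) == 0;
}

/* Writes a command and marks the controller busy, as the hardware would.
   Returns 0 on success, or -1 if the controller cannot accept a command. */
int controller_issue(controller_regs *regs, uint8_t command) {
    if (!controller_is_ready(regs)) {
        return -1;
    }
    regs->command = command;
    regs->status = (uint8_t)((regs->status | STATUS_BUSY) & ~STATUS_READY);
    return 0;
}
```

A driver would check `controller_is_ready` before every `controller_issue`; a second command issued while the first is still in progress is rejected.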
Memory-Mapped I/O vs Port-Mapped I/O
| Feature | Memory-Mapped I/O (MMIO) | Port-Mapped I/O (PMIO) |
|---|---|---|
| How it works | Device registers are mapped into the CPU's memory address space | Device registers accessed via special I/O instructions (IN, OUT in x86) |
| Advantage | No special instructions needed — use regular memory read/write | Does not consume memory address space |
| Used by | Modern systems (ARM, x86 memory-mapped PCI config space) | Legacy x86 devices, embedded systems |
| Example | Reads in the region starting at 0xFEE00000 (the default local APIC base on x86) access local APIC registers; the I/O APIC sits at 0xFEC00000 by default | IN AL, 0x60 reads keyboard scancode |
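With MMIO, a register access is just a load or store through a pointer. The sketch below shows the usual C idiom: a `volatile` access, so the compiler cannot cache or drop it. The helper names `mmio_read32` and `mmio_write32` are assumptions for illustration. On real hardware the base address would come from the platform or from an OS mapping call (for example `ioremap` in a Linux driver), and an ordinary array can stand in for it when testing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads the 32-bit register located `offset` bytes past `base`.
   `volatile` forces a real load on every call. */
uint32_t mmio_read32(const volatile void *base, size_t offset) {
    assert(base != NULL && offset % 4 == 0);
    return *(const volatile uint32_t *)((const volatile uint8_t *)base + offset);
}

/* Writes the 32-bit register located `offset` bytes past `base`. */
void mmio_write32(volatile void *base, size_t offset, uint32_t value) {
    assert(base != NULL && offset % 4 == 0);
    *(volatile uint32_t *)((volatile uint8_t *)base + offset) = value;
}
```

Port-mapped I/O has no equivalent in portable C: it needs the `IN`/`OUT` instructions, usually reached through compiler intrinsics or inline assembly.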
Three I/O Techniques: From Polling to DMA
Historically, three techniques have been used to perform I/O. Each trades simplicity for efficiency.
1. Programmed I/O (Polling)
The CPU does all the work — it continuously checks the device's status register to see if it is ready.
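```c
// Example: reading a byte from a serial port via polling.
// inb() and the STATUS_PORT, DATA_PORT, and BUSY constants are
// platform-specific placeholders.
while ((inb(STATUS_PORT) & BUSY) != 0) {
    // Spin: wait for the device to become ready
}
uint8_t data = inb(DATA_PORT);  // Read the data byte
```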
| Pros | Cons |
|---|---|
| Simple to implement | CPU is wasted — it spins in a loop instead of doing useful work |
| No special hardware needed | Polling frequency must be high enough to avoid missing data |
| Works for very simple/slow devices | Cannot handle high-speed devices like disks efficiently |
2. Interrupt-Driven I/O
The CPU issues a command and then continues executing other processes. The device sends an interrupt when the operation is complete.
```c
// Example: reading a sector from disk via interrupts.
// outb(port, value), inb(port), and the port constants are placeholders.

// Step 1: Specify the sector, then issue the command
outb(LBA_PORT, sector_number);
outb(COMMAND_PORT, READ_SECTOR);

// Step 2: The CPU is free to do other work
// ... process scheduling, computation, etc. ...

// Step 3: When the disk finishes, it raises an interrupt
// and the interrupt handler runs:
static uint8_t data[512];  // Static so the data outlives the handler

void disk_interrupt_handler(void) {
    for (int i = 0; i < 512; i++) {
        data[i] = inb(DATA_PORT);
    }
    // Signal the waiting process that data is ready
}
```
| Pros | Cons |
|---|---|
| CPU is not wasted — processes execute during I/O | Interrupt overhead — saving/restoring context for every I/O operation |
| Good for slow/medium devices (keyboard, network) | High-speed devices (disk, SSD) generate thousands of interrupts per second |
3. Direct Memory Access (DMA)
For high-speed devices, even interrupt-driven I/O is too slow because the CPU must still copy data between the device buffer and memory byte-by-byte. DMA solves this by letting the device controller transfer data directly to/from RAM without CPU involvement.
How DMA Works (Step by Step):
- The CPU programs the DMA controller with: source address (device buffer), destination address (RAM buffer), and transfer size.
- The CPU is free to execute other processes while the DMA controller manages the transfer.
- The DMA controller takes control of the system bus and transfers data word-by-word from device to RAM (or RAM to device).
- When the full transfer is done, the DMA controller raises a single interrupt to signal completion.
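The steps above can be sketched as a small simulation. Everything here is a stand-in: `dma_controller`, `dma_program`, and `dma_run` are hypothetical names, and `dma_run` is an ordinary function call, whereas real DMA hardware moves the data in parallel with the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software model of a DMA controller's programmable state. */
typedef struct {
    const uint8_t *source;       /* device buffer */
    uint8_t *destination;        /* RAM buffer */
    size_t size;                 /* bytes to transfer */
    unsigned interrupts_raised;  /* completion interrupts so far */
} dma_controller;

/* The CPU programs source, destination, and transfer size. */
void dma_program(dma_controller *dma, const uint8_t *source,
                 uint8_t *destination, size_t size) {
    assert(dma != NULL && source != NULL && destination != NULL);
    dma->source = source;
    dma->destination = destination;
    dma->size = size;
    dma->interrupts_raised = 0;
}

/* The controller moves the whole block, then raises a single
   interrupt. The CPU copies no data itself. */
void dma_run(dma_controller *dma) {
    assert(dma != NULL);
    memcpy(dma->destination, dma->source, dma->size);
    dma->interrupts_raised++;
}
```

For a 512-byte sector this model raises one interrupt, where the byte-at-a-time interrupt-driven scheme would need the CPU to move all 512 bytes itself.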
Comparison of I/O Techniques:
| Aspect | Programmed I/O | Interrupt-Driven I/O | DMA |
|---|---|---|---|
| Data path | Device → CPU register → RAM (CPU polls, then copies) | Device → CPU register → RAM (CPU copies after each interrupt) | Device → RAM (direct) |
| CPU efficiency | Very low (busy-waiting) | Moderate (interrupt per transfer) | High (one interrupt per block) |
| Best for | Simple/slow devices (PS/2 keyboard) | Medium-speed devices (network card) | High-speed devices (disk, SSD, GPU) |
| Overhead | CPU spins continuously | Context switch per interrupt | DMA controller setup + one interrupt |
[Figure: Direct Memory Access (DMA), how it works]
Interrupt Handlers: The Kernel's Urgent Response
An interrupt is a hardware signal that tells the CPU that an event requiring immediate attention has occurred. When a device raises an interrupt, the CPU must stop its current execution, handle the device's needs, and then resume.
Interrupt Vector Table (IVT) / Interrupt Descriptor Table (IDT)
Every interrupt type is assigned a unique number (the interrupt vector). The CPU uses this number as an index into a table of handler addresses:
- x86 Real Mode: IVT at address 0x0000:0000 (256 entries × 4 bytes = 1024 bytes).
- x86 Protected Mode: IDT with up to 256 entries, each 8 bytes (16 bytes in 64-bit long mode); its location is given by the IDTR register.
| Vector Range | Type | Examples |
|---|---|---|
| 0–31 | Exceptions (internal CPU events) | Divide by zero (0), Page fault (14), General protection fault (13) |
| 32–255 | Interrupts (external device events) | Timer (IRQ0), Keyboard (IRQ1), Disk (IRQ14) |
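The vector lookup amounts to indexing an array of handler addresses. The sketch below models that with C function pointers. The names `vector_table`, `register_handler`, `dispatch`, and `timer_handler` are hypothetical, and a real CPU performs the lookup in hardware rather than through a function call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VECTOR_COUNT 256

typedef void (*interrupt_handler)(void);

/* One slot per vector; NULL means no handler is installed. */
static interrupt_handler vector_table[VECTOR_COUNT];

/* Real-mode IVT: entries are 4 bytes, so entry n starts at address n * 4. */
uint32_t ivt_entry_address(uint8_t vector) {
    return (uint32_t)vector * 4u;
}

void register_handler(uint8_t vector, interrupt_handler handler) {
    vector_table[vector] = handler;
}

/* Uses the vector as an index and calls the handler.
   Returns 1 if a handler ran, 0 if the slot was empty. */
int dispatch(uint8_t vector) {
    interrupt_handler handler = vector_table[vector];
    if (handler == NULL) {
        return 0;
    }
    handler();
    return 1;
}

/* Example handler: counts timer ticks. */
int timer_ticks = 0;
void timer_handler(void) {
    timer_ticks++;
}
```

Registering `timer_handler` at vector 32 follows the common convention of placing IRQ0 just above the 32 exception vectors.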
Maskable vs Non-Maskable Interrupts
| Type | Can the CPU ignore it? | Example |
|---|---|---|
| Maskable | Yes — if interrupts are disabled (cli instruction) | Timer interrupt, disk interrupt |
| Non-Maskable (NMI) | No — always handled immediately | Hardware failure, memory ECC error, watchdog timer |
Top Half vs Bottom Half
Modern OS kernels split interrupt handling into two parts to minimize the time interrupts are disabled:
| Part | Description | Runs with interrupts? |
|---|---|---|
| Top Half (Hardware Interrupt Handler) | Acknowledges the interrupt, saves minimal data, schedules the bottom half. Must be extremely fast. | Interrupts disabled — runs immediately |
| Bottom Half (Softirq / Tasklet / Work Queue) | Performs the heavy processing (e.g., copying data to a user buffer, waking up waiting processes). Deferred so that it does not hold up other interrupts. | Interrupts enabled; runs later |
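A minimal sketch of the split, assuming a fixed-size ring buffer as the hand-off between the two halves. The names `work_queue`, `top_half`, and `bottom_half` are hypothetical. A real kernel would use its own deferred-work mechanisms and would have to protect the queue against concurrent access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 16

/* Ring buffer shared by the two halves. */
typedef struct {
    uint8_t items[QUEUE_CAPACITY];
    size_t head;   /* next item to process */
    size_t tail;   /* next free slot */
    size_t count;  /* items currently queued */
} work_queue;

/* Top half: store the raw byte and return as quickly as possible.
   Returns 0 on success, -1 if the queue is full and the byte is dropped. */
int top_half(work_queue *q, uint8_t raw_byte) {
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY) {
        return -1;
    }
    q->items[q->tail] = raw_byte;
    q->tail = (q->tail + 1) % QUEUE_CAPACITY;
    q->count++;
    return 0;
}

/* Bottom half: runs later and drains queued bytes into `out`.
   Returns the number of bytes delivered. */
size_t bottom_half(work_queue *q, uint8_t *out, size_t out_len) {
    assert(q != NULL && out != NULL);
    size_t delivered = 0;
    while (q->count > 0 && delivered < out_len) {
        out[delivered++] = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
    }
    return delivered;
}
```

The top half does constant work per interrupt, while the bottom half's cost grows with the backlog. That is why the bottom half is the part that runs with interrupts enabled.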
Device Drivers: The Kernel's Device Abstraction
A device driver is a kernel module that understands the specific protocol of a device controller and presents a uniform interface to the rest of the OS.
The Layered I/O Architecture
Key Insight: The VFS allows user applications to issue read() and write() calls without knowing whether the underlying device is a hard disk, SSD, USB drive, or network file system. The driver handles the conversion.
Driver Characteristics
| Aspect | User-Space Driver | Kernel-Space Driver |
|---|---|---|
| Location | Runs in user mode as a separate process | Runs in kernel mode, loaded into kernel space |
| Example | FUSE (Filesystem in Userspace) — e.g., sshfs | Linux SCSI/NVMe drivers, Windows WDDM |
| Advantage | Crash doesn't bring down the OS | Maximum performance, direct hardware access |
| Disadvantage | Slower — context switches required per call | Bug can crash the entire system |
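The "uniform interface" is commonly built as a table of function pointers that each driver fills in. Linux's `struct file_operations` follows this pattern. The sketch below is a much reduced illustration: `device_ops`, `device`, `dev_read`, `dev_write`, and the in-memory `ram_ops` driver are hypothetical names, not a real kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct device device;

/* The uniform interface: every driver supplies these operations. */
typedef struct {
    long (*read)(device *dev, char *buf, size_t len);
    long (*write)(device *dev, const char *buf, size_t len);
} device_ops;

struct device {
    const device_ops *ops;  /* driver-specific implementation */
    char storage[64];       /* backing store for the toy RAM driver */
    size_t used;            /* valid bytes in storage */
};

/* A toy driver whose "device" is a 64-byte block of memory. */
static long ram_read(device *dev, char *buf, size_t len) {
    size_t n = len < dev->used ? len : dev->used;
    memcpy(buf, dev->storage, n);
    return (long)n;
}

static long ram_write(device *dev, const char *buf, size_t len) {
    size_t n = len < sizeof dev->storage ? len : sizeof dev->storage;
    memcpy(dev->storage, buf, n);
    dev->used = n;
    return (long)n;
}

const device_ops ram_ops = { ram_read, ram_write };

/* Generic layer: callers never learn which driver is underneath. */
long dev_read(device *dev, char *buf, size_t len) {
    assert(dev != NULL && dev->ops != NULL);
    return dev->ops->read(dev, buf, len);
}

long dev_write(device *dev, const char *buf, size_t len) {
    assert(dev != NULL && dev->ops != NULL);
    return dev->ops->write(dev, buf, len);
}
```

Supporting another device means supplying another `device_ops` table; `dev_read` and `dev_write` stay unchanged.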
The Lifecycle of a read() System Call — From User Space to Disk
- Step 1: A user-space process calls `read(fd, buffer, 512)`, where `fd` is an open file descriptor. The C library translates this into a system call (e.g., `sys_read` on Linux, entered via `int 0x80` or the `syscall` instruction). The CPU switches from User Mode (Ring 3) to Kernel Mode (Ring 0).
- Step 2: The kernel's VFS layer receives the request. It uses the file descriptor to find the open file and the file system (ext4, NTFS, etc.) it belongs to, checks permissions, and works out which logical block(s) of the file the request covers.
- Step 3: The file system (e.g., ext4) looks up the file's inode to find which physical disk blocks hold the data. If the data is not in the page cache, the file system issues a block I/O request to the generic block layer.
- Step 4: The block I/O layer receives the request and places it in the I/O scheduler queue (e.g., CFQ, Deadline, or NOOP on older Linux kernels; mq-deadline, BFQ, or Kyber on current ones). The scheduler may reorder requests to optimize disk head movement (merging adjacent blocks, sorting by sector number).
- Step 5: The block layer passes the request to the device driver (e.g., `ahci` for SATA, `nvme` for NVMe). The driver programs the DMA transfer: source = disk sector, destination = kernel buffer, size = 512 bytes. It writes the command to the device controller's command register and returns.
- Step 6: The disk controller reads the sector from the spinning platter (or NAND flash) and transfers the 512 bytes via DMA directly into the kernel buffer in RAM. When the transfer completes, the controller raises an interrupt. The interrupt handler marks the I/O as complete, the waiting process is woken up, and the data is copied from the kernel buffer to the user-space buffer. The `read()` call returns.
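From the application's side, the whole lifecycle hides behind one call. The sketch below uses the POSIX `open`, `read`, and `close` functions; the wrapper name `read_prefix` is an assumption for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads up to `len` bytes from the start of the file at `path`.
   Returns the number of bytes read, or -1 on error. */
ssize_t read_prefix(const char *path, void *buffer, size_t len) {
    assert(path != NULL && buffer != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    /* This one call traps into the kernel and triggers the steps above;
       it returns once the data has been copied into `buffer`. */
    ssize_t n = read(fd, buffer, len);
    close(fd);
    return n;
}
```

If the requested blocks are already in the page cache, the kernel can answer without touching the disk, so the same call may return far sooner.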
SPOOLING: Sharing Exclusive Devices [2023 Q1c]
SPOOLING stands for Simultaneous Peripheral Operations On-Line. It is a technique that makes an exclusive (non-sharable) device appear sharable by buffering its output to a high-speed storage device (usually disk).
| Aspect | Detail |
|---|---|
| Full Form | Simultaneous Peripheral Operations On-Line |
| Mechanism | Instead of each process directly writing to a slow device (like a printer), processes write to a high-speed disk buffer (the spool). A dedicated daemon (e.g., lpd for line printer) reads from the spool and writes to the physical device. |
| Analogy | A restaurant order counter: customers (processes) place orders (output) at a counter (the spool). The chef (daemon) processes orders one at a time from the counter. |
| Benefit | Multiple processes can "print" almost instantly because writing to the spool is fast. The printer still limits overall throughput, but no process blocks waiting for it. |
| Relation to Deadlocks | Spooling breaks the Mutual Exclusion condition for deadlocks (as discussed in Module 4). |
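The mechanism reduces to a FIFO queue with many producers and one consumer. The sketch below keeps the queue in memory for brevity. The names `spool`, `spool_submit`, and `spool_next` are hypothetical, and a real spooler would store jobs on disk and synchronise access between processes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SPOOL_SLOTS 8
#define JOB_MAX 32

/* FIFO of pending print jobs. */
typedef struct {
    char jobs[SPOOL_SLOTS][JOB_MAX];
    size_t head;   /* oldest job */
    size_t count;  /* jobs waiting */
} spool;

/* Called by any process. Returns at once without touching the printer.
   Returns 0 on success, -1 if the spool is full or the job is too long. */
int spool_submit(spool *s, const char *text) {
    assert(s != NULL && text != NULL);
    if (s->count == SPOOL_SLOTS || strlen(text) >= JOB_MAX) {
        return -1;
    }
    size_t slot = (s->head + s->count) % SPOOL_SLOTS;
    strcpy(s->jobs[slot], text);
    s->count++;
    return 0;
}

/* Called only by the daemon. Copies the oldest job into `out`, which must
   hold at least JOB_MAX bytes. Returns 1 if a job was taken, 0 if empty. */
int spool_next(spool *s, char *out) {
    assert(s != NULL && out != NULL);
    if (s->count == 0) {
        return 0;
    }
    strcpy(out, s->jobs[s->head]);
    s->head = (s->head + 1) % SPOOL_SLOTS;
    s->count--;
    return 1;
}
```

Only the daemon ever drives the printer, so jobs from different processes come out one at a time, in submission order.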
RAID: Redundant Array of Independent Disks [2024 Q1c]
RAID stands for Redundant Array of Independent Disks. It is a technique that combines multiple physical disk drives into a single logical unit to improve performance and/or reliability.
| Level | Description | Min Drives | Performance | Reliability |
|---|---|---|---|---|
| RAID 0 | Striping — data split across disks | 2 | Excellent read/write | None — any drive fails, all data lost |
| RAID 1 | Mirroring — exact copy on two disks | 2 | Good read, slower write | Excellent — one drive can fail |
| RAID 5 | Striping + distributed parity | 3 | Good read, moderate write | Good — one drive can fail; parity rebuilds data |
| RAID 6 | Striping + dual distributed parity | 4 | Good read, slower write | Very good — two drives can fail |
| RAID 10 | Mirroring + striping (combination) | 4 | Excellent | Excellent: survives multiple failures as long as no mirror pair loses both drives |
Key Insight: RAID 0 provides performance (parallel I/O) but zero redundancy. RAID 1 provides redundancy by duplication. RAID 5/6 provide redundancy with less space overhead by using parity calculations.
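The parity calculation used by RAID 5 is a byte-wise XOR across the data blocks of a stripe. XOR is its own inverse, so any single missing block equals the XOR of the parity block and the surviving blocks. The function names `raid5_parity` and `raid5_rebuild` are assumptions for illustration, and the sketch leaves out how parity is rotated across drives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Computes parity[i] as the XOR of byte i of every data block. */
void raid5_parity(const uint8_t *const *blocks, size_t n_blocks,
                  size_t block_len, uint8_t *parity) {
    assert(blocks != NULL && parity != NULL);
    for (size_t i = 0; i < block_len; i++) {
        parity[i] = 0;
        for (size_t d = 0; d < n_blocks; d++) {
            parity[i] ^= blocks[d][i];
        }
    }
}

/* Rebuilds one lost block from the parity block and the survivors. */
void raid5_rebuild(const uint8_t *const *survivors, size_t n_survivors,
                   const uint8_t *parity, size_t block_len,
                   uint8_t *rebuilt) {
    assert(survivors != NULL && parity != NULL && rebuilt != NULL);
    for (size_t i = 0; i < block_len; i++) {
        rebuilt[i] = parity[i];
        for (size_t d = 0; d < n_survivors; d++) {
            rebuilt[i] ^= survivors[d][i];
        }
    }
}
```

One parity block protects a whole stripe, which is why RAID 5 spends only one drive's worth of capacity on redundancy, while RAID 1 spends half of the total.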
CLV vs CAV: Disk Rotation Strategies [2023 Q8a]
| Feature | Constant Angular Velocity (CAV) | Constant Linear Velocity (CLV) |
|---|---|---|
| Rotation Speed | Disk rotates at constant RPM regardless of which track is being read | Disk rotation varies — slower for outer tracks, faster for inner tracks |
| Data Density | Every track holds the same number of sectors, so bits are packed more densely on inner tracks and outer-track capacity is wasted | Linear bit density is uniform, so outer tracks hold more sectors than inner tracks |
| Data Transfer Rate | Constant in pure CAV (same number of sectors per rotation); higher on outer tracks when combined with zoned recording | Constant across all tracks |
| Used By | Hard disk drives (HDDs); modern HDDs add zone-bit recording (ZBR), a hybrid that places more sectors in outer zones | Optical discs (CD, DVD) |
| Seek Complexity | Simple — just position the head | Complex — must also adjust rotation speed |
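The speed adjustment that CLV requires follows from geometry: one revolution at radius r covers 2πr of track, so holding the linear velocity v constant needs 60·v / (2πr) revolutions per minute. The sketch below evaluates both relationships. The function names are hypothetical, and the figures in the usage note (about 1.3 m/s, radii of 25 mm and 58 mm) are rough illustrative values for a CD.

```c
#include <assert.h>

/* CLV: rotation speed (RPM) needed to keep the track moving past the
   head at `linear_velocity_m_s` when the head is at `radius_m`. */
double clv_rpm(double linear_velocity_m_s, double radius_m) {
    assert(radius_m > 0.0);
    const double pi = 3.14159265358979323846;
    return 60.0 * linear_velocity_m_s / (2.0 * pi * radius_m);
}

/* CAV: linear velocity under the head at `radius_m` for a fixed RPM. */
double cav_linear_velocity(double rpm, double radius_m) {
    const double pi = 3.14159265358979323846;
    return rpm / 60.0 * 2.0 * pi * radius_m;
}
```

With those illustrative values, `clv_rpm(1.3, 0.025)` is a little under 500 RPM and `clv_rpm(1.3, 0.058)` is a little over 210 RPM: the disc spins faster when the head is near the centre.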
I/O Buffering and Caching
To improve I/O performance, the OS uses multiple levels of buffering and caching.
| Technique | Description | Benefit |
|---|---|---|
| Page Cache | Caches file data blocks in RAM | Subsequent reads hit RAM instead of disk |
| Buffer Cache | Caches disk blocks in RAM (separate from page cache in older kernels) | Reduces physical disk I/O |
| Double Buffering | Two buffers: one being filled by device, one being read by the process | Prevents the process from waiting when it's ready for the next block |
| Spooling | Queue of output jobs for a shared device (printer) | Allows multiple processes to 'print' without waiting — the spooler daemon manages the physical device |
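Double buffering comes down to swapping the roles of two buffers each time the device finishes a block. This is a minimal single-threaded sketch. The names `double_buffer`, `db_fill_target`, and `db_swap` and the block size are assumptions, and a real implementation would have to synchronise the swap with the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 512

/* Two buffers: the device fills one while the process reads the other. */
typedef struct {
    uint8_t buf[2][BLOCK_SIZE];
    int fill;  /* index of the buffer the device is currently filling */
} double_buffer;

/* Buffer the device should write the next block into. */
uint8_t *db_fill_target(double_buffer *db) {
    assert(db != NULL);
    return db->buf[db->fill];
}

/* Called when the device has filled its buffer. Swaps the roles and
   returns the buffer that the process may now read. */
const uint8_t *db_swap(double_buffer *db) {
    assert(db != NULL);
    const uint8_t *ready = db->buf[db->fill];
    db->fill ^= 1;
    return ready;
}
```

After each swap the device starts filling the other buffer straight away, so input and processing overlap.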
Knowledge Check
Which I/O technique allows the CPU to execute other processes while a data transfer between device and memory takes place without CPU involvement?