10. How does my computer store things on disk?

When you look at a hard disk under Unix, you see a tree of named directories and files. Normally you won't need to look any deeper than that, but it does become useful to know what's going on underneath if you have a disk crash and need to try to salvage files. Unfortunately, there's no good way to describe disk organization from the file level downwards, so I'll have to describe it from the hardware up.

10.1. Low-level disk and file system structure

The surface area of your disk, where it stores data, is divided up something like a dartboard -- into circular tracks which are then pie-sliced into sectors. Because tracks near the outer edge have more area than those close to the spindle at the center of the disk, the outer tracks have more sector slices in them than the inner ones. Each sector (or disk block) has the same size, which under modern Unixes is generally 1 binary K (1024 8-bit words). Each disk block has a unique address or disk block number.

Unix divides the disk into disk partitions. Each partition is a continuous span of blocks that's used separately from any other partition, either as a file system or as swap space. The original reasons for partitions had to do with crash recovery in a world of much slower and more error-prone disks; the boundaries between them reduce the fraction of your disk likely to become inaccessible or corrupted by a random bad spot on the disk. Nowadays, it's more important that partitions can be declared read-only (preventing an intruder from modifying critical system files) or shared over a network through various means we won't discuss here. The lowest-numbered partition on a disk is often treated specially, as a boot partition where you can put a kernel to be booted.

Each partition is either swap space (used to implement virtual memory or a file system used to hold files. Swap-space partitions are just treated as a linear sequence of blocks. File systems, on the other hand, need a way to map file names to sequences of disk blocks. Because files grow, shrink, and change over time, a file's data blocks will not be a linear sequence but may be scattered all over its partition (from wherever the operating system can find a free block when it needs one).

10.2. File names and directories

Within each file system, the mapping from names to blocks is handled through a structure called an i-node. There's a pool of these things near the ``bottom'' (lowest-numbered blocks) of each file system (the very lowest ones are used for housekeeping and labeling purposes we won't describe here). Each i-node describes one file. File data blocks live above the inodes (in higher-numbered blocks).

Every i-node contains a list of the disk block numbers in the file it describes. (Actually this is a half-truth, only correct for small files, but the rest of the details aren't important here.) Note that the i-node does not contain the name of the file.

Names of files live in directory structures. A directory structure just maps names to i-node numbers. This is why, in Unix, a file can have multiple true names (or hard links); they're just multiple directory entries that happen to point to the same inode.