The Linux Boot Sequence
The term "boot" is a contraction of "bootstrapping", which comes from the idea of "lifting yourself up by your bootstraps". The boot process is the execution of a series of programs, each performing one specific job and then passing control to the next. This article focuses on the boot sequence on x86 hardware, which predominates in the installed base of Linux systems. For information specific to booting Linux on PPC hardware, see The Linux Boot Sequence: PPC.
Please note that this article describes the boot sequence of Ubuntu Linux only through version 6.06 (Dapper), and is subject to considerable revision in light of the changes introduced in Edgy and later releases. As of version 6.10 (Edgy), init is being replaced by upstart. The migration is still in progress, but one major side effect is that the file /etc/inittab no longer exists from Edgy onward.
The very first step in booting is the power-on self-test (POST). A specialized chip on the motherboard exists only to hold a small program that starts the boot sequence as soon as power is applied. This program is the Basic Input/Output System (BIOS); it checks a few basic hardware components, such as the processor, memory, disk drives, and I/O ports. Configuring the hardware can be somewhat involved, and you can bring up the BIOS user interface by pressing a key early in the boot sequence (usually F1 or F2, but F10 or DEL on some systems). Control then passes from the BIOS to a program called a boot loader.
The boot loader is another small program; rather than being held in nonvolatile memory in a piece of silicon, it is stored in the first 512 bytes of the hard disk. It cannot be stored as a file, since no file system is present until the operating system is able to load. This region of the disk is therefore known as the master boot record (MBR). The BIOS uses POST to obtain the physical layout of the disk, thereby locating the MBR data.
Before flushing itself from memory, the BIOS loads the program stored in the MBR into memory, and the processor then begins executing it. The boot loader itself is independent of any particular operating system and lets you boot several systems through a rudimentary user interface, similar to that of the BIOS but simpler. Most GNU/Linux distributions ship with a choice of two boot loaders, Lilo (LInux LOader) and GRUB (GRand Unified Bootloader). Since 512 bytes leaves room for only very economical code, the boot loader's first job is simply to load enough additional code to move the process further along. Data for GRUB is located in the filesystem at /boot/grub, but part of it also sits at a fixed physical location on the disk so that GRUB can find it before any file system can be read. GRUB proceeds as follows:
- The BIOS reads the MBR from a disk, inspecting the last 2 bytes (0x55 0xAA) to verify that it is indeed an MBR.
- The BIOS starts executing the boot loader code (GRUB stage 1).
- The boot loader jumps to the location (sector number) of the next stage. This sector number is written to a particular location in the boot loader code when GRUB is installed, and usually points to stage 1.5, which is located in the first 30 kilobytes of disk space immediately following the MBR.
- Stage 1.5 is filesystem-specific code. It understands the boot filesystem, so it opens the filesystem on the partition specified at install time, looks for the stage 2 executable there, and executes it. Because stage 1.5 can read the filesystem, stage 2, the kernel, and so forth can be upgraded much more flexibly: their new locations do not need to be written into the earlier GRUB stages.
- Stage 2 contains most of the GRUB logic. When it receives control, it loads the menu.lst file and executes its statements, which normally results in a menu from which the user selects the operating system to boot. If no menu is available, or the user wants further control, GRUB has its own command prompt where the boot parameters can be specified manually. GRUB can also be set to load a particular kernel automatically after a timeout period.
- When GRUB starts booting one of the entries, it loads the kernel and the initial ramdisk into memory and starts the kernel, telling it about the ramdisk.
- The initial ramdisk locates the real root filesystem and hands it to the kernel as its root filesystem. This allows greater flexibility in the devices on which the root filesystem can reside.
- The kernel reads its root filesystem and executes /sbin/init by default. This in turn parses /etc/inittab, which usually sets up the login consoles and starts executing the scripts in /etc/init.d.
- These scripts start various subsystems, culminating in the display manager, which starts X and presents a login prompt.
The next critical priority is to prepare a file system. Most file system drivers are built directly into the kernel, but some are loaded from a RAM disk instead. The RAM disk (or initrd) is a memory-based file system that the boot loader loads before Linux is run. The Linux kernel, as its final act during boot-up, runs init. This is a program normally stored as /sbin/init. It is init, and not the kernel, which then configures the system, starts services (a more correct term for applications such as Apache) and opens virtual terminals. And this is where the real fun begins!
By default, init is located in the filesystem at /sbin/init. This is the first place the kernel looks, followed by /etc/init and /bin/init, in that order. Should all of these fail, a shell (/bin/sh) is executed instead, enabling you to recover the system. You can also specify a different location for init by passing it to the kernel with the init= boot parameter.
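The search order described above can be sketched in a few lines of Python. This is only an illustration of the fallback logic, not kernel code: the function name pick_init, the exists callback, and the override parameter are all made up for this example, and it assumes that an explicitly requested path which cannot be found simply falls back to the defaults.

```python
from typing import Callable, Iterable, Optional

# Search order described in the text, with /bin/sh as the last-resort
# recovery shell.
DEFAULT_CANDIDATES = ("/sbin/init", "/etc/init", "/bin/init", "/bin/sh")


def pick_init(
    exists: Callable[[str], bool],
    override: Optional[str] = None,
    candidates: Iterable[str] = DEFAULT_CANDIDATES,
) -> Optional[str]:
    """Return the first program that would be run as process 1, or None."""
    # A path explicitly passed to the kernel is tried before the defaults
    # (assumption: if it is missing, the defaults are tried next).
    if override is not None and exists(override):
        return override
    for path in candidates:
        if exists(path):
            return path
    return None
```

Passing os.path.exists as the exists argument checks the running system; passing the membership test of a set of paths makes it easy to try out a broken system, for example one where only /bin/sh is left.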
Once the kernel has started it, init prepares the machine to work in a particular way: what software should be running, what terminals should be open, and which network services are to be allowed. It does this using runlevels. Each runlevel describes a distinct state of the running system. As with most programs under Linux, init has its own configuration file stored in /etc. It is called inittab and consists of configuration data that indicates what, where and when particular services are to be run. The inittab file itself is well commented (comment lines begin with the # character), and you should familiarize yourself with its contents.
inittab: Configuration of the init Program
The first lines of the inittab file reveal the following:
# The default runlevel.
id:2:initdefault:
Each line in inittab has four fields, each separated by a colon. The first portion is an identifier for the action, and can be anything provided it is unique within the file (with the exception of the lines which spawn virtual terminals, which will be covered later). The second indicates the runlevel(s) to which this rule applies. In the above case it means runlevel 2 only, while the case below would work in all “multi-user” levels.
message:2345:wait:echo "In multi-user mode"
Parameter three is called the action. It indicates how the command (given in parameter four) is to be run. Numerous actions are available, and most of them appear in the inittab shipped by default with GNU/Linux distributions. The common ones are shown in the list below.
- wait – Execute the command, and wait until it completes before moving on to the next one. Used mostly for running software in sequence. If a program cannot be run, init will work through the rest of the file. init does not stop on errors, but will report them to the console.
- respawn – Execute the command, but respawn it when the process completes. Used for virtual terminals that need to re-run the login prompt (through getty) whenever the user logs out.
- ctrlaltdel – Whenever the “three fingered salute” is given, run this program. It is usually configured to reboot the machine.
- off – Disables the entry, without deleting it from the file.
- initdefault – The default runlevel used if init is called without an argument.
- sysinit – The command is run first before anything else, when the system boots (runlevels are ignored).
Finally, the executable in parameter four may be a command, script or daemon, and can include arguments if required. But both the command and its arguments may be omitted, as we saw with initdefault.
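The four-field layout described above is simple enough to parse by hand. The following Python sketch splits one inittab line into its fields; the names InittabEntry, parse_inittab_line and applies_to are invented for this illustration and are not part of init. It assumes that an empty runlevel field means "every runlevel".

```python
from typing import NamedTuple, Optional, Tuple


class InittabEntry(NamedTuple):
    ident: str                   # field 1: unique identifier
    runlevels: Tuple[str, ...]   # field 2: one character per runlevel
    action: str                  # field 3: wait, respawn, sysinit, ...
    process: str                 # field 4: command to run (may be empty)


def parse_inittab_line(line: str) -> Optional[InittabEntry]:
    """Parse one inittab line; return None for blank lines and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    # Split on the first three colons only, so the command itself may
    # contain colons.
    parts = stripped.split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields: {line!r}")
    ident, runlevels, action, process = parts
    return InittabEntry(ident, tuple(runlevels), action, process)


def applies_to(entry: InittabEntry, runlevel: str) -> bool:
    """True if the entry should be considered in the given runlevel."""
    # Assumption: an empty runlevel field applies to every runlevel.
    return not entry.runlevels or runlevel in entry.runlevels
```

Feeding it the two example lines from this section gives an entry with action initdefault and an empty command for the first, and an entry that applies to runlevels 2, 3, 4 and 5 for the second.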
There are a number of runlevels that are, by convention, used to describe a working Linux system, with 2, 3 or 5 being typical for a multi-user configuration. The complete list is given below (according to the Linux Standard Base, LSB).
- Runlevel 0 (halt) - Shuts down everything and brings the system to halt.
- Runlevel 1 (single user) - Useful for maintenance work.
- Runlevel 2 (multi-user) - Multi-user, but with no network services exported.
- Runlevel 3 (multi-user) - Normal/full multi-user mode.
- Runlevel 4 (multi-user) - Reserved for local use. Usually the same as 3.
- Runlevel 5 (multi-user) - Multi-user, but boots up into the X Window System, using xdm or similar.
- Runlevel 6 (Reboot) - As 0, but reboots after closing everything down.
Notice runlevel 0 is called “halt”. A runlevel describes what services should be running; a runlevel that terminates all services and runs a program to halt the processor leads to a valid state of the system. Given the laws of thermodynamics and of computing, this is by far the most stable state of the system!
Although the kernel starts init at the default runlevel, the superuser can change the runlevel at any time, without rebooting the machine, by issuing the following command:
telinit 2
This puts the system into runlevel 2, starting all services that should be running at this runlevel, and killing those that shouldn’t. Using runlevels as a profile in this manner lets you remove services during system maintenance, such as the network, very simply.
Finally, consider the following command, which places the machine in the halting runlevel:
telinit 0
This is equivalent to the more descriptive:
shutdown -h now
Although longer, the latter is preferable because it is more flexible; a time argument allows you to shut down in half an hour, for example.
Resource Configuration (The rc Directory Structure)
init is a simple program. It proceeds line by line through inittab, executing every command relevant to the current runlevel in the order listed in the file. The first command invariably performs system initialization, by specifying the sysinit action.
Its purpose is to configure any system-wide parameters (such as the system clock, or the serial port), regardless of runlevel. Once init has handled the system initialization it switches to the default runlevel and continues reading the rest of the file.
The sysinit start-up code is handled by the /etc/init.d/rcS script, which starts each process that is catalogued in the /etc/rcS.d directory. Since the boot-up sequence doesn’t have a runlevel, a pseudo-runlevel called ‘N’ is used.
When switching between runlevels, the set of running services must also change. While it is possible to do this from inside inittab, it is cumbersome. To ease the pain, Linux uses a set of directories, one per runlevel (called /etc/rc0.d, /etc/rc1.d, /etc/rc2.d, etc.), that list the services that must (and those that must not) be active in that specific runlevel. A script called /etc/init.d/rc is then responsible for starting and stopping them. The directory structure and the absolute location of rc0.d and init.d can be inferred from the inittab file.
As an example, runlevel 2 will be directed from /etc/rc2.d which contains a number of files. Those beginning with the letter ‘S’ are services that will start when we change into this runlevel, and those beginning with ‘K’ are services that will be killed.
init is clever enough to do things in the order established by the system administrator, and to work efficiently, i.e., not stopping and then restarting any service that appears in both the new runlevel and the previous one.
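The efficiency point above amounts to a pair of set differences. The Python sketch below is only a model of that idea, not how init or the rc script is actually written; the function name plan_transition is made up for this example.

```python
from typing import Set, Tuple


def plan_transition(old: Set[str], new: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (services to stop, services to start) for a runlevel change."""
    to_stop = old - new    # running now, not wanted in the new runlevel
    to_start = new - old   # wanted in the new runlevel, not running yet
    # Services present in both sets are left alone: they are neither
    # stopped nor restarted.
    return to_stop, to_start
```

For instance, moving from a runlevel with sysklogd, gdm and ppp to one with only sysklogd and ppp stops gdm and leaves the other two untouched. The service sets here are hypothetical; the real contents depend on the rcN.d directories of the system.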
The two-digit number that follows the letter determines the order. In the /etc/rc2.d directory described here, sysklogd (S10sysklogd) is started first, followed by kerneld, then gdm, then ppp, and so on. The /etc/init.d/rc script executes each of these files in order, passing it either the argument "start" or "stop", depending on whether the filename begins with an 'S' or a 'K'.
The sysinit directory /etc/rcS.d works in exactly the same way, but as it contains only system configuration information, no ‘K’ files are required. In contrast, booting up into single-user mode (runlevel 1) has almost nothing except kill files, to stop all the old services.
Using the Symbolic Links in the rcN.d Directories
The files in the /etc/rc2.d directory (or the other rcN.d directories), however, are not regular files. They are symbolic links to the scripts that do the real work! The script for S10sysklogd, for instance, resides at /etc/init.d/sysklogd (the S10 having been stripped off) and starts or stops the service based on its argument.
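The naming rules for these links (a leading S or K, a two-digit sequence number, then the script name) can be put together in a short Python sketch. It is a simplified model of what a script such as /etc/init.d/rc does, not a copy of it; the function name plan_runlevel and the regular expression are assumptions made for this example.

```python
import re
from typing import List, Tuple

# Link names look like S10sysklogd: a letter, a two-digit sequence
# number, and the name of the script in /etc/init.d.
LINK_NAME = re.compile(r"^([SK])(\d{2})(.+)$")


def plan_runlevel(
    link_names: List[str], init_dir: str = "/etc/init.d"
) -> List[Tuple[str, str]]:
    """Return (script path, argument) pairs in the order they would run."""
    plan = []
    # Plain lexical sorting puts K links before S links and orders each
    # group by its sequence number.
    for name in sorted(link_names):
        match = LINK_NAME.match(name)
        if match is None:
            continue  # not a start/kill link; ignore it
        kind, _sequence, service = match.groups()
        argument = "start" if kind == "S" else "stop"
        plan.append((f"{init_dir}/{service}", argument))
    return plan
```

Given the link S10sysklogd, the plan contains /etc/init.d/sysklogd with the argument start. Calling the function with os.listdir("/etc/rc2.d") shows what would happen on entering runlevel 2 on a system that still uses this layout.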
All the startup and shutdown scripts are in this directory (/etc/init.d), so controlling services on the fly is very easy. You can stop or start a service with one simple command: run its script with the argument stop or start, for example /etc/init.d/apache stop.
Some scripts also support the restart directive. This removes the need to kill each process and restart it manually whenever a change is made to the configuration file. It also removes the need to reboot after installing new software, since the start command can be called directly.
You can add these links yourself with the ln command:
ln -s /etc/init.d/apache /etc/rc2.d/S20apache
Conclusion (The End Of The Beginning)
Once the various services have started, init arrives at the inittab entries that configure the three-finger salute (ctrl-alt-delete) and prepare the virtual terminals. These work simply by running the /sbin/getty program (or equivalent) on each specified terminal. The respawn action is required to repopulate the virtual terminals after a user has logged out. If the terminal is connected via modem, inittab can handle that too by using mgetty instead of getty. At this point we have a login prompt, and the real work may begin.
Linux, regardless of version, will always issue a flood of messages to the screen as it starts up. These describe the computer system, as the kernel sees it, to aid troubleshooting. The messages themselves are also written into a log file at /var/log/kern.log, or can be retrieved using the dmesg command. The basiccheck script extracts important information from these messages to aid you in analyzing hardware status and so on, after the system is running.
The apparently simple act of getting a prompt on-screen is lengthy, but it is not as complex as it first appears since it is built from several small steps, each constructed, as it were, on the ruins of the previous one. There is no ghost in the machine.
To produce a hex dump of the first sector of the disk (the one containing the MBR), you can issue the command:
dd bs=512 count=1 if=/dev/hda | od -Ax -tx1z -v
(changing /dev/hda as appropriate for your setup)
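The check the BIOS performs on the last two bytes of that sector can also be reproduced in Python. This is a minimal sketch: the function names has_boot_signature and read_first_sector are invented for this example, and reading a real disk device normally requires superuser privileges.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # the bytes at offsets 510 and 511


def has_boot_signature(sector: bytes) -> bool:
    """True if a 512-byte first sector ends with the boot signature."""
    return len(sector) == SECTOR_SIZE and sector[-2:] == BOOT_SIGNATURE


def read_first_sector(device: str) -> bytes:
    """Read the first sector of a disk device or of a disk image file."""
    with open(device, "rb") as disk:
        return disk.read(SECTOR_SIZE)
```

Calling has_boot_signature(read_first_sector("/dev/hda")) as root reports whether the sector dumped by the dd command above carries the signature; as before, change /dev/hda to suit your setup.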
Lilo, for example, uses /etc/lilo.conf to determine which operating system kernel, or kernels, can be loaded at boot time. This file can also be used to pass options from Lilo into the kernel. In order to create or change the Lilo boot loader you need to edit /etc/lilo.conf and run the /sbin/lilo program. This takes the loader and places it directly into the MBR. The most common change of this kind occurs after compiling a new kernel, as the boot loader needs to know where the new kernel lives. The next evident activity will come from the Linux kernel itself. The kernel does little more than simply initialize itself.