Making updates fail-safe
The next problem to consider is that of recovering from an update that was installed correctly, but which contains code that stops the system from booting. Ideally, we want the system to detect this case and to revert to a previous working image.
There are several failure modes that can lead to a non-operational system. The first is a kernel panic, caused for example by a bug in a kernel device driver, or by the kernel being unable to run the init program. A sensible place to start is by configuring the kernel to reboot a number of seconds after a panic. You can do this either when you build the kernel, by setting CONFIG_PANIC_TIMEOUT, or at boot time, by adding the panic parameter to the kernel command line. For example, to reboot 5 seconds after a panic, ...
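To confirm which panic timeout a target is actually using, a small check along the following lines may help. This is only a sketch: it assumes a Linux target with /proc mounted and Python available, and the function names panic_timeout_from_cmdline and describe are hypothetical helpers introduced for illustration. It looks for a panic=<n> token on the kernel command line and, where possible, reads the running value from /proc/sys/kernel/panic.

```python
from pathlib import Path
from typing import Optional

# Standard Linux locations; both need /proc to be mounted on the target.
PROC_CMDLINE = Path("/proc/cmdline")
PROC_PANIC = Path("/proc/sys/kernel/panic")


def panic_timeout_from_cmdline(cmdline: str) -> Optional[int]:
    """Return the value of the last panic=<n> token, or None if there is none.

    Hypothetical helper: it only splits the string on whitespace and does
    not handle quoted parameters.
    """
    timeout = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "panic" and sep:
            try:
                timeout = int(value)
            except ValueError:
                pass  # ignore a malformed value and keep any earlier one
    return timeout


def describe(timeout: Optional[int]) -> str:
    """Describe the kernel's documented behavior for a panic timeout value."""
    if timeout is None:
        return "not set on the command line (the build-time default applies)"
    if timeout == 0:
        return "wait forever after a panic (no automatic reboot)"
    if timeout < 0:
        return "reboot immediately after a panic"
    return f"reboot {timeout} seconds after a panic"


if __name__ == "__main__":
    try:
        cmdline = PROC_CMDLINE.read_text()
        print("command line:", describe(panic_timeout_from_cmdline(cmdline)))
        running = int(PROC_PANIC.read_text().strip())
        print("running value:", describe(running))
    except (OSError, ValueError) as err:
        print("could not read panic settings:", err)
```

The value in /proc/sys/kernel/panic reflects whichever setting is in effect, whether it came from the build-time default, the command line, or a later sysctl change, so it is usually the more useful of the two to check on a running system.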