Usually when I wake up I have three emails from cron on my home server. Two report the status of various automated maintenance processes and the other reports the status of backing up this blog. If I get only two it usually means that this server has failed in some way; if I get one I am concerned that my corporate server has failed.
This morning, however, they were all missing. This has happened in the past when the configuration of my home server’s email relay needed to be updated. No such luck this time, unfortunately: I logged in to find that the root filesystem was mounted read-only and the last lines in the syslog were gripes about DMA write failures.
Now this behaviour is remarkably good: insofar as I expected any specific result from a DMA write failure on the root volume, I expected a kernel panic. However, you can’t get very far when /var and so forth are on a read-only filesystem, so I still needed to replace the disk.
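A read-only remount like this is easy to miss until something stops writing, so it could be worth checking for from a script. What follows is only a minimal sketch in Python, assuming a Linux system that exposes the usual /proc/mounts table (device, mount point, filesystem type, comma-separated options). The function names are my own invention for illustration, not part of any tool mentioned here.

```python
def mount_options(mounts_text):
    """Map each mount point to its option list, given /proc/mounts-style text.

    A mount point can be listed more than once (for example an early
    'rootfs' entry followed by the real device); the last entry wins.
    """
    options = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        options[fields[1]] = fields[3].split(",")
    return options


def is_readonly(mounts_text, mount_point="/"):
    """Return True if mount_point is listed with the 'ro' option."""
    return "ro" in mount_options(mounts_text).get(mount_point, [])


def root_is_readonly_now(path="/proc/mounts"):
    """Check the live system; assumes the Linux /proc/mounts file exists."""
    with open(path) as handle:
        return is_readonly(handle.read())
```

Calling `root_is_readonly_now()` from a cron job would only help if the warning can still get out, which a read-only /var may prevent, so treat it as a starting point rather than a finished monitor.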
Vexingly, the only spares I had lying around were 5400RPM 2.5″ laptop drives, which are not ideal for fitting in a server tower. A few inches of sellotape later, and the physical part of the job was done.
I found a DVD, burned Ubuntu Server 11.10 (the same version running on the failed disk) to it, and managed to beat my server’s DVD drive into working just about long enough to boot a recovery console. I take backups to another drive in the same system at 24-hour intervals, so I restored the last backup and used grub-install to make it bootable. This got me to the infamous “Error 21”. I went back into the recovery shell from the DVD and tried again with grub itself rather than the wrapper: same result.
Now bored with this cycle, and increasingly concerned about my continued ability to keep the DVD drive spinning, I decided to reinstall Ubuntu from the DVD. This went reasonably well and preserved the contents of my home directory, but lost all my other configurations. It did result in a bootable system, which was good. Apparently 11.10 installed from scratch uses grub 2, whereas my installation, which had been upgraded to 11.10, had still been using grub 1.
It took several hours to restore my configuration changes, and I am not sure that I have them all back even now.