Homelab

TL;DR: There may come a time when you need to build a ZFS pool in a temporarily degraded state. Reddit user u/mercenary_sysadmin describes how this is done, and this post provides additional commentary. The approach uses a temporary file in place of the missing disk when creating the pool, then takes that file offline, which leaves the pool degraded. When the physical disk becomes available, it replaces the file in the pool.

My desktop sits idle most of the time now that I’ve switched to my laptop for day-to-day tasks, which seemed like a waste of hardware. Meanwhile, my NAS, running off an 8TB WD Elements drive connected via USB to a ThinkCentre M93p, could use an upgrade. The plan was to set up my desktop as the new NAS with four 8TB WD drives in RAID-Z1. I already had the drives, plus one spare, lying around, so I expected an easy installation, but two of the drives turned out to be dead. This left me with three good drives plus the WD Elements, which I would have to shuck first. A backup of all critical data was already in place, but restoring from it would have taken too long, so I decided to set up a degraded ZFS pool with the three drives, copy the data over from the WD Elements, shuck it, and then add it to the pool. Read more...
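The sequence described in the TL;DR can be sketched as a small dry-run script that only prints the commands instead of running them. This is a minimal sketch, not the exact procedure from the post: the pool name `tank`, the device paths, the placeholder path, and the `8TB` size are all hypothetical values, and the `-f` on `zpool create` is an assumption, since ZFS typically refuses to mix files and disks in one vdev without it. The placeholder should be no larger than the smallest real disk.

```python
import shlex


def degraded_pool_commands(pool, real_disks, placeholder, size):
    """Commands to build a RAID-Z1 pool with a sparse file standing in for one disk."""
    return [
        # Sparse file: reports `size` but uses almost no space on disk.
        ["truncate", "-s", size, placeholder],
        # -f is assumed to be needed because the vdev mixes a file with real disks.
        ["zpool", "create", "-f", pool, "raidz1", *real_disks, placeholder],
        # Take the placeholder offline right away so nothing gets written to it;
        # the pool is now degraded but usable.
        ["zpool", "offline", pool, placeholder],
    ]


def replace_command(pool, placeholder, new_disk):
    """Command to swap the offline placeholder for the real disk, which starts a resilver."""
    return ["zpool", "replace", pool, placeholder, new_disk]


if __name__ == "__main__":
    # All names below are hypothetical examples.
    disks = ["/dev/sdb", "/dev/sdc", "/dev/sdd"]
    fake = "/tmp/placeholder.img"
    for cmd in degraded_pool_commands("tank", disks, fake, "8TB"):
        print(shlex.join(cmd))
    print(shlex.join(replace_command("tank", fake, "/dev/sde")))
```

Printing rather than executing keeps the sketch safe to run anywhere. In real use, stable identifiers under `/dev/disk/by-id/` are usually a better choice than `/dev/sdX` names, and `zpool status` shows the degraded state and the resilver progress.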

Last month, my homelab experienced an incident that resulted in a service disruption of 3 hours and 8 minutes, and because I like to cosplay as a sysadmin in my free time, this is a post-incident analysis. The effects of the disruption were mostly external, mainly affecting the main site and ytrss; internal systems remained unaffected. A virtual machine hosted on Linode needed to be moved between physical hosts for urgent, unplanned maintenance. Similar migrations have happened in the past, both planned and unplanned, so I had opportunities to prepare, but the services still failed to recover automatically this time. Several improvements previously made to my infrastructure and deployment strategy kept this disruption shorter than most, but it was still longer than expected. Read more...