Coming soon... Take 2

Short story: my server lost its user data.

Back to the start...

I am an idiot. There, I said it. Mostly because of one mistake I made. The mistake?
I waited too long to set up proper backups of my server.

As you might have noticed, the blog is a little different. That's because it is actually new. And here's why...

The Incident

Some months ago, one of the SSDs in my server started switching to read-only mode from time to time, and sometimes the server did not detect it at all. It was the drive containing my video media, so the media server could no longer read the files properly. Not a big issue, since the drive mostly came back after a reboot. But at some point it started happening almost every day.

Fixing The Issue

So I started looking for a replacement disk. I wanted to upgrade the storage anyway. So I bought an 18 TB HDD and started transferring data from the SSD to the HDD. This took a couple of weeks, since the SSD had issues and "disappeared" from the system pretty often during transfer. During this time, my Jellyfin server was pretty much offline.

After it was all transferred I put in the new drive and started up the server again. Everything was fine. Or so I thought.

The Second Issue

After about 12 hours of uptime, the server became unresponsive. I gave it the old reboot and it came back fine. But then it happened again after some hours. Again the reboot fixed it. Now I was worried that there was something wrong with the new drive, which I bought used as I am not a millionaire. While I was searching for a fix, I set up a timer that rebooted the server every night at 4 AM, which worked around most of these hangs.

I was starting to think that the power supply was not delivering enough power for the HDD to spin up. Some AI chatbots helped me walk through a few troubleshooting steps, with a kernel command here and there, but ultimately none of it helped.

Then the unthinkable happened. The server went unresponsive, but a reboot did not help at all. So I plugged the server into a screen and a keyboard and started troubleshooting.

The other SSD was now missing from the server's devices. That's an issue. The other SSD was where the persistent volumes for Docker were stored, which means all the container data and metadata. Examples of such data:

  • Jellyfin media metadata, users and watch state
  • Nextcloud users and data
  • Blog contents and members (yes, all members are gone 😦)
  • Movie list entries
  • Home Assistant setup with devices, automations etc.

So I thought: "Well, I'll just recover the data like I did with my media." I took out the SSD and the server booted fine after that, which confirmed that the SSD was the cause of the second issue. I connected the SSD to my PC with the SATA-to-USB adapter I had used for the other one. Nothing happens. The system does not detect that anything has been connected. So I test the adapter with another old drive I have lying around. Detected instantly. Panic sets in. The drive is dead. All the data on the drive, gone.

This is bad, because I had not yet set up backups of that drive, which I had actually planned to do that very week. This devastated us. All that data. Years of our lives, gone.

The Rescue

The silver lining is that this is also an opportunity to set things up from scratch, with the knowledge I have now but didn't have when I first set up the server. So I considered what I wanted to keep running and set up just that.

To my wife's great relief, I found her blog posts on the Internet Archive's Wayback Machine, so I could at least get our blog posts back. I also already had all the media, so all that was missing was the metadata and the users for Jellyfin, Audiobookshelf etc.

So I started looking into OIDC (OpenID Connect), a Single Sign-On protocol for which Nextcloud can act as the provider. So I set it up in Nextcloud. Luckily I had made backups of the Nextcloud folder on my desktop and exported the calendar and contacts before I started the sync with the "new" Nextcloud server.

You can now use your account on my Nextcloud to also log in to my Jellyfin, Audiobookshelf and Mealie (only for my family/household for now) servers.

I am slowly but steadily getting all my stuff back online and functioning again.

Future Plan

The most valuable lesson from all this? Set up backups of all the data you don't want to lose. So that is my next project. When "rebuilding" the server is complete, I will set up a backup system that backs up the metadata and persistent volumes to an external drive, and also to a separate location when I get the option. I'm thinking my parents' or my friend's place.
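
I have not built the backup system yet, so take this as a rough sketch of the plan above and not as what actually runs on my server. It packs a folder of persistent volumes into a timestamped archive on an external drive and keeps only the newest few. The function name backup_volumes, the "volumes-" file prefix and the default of 7 archives are all made up for this example. It also assumes the containers are stopped while it runs, since copying a live database can give you an inconsistent backup.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def backup_volumes(source: Path, dest: Path, keep: int = 7) -> Path:
    """Archive `source` as a timestamped .tar.gz in `dest`, keeping the newest `keep`.

    All names and defaults here are hypothetical, chosen only for this sketch.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    if not source.is_dir():
        raise FileNotFoundError(f"source directory not found: {source}")
    if not dest.is_dir():
        # Refuse to run if the external drive is not mounted, instead of
        # silently filling up the server's own disk.
        raise FileNotFoundError(f"backup destination not found: {dest}")

    # Timestamped name, so archives sort chronologically by file name.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    archive = dest / f"volumes-{stamp}.tar.gz"

    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)

    # Read the archive back once, so a truncated file is noticed right away.
    with tarfile.open(archive, "r:gz") as tar:
        tar.getmembers()

    # Delete everything except the newest `keep` archives.
    for old in sorted(dest.glob("volumes-*.tar.gz"))[:-keep]:
        old.unlink()

    return archive
```

Calling it with the volumes folder and the mount point of the external drive, from the same kind of nightly timer I used for the reboots, would cover the "external drive" half of the plan. The copy at a separate location would still need its own step.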

I do not want to be in this position again.

So if you were a member of the old blog, please sign up again. If you were a user of one of the old services, please contact me for a new account.