Server Upgrades
It should come as no surprise to anyone administering servers that performing operating system upgrades is a big deal. It feels even more intense when those servers offer up services for others - especially publicly. I recently upgraded several systems, and here is what I dealt with.
A Quick Inventory
I’ve blogged about them before, but I have physical servers that for the most part each have a primary function and a secondary function involving the public Internet, with static IPs. These servers mainly perform functions for the nmrc.org domain. They were running Ubuntu Pro Server 22.04.3 (Pro is free for five servers), but I wanted them “current” on 24.04. The four servers and their purposes are listed here:
Daemon. Once the main DNS server, its primary function is running Pi-hole in recursive mode and serving as the main DNS server for all other public servers as well as all internal systems. The two secondary functions are handling rsyslog for the public systems as well as functioning as a Veilid node.
Blackhole. This is a GitLab standalone instance. It is used to handle NMRC coding and project management, and that’s pretty much it.
Talon. Primarily it is the mail server for the nmrc.org domain. Secondary service is running a web service solely to handle redirects, due to some organizations not knowing how to handle DNS properly (and to help with Let’s Encrypt certificates).
Rigor-mortis. This has a few functions. Primarily it is the web server for the NMRC domain as well as the NMRC Mastodon instance. Secondary services include web and mail services for a few smaller domains.
Daemon
I had already upgraded a couple of non-public servers from Ubuntu 22.04 to Ubuntu 24.04, and it seemed to go rather smoothly. I thought Daemon might be just as easy, and it was. No hiccups; its main services handled everything just fine.
Blackhole
This server, running its GitLab instance, was a bit of a different story. The upgrade itself seemed to go just fine, but GitLab wouldn’t start after the reboot. A bit of exploring revealed that there were problems getting gitlab-runsvdir.service to start. A listing via systemctl list-jobs showed it and several other jobs in a state of "waiting" except for plymouth-quit-wait.service, so I stopped that and GitLab started. I of course realized this would resurface on the next reboot, so after a bit of googling I removed splash from /etc/default/grub, ran grub-mkconfig -o /boot/grub/grub.cfg, and rebooted. Without splash there was no “trigger” to invoke plymouth. Besides, I didn’t need any type of splash screen on a console boot anyway. Problem solved. I know I covered this in a paragraph, but this wasn’t instantly resolved - in fact this little problem took more than an hour.
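The fix boiled down to one edit and one command. Here is a sketch of the splash removal, run against a demo copy of the file so it can be inspected safely - the sed pattern is my own choice, and on the real server the target is /etc/default/grub:

```shell
# Demo input: a GRUB defaults file like stock Ubuntu ships (sketch only).
cat > grub.demo <<'EOF'
GRUB_DEFAULT=0
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
GRUB_CMDLINE_LINUX=""
EOF

# Remove "splash" from the kernel command line; without it there is no
# trigger to invoke plymouth, so plymouth-quit-wait.service cannot block boot.
sed -i 's/ *\bsplash\b//' grub.demo

grep CMDLINE_LINUX_DEFAULT grub.demo

# On the real server this edit is made to /etc/default/grub, followed by:
#   grub-mkconfig -o /boot/grub/grub.cfg
# and a reboot to confirm the hang is gone.
```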
Talon
I really thought Talon would be as easy as Daemon, and at first it seemed like it was. However, mail was not flowing. I checked and found that sendmail.service had somehow been disabled during the upgrade. This struck me as weird, but a few systemctl commands later it was fine. Or so I thought. A day later I discovered via log files (while looking at something unrelated) that there were a lot of references to OpenDKIM failing as each email came in. Due to misreading an error message, and because of the earlier Sendmail problems, I thought the problem was with Sendmail and not OpenDKIM.
A couple of hours later I figured out the problem. When running systemctl status opendkim it showed "active" in green and I kept going, because I was mainly looking for the green text and assumed all was okay. Later, while looking at processes as well as open ports (OpenDKIM was configured to listen on port 8891 on localhost), it was obvious OpenDKIM wasn’t running. I then noticed the status actually said "active (exited)" in green instead of “active (running)”, which clearly showed the problem. It seems the upgrade process removed the OpenDKIM package but left all of the configuration files and the .service file behind. Running journalctl -xu opendkim confirmed that the problem started with the upgrade. After backing up OpenDKIM’s config files I reinstalled the package and copied the config files back in, and finally OpenDKIM was working like it should.
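The status line was the tell: “active (exited)” means the unit ran and finished, while a live daemon shows “active (running)”. A minimal sketch of that distinction - the helper function and sample strings here are mine, not captured from the actual incident:

```shell
# Two sample "Active:" lines in the shape systemd prints them (illustrative):
running='Active: active (running) since Mon 2024-09-02 10:00:00 UTC'
exited='Active: active (exited) since Mon 2024-09-02 10:00:00 UTC'

# Return 0 only when the unit is genuinely running, not merely "active".
is_really_running() {
  case "$1" in
    *'active (running)'*) return 0 ;;
    *) return 1 ;;
  esac
}

is_really_running "$running" && echo 'running: ok'
is_really_running "$exited" || echo 'exited: NOT actually running'

# On the real server, roughly equivalent checks would be:
#   systemctl show opendkim -p SubState   (want SubState=running, not exited)
#   ss -ltn | grep 8891                   (the localhost milter port)
```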
Rigor-mortis
This was the one I was originally worried about the most. This server runs a lot of different things, and I expected there to be some type of problem with at least one of the components. I checked whether /etc/default/grub looked like it might be a problem, and although plymouth was present I did not think it would be an issue - I’d just keep an eye on it. Sendmail was running for a couple of other rarely used domains, and OpenDKIM was there too, but I knew I could handle that. My main worry was the Mastodon instance, as I’d had issues with it before.
I did backups and then the migration. I had to do the exact same things for Sendmail and OpenDKIM that I did for Talon, and that was resolved quickly. Mastodon, as expected, was a bit of a pain. I was on version 4.1.19, but as multiple underlying packages were updated (such as Ruby and PostgreSQL) I opted to go ahead and bump Mastodon up to the latest non-beta version, which was 4.2.12. I’ll not pass on all of my steps as I don’t want to mislead or confuse anyone going through the same thing, but in no particular order here were some of the highlights of problems and their outcomes:
- PostgreSQL 15 to 16 migration of the database, including refreshing the collation version from 2.35 to 2.39 for the postgres and mastodon databases.
- Disabling/purging PostgreSQL 15 post-migration and making sure PostgreSQL 16 was up and listening on the correct port (it wasn’t).
- Installing a lot of missing/outdated Ruby gems.
- Rewriting the startup scripts for Mastodon to handle the different location where I had it installed.
- Migrating all of the data (user accounts, feeds, etc.) from the 4.1.19 backups to the new 4.2.12 Mastodon, followed by rebuilding.
- Migrating the database (via RAILS_ENV=production bundle exec rails db:migrate), which seemed to resolve the remaining errors.
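Collected into one place, the database-side steps above look roughly like the runbook below. The cluster names, ports, and helper commands are the Debian/Ubuntu postgresql-common defaults, so treat this as a sketch of the sequence rather than my exact history:

```shell
# Write the ordered steps to a runbook file so the sequence is explicit.
cat <<'EOF' > pg-migration-runbook.sh
# 1. Move the 15/main cluster to 16 (postgresql-common helper):
pg_upgradecluster 15 main
# 2. glibc moved 2.35 -> 2.39, so refresh each database's collation version:
sudo -u postgres psql -c 'ALTER DATABASE postgres REFRESH COLLATION VERSION;'
sudo -u postgres psql -c 'ALTER DATABASE mastodon REFRESH COLLATION VERSION;'
# 3. Retire the old cluster and confirm 16 owns the expected port:
pg_dropcluster 15 main
pg_lsclusters
# 4. Finally, run the Rails migrations from the Mastodon directory:
RAILS_ENV=production bundle exec rails db:migrate
EOF
wc -l pg-migration-runbook.sh
```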
If you are doing this upgrade from 4.1.x to 4.2.x, really read all of the release notes for v4.2.0, as there are a lot of things that will need to be upgraded and generally considered as you move forward. The best example of this was the whole db:migrate thing; before I discovered it in the release notes I easily spent two hours chasing my tail (yes, that’s a pun on me running tail -f on various log files). Surprisingly, the steps I took in the last server upgrade served as an important guide, at least in defining my steps. My best advice is to do the migration from 4.1.x to 4.2.0 first, get that working, and then upgrade to the latest 4.2. Only then start to worry about the operating system. Your mileage will most likely vary, but right now mine is up, and it works.
By the way, if you want to install your own Mastodon instance on Ubuntu 24.04, check out the link here, and to increase the toot size beyond the stock 500-character limit, check out the link here.
Final Notes
The last thing was an annoyance. There is apparently a new method of handling apt sources, in which the normal (for example) duosecurity.list file is backed up to duosecurity.list.distUpgrade and a new duosecurity.sources file is created in its place. This happens for all of the non-Ubuntu repositories. However, if there are any errors during this migration, the line Enabled: no is added to the new .sources file. The errors can include the lack of a noble repository (vs. a jammy one) and the lack of a Signed-By: line in the .sources file pointing to a local copy of the repository’s GPG public key. Basically, if there was a noble repository I fixed the .sources file; otherwise I simply edited the .list.distUpgrade file and renamed it back to its old working .list structure. This was left as a “reminder” for me to periodically check for updated repositories during future edits in that directory.
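For repositories that did have a noble suite, the repair amounted to flipping Enabled: and making sure Signed-By: pointed at the key. Here is what a repaired deb822-style .sources file can look like - the repository URL and keyring path below are purely illustrative, not any vendor’s actual values:

```
# /etc/apt/sources.list.d/example.sources - deb822 format (illustrative)
Types: deb
URIs: https://pkg.example.com/ubuntu
Suites: noble
Components: main
Signed-By: /usr/share/keyrings/example-archive-keyring.gpg
Enabled: yes
```

For repositories without a noble suite, the fallback was simply renaming the .list.distUpgrade file back to .list inside /etc/apt/sources.list.d/ until the vendor catches up.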
In general, this was not an easy task. No, I am not happy with the direction Canonical seems to be taking with Ubuntu, but as Canonical’s choices impact the desktop versions more than the server versions, I’m sticking with Ubuntu Server 24.04 LTS for now (I currently run Pop!_OS for Linux desktops). I had considered moving to Debian, as the latest version shows much more promise; however, the one Debian server I run (mainly for Home Assistant) has been the least “stable” of all the servers, public and private, that I maintain, so I’ll stick with Ubuntu for the foreseeable future.