Fail2ban for Sendmail AUTH
My server room. I swear. Sorta. (Photo by İsmail Enes Ayhan on Unsplash)
I recently needed to address a growing concern: brute force authentication attempts against the mail port on the nmrc.org mail server. These attempts not only consume server resources, they also pose a real security risk: a guessed password would let scammers and spammers relay their evil email through the nmrc.org domain, and would also hand the attacker credentials for an account on the nmrc.org server itself. After evaluating different options, I settled on Fail2ban as the solution. In this post I'll walk through the implementation on the Sendmail mail server - talon.nmrc.org, aka Talon.
Unwanted AUTH Attempts
If you're running a mail server exposed to the internet, you've probably seen log entries like this:
Apr 7 23:18:06 talon sm-mta[55810]: 5384I1Bw055810: AUTH failure (CRAM-MD5): user not found (-20) SASL(-13): user not found: Unable to find a callback: 32775, user=admin@nmrc.org, relay=[122.228.228.86]
That log entry is from an automated attempt to authenticate to the mail server using random or dictionary-based credentials. Beyond the increased load, the potential for malicious relaying, and the risk of an account password being learned, there was another aspect to this that was quite interesting - the source IP addresses.
One might assume there would be hundreds if not thousands of attempts coming from the same IP address over and over, and a few sources did behave that way. However, there was a separate group of attempts spread out over numerous IP addresses - some from the same class C range, some from a variety of different IPs and different class C ranges. How could I determine this? By examining both the timing and the account names being used. The big breakthrough came when I spotted a misspelling in one of the account names used during the authentication attempts. That account name was rather unique on its own, but the attacker had a couple of letters switched, and it showed up in attempts from roughly a few dozen IP addresses all over the place. Would the IP addresses repeat? Yes, but with only one attempt per IP address every 24 hours. That pointed to a botnet running a slow but automated dictionary-style password attack. Normally one sets up Fail2ban to start blocking after something like 5 attempts within a 10 minute window - the way admins typically configure it for SSH - and that standard setup would not have blocked anything in this case.
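For the curious, the digging itself doesn't require anything fancy. A couple of one-liners along these lines group the failures by account name and by source address - a sketch that assumes the log format in the sample entry above, and since the misspelled account name isn't reproduced here, admin@nmrc.org stands in for it:
# Which account names are being tried, and how often
grep 'AUTH failure' /var/log/mail.log | grep -o 'user=[^,]*' | sort | uniq -c | sort -rn | head
# Which source IPs are making the attempts
grep 'AUTH failure' /var/log/mail.log | grep -o 'relay=\[[0-9.]*]' | sort | uniq -c | sort -rn | head
# Every attempt (timestamps and IPs) for one suspicious account name
grep 'AUTH failure' /var/log/mail.log | grep 'user=admin@nmrc.org'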
The paranoid part of me was also rather nervous. What if this was an attempt to get the account names and passwords themselves? Talon had been under attack from APT actors before - in fact to this day I receive various attacks (usually spear phishing emails) from APT actors at least once every couple of months. So I wanted this to work, if for no other reason than to thwart and hopefully piss off the attackers.
Making Fail2ban Work
Now I did have a few things in my favor. While these automated attacks typically target ports 25 (SMTP), 587 (the submission port for sending mail), and even 993 (IMAPS), both 587 and 993 were already firewalled off from the public. NMRC users could still send mail, but only from the internal network or when connected to Talon via Wireguard. And since only legit users touched those ports, any errors or mistakes they made would not trigger Fail2ban. For the minor exceptions like myself coming in from the internal network, those private ranges were whitelisted (more on that below).
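The firewall setup itself is outside the scope of this post, but for context the arrangement amounts to something like the following nftables rules - a sketch only, with 10.66.66.0/24 standing in for the actual Wireguard subnet:
table inet filter {
  chain input {
    type filter hook input priority 0; policy accept;
    # Submission and IMAPS only from the internal network and the Wireguard tunnel
    tcp dport { 587, 993 } ip saddr { 10.0.0.0/8, 10.66.66.0/24 } accept
    tcp dport { 587, 993 } drop
    # Port 25 stays open to the world so other mail servers can deliver
  }
}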
The mail server is running Ubuntu, so installation was the usual:
$ sudo apt update
$ sudo apt install fail2ban
The first thing was to stop Fail2ban from watching SSH. As SSH was only allowed via Wireguard and the internal network (plus it was heavily locked down), there was no perceived need to involve Fail2ban. After being altered, /etc/fail2ban/jail.d/defaults-debian.conf looked like this:
[DEFAULT]
banaction = nftables
banaction_allports = nftables[type=allports]
backend = systemd
[sshd]
enabled = false
The jail configuration file was created at /etc/fail2ban/jail.d/sendmail-auth.conf:
[sendmail-auth]
enabled = true
filter = sendmail-auth
logpath = /var/log/mail.log
# ban on 1 attempt
maxretry = 1
findtime = 600
# 7 day ban
bantime = 604800
action = iptables-multiport[name=sendmail-auth, port="25"]
Next was creating a filter that accurately matched the Sendmail AUTH failure log entries, so I altered /etc/fail2ban/filter.d/sendmail-auth.conf as follows:
[Definition]
failregex = ^.*sm-mta\[\d+\]: \S+: AUTH failure \(\S+\):.* relay=\[<HOST>\]
datepattern = ^MMM\s+\d\s+HH:mm:ss
ignoreregex =
journalmatch = _SYSTEMD_UNIT=sendmail.service
This regex covers AUTH failures across the different authentication mechanisms. Again, since legitimate users reach the mail services - IMAPS (port 993) via Dovecot and the submission port (587) for sending - exclusively from the local network or over Wireguard into the server, a whitelist was created at /etc/fail2ban/jail.d/sendmail-auth-whitelist.conf. I also added the NMRC public address range, as a few other public NMRC servers use this mail server:
[sendmail-auth]
ignoreip = 127.0.0.0/24 10.0.0.0/8 162.196.226.78/29
Testing was next:
# Check if the regex matches our log entries
fail2ban-regex /var/log/mail.log /etc/fail2ban/filter.d/sendmail-auth.conf
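# A single log line can also be checked directly (here the sample entry from earlier);
# if the filter is right, the summary should report one matched line
fail2ban-regex 'Apr 7 23:18:06 talon sm-mta[55810]: 5384I1Bw055810: AUTH failure (CRAM-MD5): user not found (-20) SASL(-13): user not found: Unable to find a callback: 32775, user=admin@nmrc.org, relay=[122.228.228.86]' /etc/fail2ban/filter.d/sendmail-auth.conf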
# Validate the configuration
fail2ban-client -d
# Start/reload the service
systemctl restart fail2ban
Monitoring the logs seemed to indicate things were working just fine. Bans started happening almost immediately, which was a good sign.
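Beyond eyeballing the logs, fail2ban-client gives a quick view of what the jail is actually doing, for example:
# Failure counters plus the list of currently banned IPs
fail2ban-client status sendmail-auth
# Confirm the whitelist entries were picked up
fail2ban-client get sendmail-auth ignoreip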
SUMMATION
In the first 24 hours after the configs were working and banning had commenced, 425 IP addresses had been banned. At that pace, the ban database would hit around 3,000 entries before the 7-day bans started expiring. That might seem like a lot, but as Talon is fairly beefy - a 12-core (16-thread) Intel i7-1260P, 64GB of RAM, and plenty of drive space - I'm not too concerned. I'll keep an eye on performance and the size of the sqlite3 database for now, and adjust as needed later on, perhaps by defining a cron job to do a periodic VACUUM of the database. In the meantime, the mail logs are noticeably smaller since there are far fewer Sendmail AUTH failures in them, and my paranoia is slightly - slightly - reduced.
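If the database ever does become an issue, a cron entry along these lines would cover the periodic VACUUM - a sketch only, assuming Fail2ban's default database location on Ubuntu and that the sqlite3 command-line tool is installed:
# /etc/cron.d/fail2ban-vacuum - compact Fail2ban's sqlite database weekly
0 4 * * 0 root sqlite3 /var/lib/fail2ban/fail2ban.sqlite3 'VACUUM;'
Fail2ban's own dbpurgeage setting is the other knob here, controlling how long old entries are kept in the database in the first place.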