I just posted a brief story about how one of my clients’ machines was recently used for spamming purposes… at fault was the assumed configuration of the E-Mail server software. Check it out here (PDF), on the PaulDotCommunity ‘blog – part of the PaulDotCom empire.
Tuesday, March 25, 2008
FAIL: When software tries to be smrt, and sysadmins trust it.
I run servers for a living… lots of servers, for all sorts of people and customers and workloads. Nothing homogeneous or even enterprisey about most of it.
Probably a year ago, I noticed one of my client’s webserver VPS instances was spewing mail like an open relay. Some quick checking indicated it wasn’t actually an open relay, and it wasn’t listed on any RBLs either, so I assumed that some random PHP script had been easily pwn3d. Since the client didn’t care about email at all (sigh, why’d you have me turn it on?!), I just shut down postfix, saw all the SMTP traffic stop, and left it to the client to figure out. They didn’t see fit to have me dig deeper into it, nor could I justify doing it in the absence of financing.
Fast-forward to last week, when said client needed mail turned on. I hesitated and explained why I was reluctant to do this. They assured me that everything had been updated and most of the PHP stuff is gone, aside from a bleeding-edge instance of WordPress. Okay, that’s legit.
I review the config, trash the mail-queue just in case, and fire up postfix.
Nothing (bad) happens instantly, so I make a note to check it in the morning.
Everything’s okay for the rest of the week: 10 msg/day, normal email traffic flow for this client.
Yesterday morning though, I notice 7412 msg/hr being queued. Eeep.
Killing apache seems to have no effect on the flow.
Reviewing mailq shows it’s all spam or backscatter. Sigh.
I fix the backscatter problem (shame on me), postfix reload, and then just to be sure, do ‘postconf -n’ – and everything looks okay there too.
I continue auditing things running on the machine and don’t see anything out of the ordinary, and yet postfix continues happily to queue spam.
More rummaging turns up nothing other than postfix being the problem.
And then I found it.
[root@bukkit ~]# postconf | grep mynet
mynetworks = 66/8
mynetworks_style = subnet
…
[root@bukkit ~]#
Postfix made a mistake. An ugly one. So ugly, it allowed 1/256th of the IPv4 Internet to relay mail via this server, with impunity.
But it was a minor error, one all sysadmins have made in their careers…
It got the subnet mask wrong.
Now, I’m not 100% certain why this happens, but my allocation sits in the old Class A range, where the default subnet mask is a /8, so it looks like postfix had a flashback to the 1980s and defaulted to that.
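To make the arithmetic concrete, here is a minimal sketch of the old classful rule I suspect was at work. It is not Postfix's actual code, the function names are made up for illustration, and 66.0.0.1 is a placeholder address rather than the client's real one.

```shell
#!/bin/sh
# Hypothetical helper: historical classful prefix length for an IPv4 address.
classful_prefix() {
    first_octet=${1%%.*}    # keep everything before the first dot
    if   [ "$first_octet" -lt 128 ]; then echo 8     # Class A: 0-127
    elif [ "$first_octet" -lt 192 ]; then echo 16    # Class B: 128-191
    elif [ "$first_octet" -lt 224 ]; then echo 24    # Class C: 192-223
    else echo none                                   # Class D/E: no host mask
    fi
}

# Hypothetical helper: number of addresses covered by a prefix length.
addresses_in_prefix() {
    echo $(( 1 << (32 - $1) ))
}

classful_prefix 66.0.0.1    # prints 8
addresses_in_prefix 8       # prints 16777216
```

That second number is 2^24 out of 2^32 addresses, which is where the 1/256th figure comes from.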
And since this parameter defaults to being derived at start-time, it doesn’t show up in ‘postconf -n’, which only shows non-defaulted configuration parameters.
Lesson: Don’t trust your software to auto-configure properly every time, and when you’re auditing configurations – check everything, not just non-default settings.
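A rough sketch of the kind of check that would have caught this sooner: feed it the effective mynetworks entries and it prints any that are suspiciously broad. The function name and the /24 threshold are my own assumptions, not anything Postfix ships, and it only looks at IPv4 CIDR entries.

```shell
#!/bin/sh
# Hypothetical audit helper: print mynetworks entries broader than a threshold.
# $1 = minimum acceptable prefix length; remaining args = mynetworks entries.
flag_wide_networks() {
    min_prefix=$1
    shift
    for net in "$@"; do
        case $net in
            127.*|\[*) continue ;;       # skip loopback and bracketed IPv6 entries
            */*) prefix=${net##*/} ;;    # CIDR form such as 66/8
            *)   continue ;;             # bare hosts, table lookups, etc.
        esac
        if [ "$prefix" -lt "$min_prefix" ]; then
            echo "$net"
        fi
    done
}

# On a live box the entries could come from the effective value, not just main.cf:
#   flag_wide_networks 24 $(postconf -h mynetworks | tr ',' ' ')
flag_wide_networks 24 127.0.0.0/8 66/8    # prints 66/8
```

The point is to audit what the daemon actually uses, since a derived default never appears in the ‘postconf -n’ listing.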
I’ve checked all the other machines I’m responsible for and haven’t seen this happening on any of them. I’ll be updating this postfix to a later version soon, but at least I’ve hardcoded mynetworks for now.
With apologies to the unintended victims, and the rest of the Internet, for making the spam problem worse – not better.
Mea Culpa.