SPAM, SPAM, SPAM, lovely SPAM? Nope!

Posted by: mstauber Category: Development

I thought I'd share a recent run-in with a pretty nasty SPAM wave (more like a Tsunami!) that I was getting, and how it will help to improve the AV-SPAM.

As you know, our AV-SPAM has been around in one form or another since the Cobalt days. I don't even know exactly when we released the first version of it, but it's been around since 2001 or 2002 or thereabouts. During the last 15 years it has seen a vast number of changes and improvements, never relying on just one method to cope with SPAM, but employing various combined approaches. These include SpamAssassin, ClamAV, Greylisting, custom rule sets and many minor and major tweaks.

Starting with AV-SPAM 6.1.0 we also included our own Milter-GeoIP, which allows you to reject emails at the MTA level if the originating IP of the sender is geographically located in a country that you added to your blacklist. Milter-GeoIP also implements email volume controls, so that you have visual and automated control via the GUI to see if someone sends undue amounts of email, and can impose limits. Additionally, it warns (or suspends accounts) if emails are sent via SMTP-Auth through your server from countries that you have blacklisted. This aids in detecting compromises really early on, as these logins and the spike in email volume immediately raise a red flag via Active Monitor.

All in all: The AV-SPAM does a good job at keeping your inbox and mine relatively clean.

However: Two months ago I started to see an increase in SPAMs that (just barely) made it through. It started as a trickle, but then turned into a Tsunami. On average I was suddenly getting 70-90 really persistent SPAMs a day to my work email addresses. Naturally this piqued my curiosity and I started to investigate:

I went through the headers of the SPAMs that made it through (and the ton of other SPAMs that correctly got marked as such and got quarantined in a separate IMAP folder by the AV-SPAM).

What I found out pissed me off enough to warrant some extreme measures.

For starters: 98% of the SPAM that was making it through contained clear pointers that they were from the same individual SPAMmer. Now here is the complete idiocy of this SPAMmer in a nutshell: Will sending 70-90 SPAMs a day to the same target email addresses really increase the chance of that person buying any of your shit? Really? I mean: REALLY?!?

Instead it is rather likely that this happens:

The only thing it managed was to piss me off enough to pour considerable effort into tackling this issue. Not just for myself, but also as a future feature for all other AV-SPAM users.

So let us see what our perpetrator did there: He used different email addresses and originating IPs, and the SPAMs promoted products and services on a large number of different domains. But a certain pattern was relatively easy to detect. The emails were all different enough that you couldn't just throw in a single simple or complex SpamAssassin rule to catch it all. On the contrary: It was also clear that this boffin was running his SPAMs through SpamAssassin himself to make sure they would score low enough.

Initially I threw in a few custom rules that catch more pharmacy and financial SPAM and flag on some other keywords that I was frequently seeing. It helped, but the Tsunami kept rushing in daily, flooding my inboxes. RBLs? All the RBLs that the AV-SPAM uses (quite a few actually) were usually late to trigger on the IP-addresses that this particular offender was using. So I decided to set up my own RBL.

At first I just created a special DNS zone in my PowerDNS based DNS-cluster and set up a separate password protected web-form, to which I simply added offending IPs manually for entry into my "personal" RBL. This also allowed me to blacklist entire IP address ranges by simply entering 88.88.*.* and it would block 88.88.0.0/16 for example.
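To illustrate the wildcard notation and how a mail server then queries such a zone, here is a minimal Python sketch. It is not the code behind my web-form: the function names and the zone name rbl.example.org are made up for this example, and it assumes that wildcards only ever appear at the end of the pattern.

```python
import ipaddress

def wildcard_to_network(pattern: str) -> ipaddress.IPv4Network:
    """Turn a pattern like '88.88.*.*' into the network 88.88.0.0/16.

    Assumption: wildcards are trailing only ('88.*.1.*' is rejected).
    """
    octets = pattern.strip().split(".")
    if len(octets) != 4:
        raise ValueError(f"expected four octets: {pattern!r}")
    fixed = []
    for i, octet in enumerate(octets):
        if octet == "*":
            # Everything after the first wildcard must be a wildcard too.
            if any(o != "*" for o in octets[i:]):
                raise ValueError(f"wildcards must be trailing: {pattern!r}")
            break
        fixed.append(octet)
    prefix_len = 8 * len(fixed)
    address = ".".join(fixed + ["0"] * (4 - len(fixed)))
    # IPv4Network validates the octets and raises ValueError on garbage.
    return ipaddress.IPv4Network(f"{address}/{prefix_len}")

def dnsbl_query_name(ip: str, zone: str = "rbl.example.org") -> str:
    """Build the DNS name that is looked up to check an IP against an RBL.

    The octets are reversed and the zone is appended, which is the usual
    DNSBL convention. The zone name here is a placeholder.
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone
```

A listed IP is one for which that DNS name resolves. Blocking a whole /16 then simply means the zone answers for every name below 88.88 in reversed form.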

That helped a little.

But then it made one thing painfully obvious: Manually adding IPs or IP address ranges is a lot of work and it often happens too late to stem the tide. So it was time to automate that: I coded an IMAP mailbox parser in Perl and let it run as a cronjob. It parses all emails in a separate IMAP folder, extracts the sender IP from the header, stores the header in SQL as evidence and automatically blacklists the offending IP in the RBL.
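The interesting part of such a parser is pulling the sender IP out of the headers. The real thing is written in Perl, so take the following Python sketch only as an illustration of the idea. It assumes that the connecting IP appears in square brackets in a Received header and that the topmost Received headers (the ones added by your own servers) are the trustworthy ones. Fetching the mails via IMAP and the SQL and RBL steps are left out, and the function name is hypothetical.

```python
import email
import ipaddress
import re
from typing import Optional

# Matches a bracketed IPv4 address such as "[88.88.1.2]" in a Received header.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def extract_sender_ip(raw_message: str) -> Optional[str]:
    """Return the first public IPv4 address found in the Received headers.

    Headers are walked top-down, so the hop recorded by our own MTA wins
    over anything further down that the sender may have forged.
    Assumption: internal hops use loopback or private addresses.
    """
    msg = email.message_from_string(raw_message)
    for received in msg.get_all("Received", []):
        for candidate in BRACKETED_IPV4.findall(received):
            try:
                ip = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # e.g. "[999.1.1.1]" is not a valid address
            if ip.is_global:
                return str(ip)
    return None
```

Skipping non-public addresses matters, because otherwise the local delivery hop (127.0.0.1) would end up in the RBL.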

That brought a small reprieve and the Tsunami's daily impact was about halved. It was still enough to royally rile me on a daily basis, though.

This not so nice person? He rents servers all over the place. Usually at cheap hosting places which have starter offers like "Your own VPS for $5/month". But he usually got servers that had 10 or more IPs. During any given day he seemed to have 3-5 of these servers in use and never sent more than two SPAMs from the same originating IP, always cycling through IPs. Once the hosters found out (and cared enough - some clearly don't!) and pulled the plug on that dude, he went on to greener pastures and was back the next day from a fresh box somewhere else.

I could clearly follow him around going through boxes at minor and major hosters in Germany, France, the Netherlands, Canada, the USA and other countries. So blacklisting him in the RBL simply wouldn't work well enough. It reduced the number of SPAMs if detected early, but only for that given day. Because the next day he would be back from somewhere else and on fresh IPs.

By now (two months into this ordeal) I had a very large corpus of thousands of SPAM emails archived, almost all of which were from that particular sender. So I rolled up my sleeves, dug into Perl and started parsing the SPAMs for any actionable patterns that would help to identify these SPAMs better. I extracted all IPs, domain names, email addresses and links from the message bodies and started nipping and tucking at them, looking for commonalities.
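For those who want to try this kind of corpus analysis themselves, here is a rough Python sketch of the domain-counting step. It is not my Perl code. The regular expressions are deliberately simple, the function names are made up, and it counts full hostnames rather than registered domains.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Deliberately simple patterns; real-world mail needs more care.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
EMAIL_PATTERN = re.compile(r"[\w.+-]+@([\w-]+(?:\.[\w-]+)+)")

def domains_in_message(text: str) -> set:
    """Collect the hostnames of all URLs and email addresses in one message."""
    domains = set()
    for url in URL_PATTERN.findall(text):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, e.g. unbalanced IPv6 brackets
        if host:
            domains.add(host.lower())
    for domain in EMAIL_PATTERN.findall(text):
        domains.add(domain.lower())
    return domains

def count_domains(messages) -> Counter:
    """Count in how many messages each hostname appears."""
    counts = Counter()
    for text in messages:
        # A set per message, so a domain counts once per SPAM.
        counts.update(domains_in_message(text))
    return counts
```

Calling count_domains(corpus).most_common(20) on a list of raw message texts then shows the usual suspects at a glance.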

For the curious who wonder whether SPF, DKIM or other Snake-Oil would help against this kind of SPAM: Wonder no more, I have the answer:

  • 60% of these SPAMs had valid SPF records.
  • >3% of these SPAMs had invalid SPF records.
  •  5% of these SPAMs had valid DKIM records.
  • >2% of these SPAMs had invalid DKIM records.

So DKIM and SPF are what they are: Snake-Oil. They only work if you believe in whatever positive effect you attribute to them in your own imagination.

Across the roughly 10,000 SPAMs, the number of distinct domains in the sender email addresses and in the URLs of the advertised product pages was considerably smaller than the total number of SPAMs. So this person was obviously operating from a limited number of domains. We're still talking about almost 1500 individual domains, though.

The common denominator? Almost every single one of them is registered with GoDaddy.com 

Well, this makes sense. Because if you've got so many domains and constantly (like every day) need to move them around? In that case you want an API that allows you to easily update the DNS for all your domains in one go, right? Also, if you're a cheap bastard, like things easy and love volume discounts, then having all eggs in one basket is a splendid idea.

Now how is this actionable and what can the AV-SPAM do to tackle this?

Let me put it this way: I had a little "Hold my beer!"-moment and went at it by extending Milter-GeoIP. It already examines emails during the connection stage and evaluates the headers as well, so that we can see where an email is from, run Geo-location on it, determine the recipient and keep track of the volumes. I simply extended it to do two WHOIS lookups: One on the IP of the sending mail-server and one on the domain name extracted from the sender's email address.

These two WHOIS records are then parsed. If the registrar is blacklisted (like GoDaddy.com!), the email is rejected at the MTA level when the WHOIS record of the domain(s) is too new or has been updated within a given (recent) timeframe. In my example I'm looking at emails from GoDaddy.com domains (and someone else I'm not mentioning) and will reject emails from domains that are registered there and are newer than seven days (or have changes more recent than that).
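Boiled down, the decision looks roughly like the Python sketch below. This is an illustration of the rule as described here, not the actual Milter-GeoIP code. The function name, the parameters and the substring match on the registrar name are assumptions, and the WHOIS lookup and parsing are assumed to have already happened.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional

def should_reject(
    registrar: Optional[str],
    created: Optional[datetime],
    updated: Optional[datetime],
    blacklisted_registrars: Iterable[str],
    now: Optional[datetime] = None,
    max_age_days: int = 7,
) -> bool:
    """Reject if the registrar is blacklisted AND the domain is too fresh.

    'Too fresh' means created or updated within the last max_age_days.
    All datetimes must be timezone-aware (UTC in this sketch).
    """
    if registrar is None:
        return False  # no usable WHOIS data: let other filters decide
    name = registrar.lower()
    # Assumption: a case-insensitive substring match is good enough.
    if not any(entry.lower() in name for entry in blacklisted_registrars):
        return False
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return any(
        stamp is not None and stamp > cutoff for stamp in (created, updated)
    )
```

Note that an old domain at a blacklisted registrar passes just fine, as long as its WHOIS record hasn't been touched recently. That keeps the collateral damage small.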

Here is a little grep on the maillog that just lists the rejected delivery attempts that were done based on the new WHOIS lookup integration of the AV-SPAM's new Milter-GeoIP:

All of these were rejected at the MTA level. The sending mail-servers were simply told: "No, thanks!" and then the door was slammed in their face, all based on a WHOIS-lookup that determined their domain(s) had too recent updates to their WHOIS records (or that the domains were too new in general).

With these measures in place I managed to reclaim my inbox. I'm not just back to the 2% of SPAMs that slipped through anyway. It's lower than that, because it also caught some fallout from places other than just this super-persistent SPAM sender in particular.

Where to go from here?

Obviously: This feature is good enough that it will find its way into the baseline AV-SPAM during the next release. It still needs some fine tuning and a GUI to configure the new options, but that's just eye candy. The basic functionality is good.

Parsing WHOIS records is messy, because not all registrars present the relevant info in the same standardized fashion. However, the regular expressions that are in place now catch the creation/modification dates for quite a few domain name registrars, so as it stands this already works with more than just a few registrars who provide easily parseable WHOIS records.
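To give an idea of what "messy" means, here is a small Python sketch that tries several field labels and date formats in turn. These are not the regular expressions from Milter-GeoIP. The label lists and date formats are just a handful of common variants that I picked for this example, and anything that doesn't match simply yields None.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

class WhoisInfo(NamedTuple):
    registrar: Optional[str]
    created: Optional[datetime]
    updated: Optional[datetime]

# A few label variants seen in WHOIS output; far from complete.
CREATED_LABELS = ("Creation Date", "Created", "Registered on")
UPDATED_LABELS = ("Updated Date", "Last Updated", "Changed", "Last Modified")
DATE_FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%d-%b-%Y")

def _field(label: str, text: str) -> Optional[str]:
    """Return the value of the first 'Label: value' line, if any."""
    pattern = rf"^[ \t]*{re.escape(label)}[ \t]*:[ \t]*(.+)$"
    match = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
    return match.group(1).strip() if match else None

def _date(labels, text: str) -> Optional[datetime]:
    """Try every label and every date format until one parses."""
    for label in labels:
        value = _field(label, text)
        if value is None:
            continue
        for fmt in DATE_FORMATS:
            try:
                # Assumption: timestamps without a zone are treated as UTC.
                return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
            except ValueError:
                continue
    return None

def parse_whois(text: str) -> WhoisInfo:
    """Extract registrar, creation date and update date from raw WHOIS text."""
    return WhoisInfo(
        registrar=_field("Registrar", text),
        created=_date(CREATED_LABELS, text),
        updated=_date(UPDATED_LABELS, text),
    )
```

Supporting one more registrar then usually means adding another label or date format to the tuples at the top.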

Additionally: The new Milter-GeoIP will not only allow you to block emails based on GeoIP location and WHOIS records. I also just extended the Geo-location feature to take the blacklisted countries' TLDs into account. This was done to stop SPAM from a persistent Ukrainian server that's hosted with a German ISP. So GeoIP would say: "Hey, this email from a Ukrainian domain is German, which is whitelisted!" Well, no more of that nonsense. Now it *can* (and in my case: will) fire on either of these two indicators. If we block Ukraine, that means both Ukrainian IPs as well as their TLDs - period.
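The combined check is simple enough to sketch in a few lines of Python. Again, this is an illustration and not the Milter-GeoIP code: the function name is made up, the GeoIP lookup is assumed to have already produced a two-letter country code, and the mapping from country code to TLD only handles the one exception I could think of right away.

```python
from typing import Iterable, Optional

# Country codes whose ccTLD differs from the ISO 3166-1 alpha-2 code.
CCTLD_EXCEPTIONS = {"GB": "uk"}

def is_blocked(
    ip_country: Optional[str],
    sender_domain: str,
    blocked_countries: Iterable[str],
) -> bool:
    """Fire if EITHER the GeoIP country OR the sender's ccTLD is blocked."""
    blocked = {code.upper() for code in blocked_countries}
    # Indicator 1: where the sending IP is located.
    if ip_country and ip_country.upper() in blocked:
        return True
    # Indicator 2: the country-code TLD of the sender domain.
    tld = sender_domain.rstrip(".").rsplit(".", 1)[-1].lower()
    blocked_tlds = {CCTLD_EXCEPTIONS.get(code, code.lower()) for code in blocked}
    return tld in blocked_tlds
```

With Ukraine on the blacklist, is_blocked("DE", "mail.example.ua", ["UA"]) fires even though the server sits in Germany.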

The AV-SPAM with the updated Milter-GeoIP (I think we need a new name for that now!) will soon be released and you can then test these new features yourself.

Do you have a particularly persistent SPAM sender that's getting on your nerves? Let us know and we'll see how we can help you with that problem.


Oct 13, 2017 Category: Development Posted by: mstauber