[funsec] 95% of User Generated Content is spam or malicious
drsollyp at drsolly.com
Sun Feb 14 16:11:36 CST 2010
I do a lot of stuff using regexes after the mail has got to me, but the
final despamming uses a weird old tip discovered by a mom. I sort the mail
alphabetically by subject.
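For anyone who wants to try the subject sort, here is a minimal sketch. The normalisation (dropping Re:/Fwd: prefixes, folding case) is an assumption added for illustration, and the function names are made up; sorting this way simply puts identical subjects next to each other so bulk runs stand out.

```python
import re

def normalize_subject(subject):
    """Strip reply/forward prefixes, collapse whitespace, fold case."""
    s = subject.strip()
    s = re.sub(r"^((re|fwd?)\s*:\s*)+", "", s, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", s).lower()

def sort_by_subject(subjects):
    """Return subjects in alphabetical order of their normalized form."""
    return sorted(subjects, key=normalize_subject)

if __name__ == "__main__":
    for line in sort_by_subject(["Re: Lunch", "Cheap meds", "cheap meds", "Agenda"]):
        print(line)
```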
On Sun, 14 Feb 2010, Rich Kulawiec wrote:
> On Wed, Feb 10, 2010 at 10:24:27PM -0500, Dave Paris wrote:
> > Where the trick (to the extent it's a trick, I suppose) lies here is
> > what it takes to knock down this volume.
> I use my firewalls/routers. Beginning with the DROP list, followed up
> by a large number of country, so-called ISP/webhost (spammer front,
> e.g., Eonix), so-called ESP (spammers, e.g., iContact, Uptilt) blocks.
> And I use passive OS recognition to treat anything that's running
> Windows differently -- because the odds are in the 10e4-10e6
> to 1 range that it's not a real mail server, depending on how I
> construct the metric.
> Then I enforce DNS/rDNS existence and consistency checks on the connecting
> host and the HELO parameter. Then I use a large set of rDNS patterns
> that's been very carefully developed to match non-mail-sending
> hosts (e.g. end-user systems) and refuse everything from them outright:
> real mail systems have real names, not generic ones.
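A rough sketch of the two checks described above, forward-confirmed rDNS and generic-name matching. This is not Rich's actual rule set: the two patterns and the function names are hypothetical stand-ins, and a real pattern list would be far larger and tuned against your own logs. Note that socket.gethostbyname_ex only handles IPv4.

```python
import re
import socket

# Hypothetical examples only: an embedded IPv4 address, or a common
# access-network keyword used as a hostname label.
GENERIC_RDNS = [
    re.compile(r"(?:\d{1,3}[-.]){3}\d{1,3}"),
    re.compile(r"(?:^|[.-])(?:a?dsl|dhcp|dyn|dynamic|pool|ppp|cable|dialup)(?:[.-]|\d)"),
]

def looks_generic(hostname):
    """True if the rDNS name matches one of the generic patterns."""
    name = hostname.lower().rstrip(".")
    return any(p.search(name) for p in GENERIC_RDNS)

def forward_confirmed(ip):
    """Return the PTR name if it resolves back to ip, else None."""
    try:
        name = socket.gethostbyaddr(ip)[0]          # reverse lookup
        addrs = socket.gethostbyname_ex(name)[2]    # forward lookup (IPv4)
    except OSError:                                 # no PTR or no A record
        return None
    return name if ip in addrs else None
```

A connecting host would be refused if forward_confirmed() returns None or looks_generic() is true for the name it returns.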
> Then I use a local blacklist of domains, sender LHS, senders, hosts,
> networks, etc. Then several DNSBLs. And a few other things.
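For reference, a DNSBL lookup is just a DNS query on the reversed address under the list's zone. A minimal sketch follows; "dnsbl.example" is a placeholder zone rather than a real list, the function names are made up, and treating any lookup failure as "not listed" is an assumption you may not want in production.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the query name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip, zone):
    """True if the query name resolves, i.e. the list has an entry."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:   # NXDOMAIN or lookup failure: treated as not listed
        return False
    return True
```

So 192.0.2.99 checked against dnsbl.example becomes a lookup of 99.2.0.192.dnsbl.example.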
> This is an extremely effective, efficient and very accurate setup.
> It's effective because it doesn't waste time trying to figure out if
> the same abusers who sent spam yesterday are sending some more today:
> of course they are. It's efficient because it rejects/accepts outright
> -- and when it rejects, it does so before seeing the message-body.
> It's accurate (a) because it's based solely on deterministic criteria
> (b) because I've been doing this for a very long time (sadly) and
> have learned a few things by now and (c) because it's correlated against
> my own mail logs, a necessary but seldom-performed step that helps me
> understand what the spam and non-spam components of my mail stream are.
> I've gone into considerably more detail about this on mailop, if you
> want the extended writeup, but the bottom line is that this kind of
> approach beats the pants off more complex/trendier ones in terms
> of performance, simplicity, resistance to attack, FP rate, FN rate,
> maintainability and predictability. But it does require pretty
> good knowledge of what *your* spam/not-spam mail mix looks like:
> you've got to understand it well before deploying something like this.
> Fun and Misc security discussion for OT posts.
> Note: funsec is a public and open mailing list.