June Spam

Jul. 26th, 2004 08:51 pm
[personal profile] womzilla
For the last few years, I've chosen one month to be my spam survey month. This year, I shifted from July to June because I had hoped to use my data to train a new spam-filtering system by, say, July 4th, but I haven't done so yet.

Anyway, here are the numbers:

Total e-mails received on my Panix account, June 2004: 8088

Pieces automatically tagged as spam by SpamAssassin: 4575
Other pieces of spam or e-mail malicious code: 750
Mail from one or another of my high-volume mailing lists: 1153

Number of e-mails actually directed more or less at me: 1610. This includes some lower-volume mailing lists, "acceptable" advertisements (e.g., monthly updates from the Quality Paperback Club), and other mail not actually written specifically for me.

That means that on a typical day, I got 79 pieces of unfiltered e-mail in my mailbox, of which one-third (25 a day) were spam or maliceware. In addition, I got about twice as many pieces of tagged spam (roughly 150 a day) as that unfiltered total, which I never had to look at. About 14% of the spam I received was not caught by SpamAssassin. That's a much lower success rate for SpamAssassin than last year, when it caught more than 95% of the spam and maliceware.
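
For anyone who wants to check the arithmetic, here is a rough sketch in Python using only the counts listed above. It assumes a 30-day June and assumes that "unfiltered" means the untagged spam plus the mail directed at me, with the high-volume lists sorted elsewhere; the variable names are made up for illustration.

```python
# Monthly counts quoted in the post (June 2004).
total = 8088
tagged_spam = 4575   # tagged automatically by SpamAssassin
missed_spam = 750    # spam or malicious code that was not tagged
list_mail = 1153     # high-volume mailing lists
direct_mail = 1610   # mail more or less directed at the author
days = 30            # assumption: June has 30 days

# The four categories should account for every message.
assert tagged_spam + missed_spam + list_mail + direct_mail == total

# Assumption: "unfiltered" mail is untagged spam plus directed mail.
unfiltered_per_day = (missed_spam + direct_mail) / days
missed_per_day = missed_spam / days
tagged_per_day = tagged_spam / days

# Share of all spam that SpamAssassin failed to tag.
miss_rate = missed_spam / (tagged_spam + missed_spam)

print(f"unfiltered per day: {unfiltered_per_day:.1f}")
print(f"untagged spam per day: {missed_per_day:.1f}")
print(f"tagged spam per day: {tagged_per_day:.1f}")
print(f"miss rate: {miss_rate:.1%}")
```

Under those assumptions the unfiltered figure comes out just under 79 a day and the miss rate a little over 14%.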

Date: 2004-07-26 05:54 pm (UTC)
From: [identity profile] montoya.livejournal.com
SA on Panix has gone noticeably downhill (and since it's the latest release, I assume it's a general SA problem), but if you set the scores lower (call it spam at 3.8 or so, say), it helps significantly.

Date: 2004-07-26 06:23 pm (UTC)
From: [identity profile] womzilla.livejournal.com
That would have helped some, but not as much as I would like. Here are some more numbers about my certified spam:



So if I set my spam threshold at 4.0, I still get 360 pieces of untagged spam. That's a 6% failure rate, which is close to where it was last year. Dropping it to 3 points brings me to a 4% failure rate with no false positives; I think that's the way to go.

Date: 2004-07-26 06:33 pm (UTC)
From: [identity profile] montoya.livejournal.com
It's still not great, don't get me wrong; I'm not sure there's any way to get around that, with spammers actively targeting anti-spam mechanisms. That said, when you're considering changes to the spam cutoff point, it's worth thinking of it in relative, rather than absolute, terms. Sure, filtering out another 300 of your 5284 spams isn't a huge difference, but cutting inbox-reaching spam from 660 to 360 is a huge improvement.
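
To put rough numbers on the relative-versus-absolute point, here is a small Python sketch using the three figures quoted in this comment (5284, 660, 360). The variable names are just for illustration.

```python
# Figures quoted in the comment above.
total_spam = 5284     # all spam received in the month
inbox_before = 660    # spam reaching the inbox at the current cutoff
inbox_after = 360     # spam reaching the inbox at the lower cutoff

extra_caught = inbox_before - inbox_after

# Absolute view: the extra catches as a share of all spam.
share_of_all_spam = extra_caught / total_spam

# Relative view: the reduction in spam that actually reaches the inbox.
inbox_reduction = extra_caught / inbox_before

print(f"extra spam caught: {extra_caught}")
print(f"share of all spam: {share_of_all_spam:.1%}")
print(f"reduction in inbox spam: {inbox_reduction:.1%}")
```

The same 300 messages are under 6% of the total spam but about 45% of what you would otherwise have to delete by hand.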

Ironically enough, my main time sink with spam is now glancing through my spambox to make sure I don't have any false positives, so I've ended up filtering email with 7+ scores into a super-spam folder that I barely even glance at before deleting (it's basically /dev/null with a bit of fault tolerance) just to keep the regular spambox manageable.
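
As a sketch of that two-tier setup, the routing logic might look something like the Python below. This is not the actual mail filter configuration; the folder names and the `triage` function are hypothetical, the 3.8 cutoff is the one suggested earlier in this thread, and 7.0 is the "7+" super-spam line.

```python
def triage(score, spam_cutoff=3.8, super_cutoff=7.0):
    """Pick a folder for a message from its spam score.

    Hypothetical helper: folder names and default cutoffs are
    illustrative assumptions, not an existing tool's settings.
    """
    if score >= super_cutoff:
        return "super-spam"   # barely glanced at before deleting
    if score >= spam_cutoff:
        return "spam"         # checked by hand for false positives
    return "inbox"


# Example: a few made-up scores and where they would land.
for score in (1.2, 3.8, 5.5, 7.0, 12.4):
    print(score, triage(score))
```

The point of the split is that only the middle band, between the two cutoffs, needs a human to look through it.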
