Bad spam gets it
We all likely get a fair amount of unsolicited email (spam) these days, and often I hear people complaining — wishing that there was something that could be done to keep it from getting into their inboxes. Well, there is something that can be done (if you have some control over your email servers) and I’ll document our setup here.
Our email accounts are hosted at a server down in Texas in a place called Rackspace. I share my server with a bunch of other people who all rely on a guy named Michal to make sure things keep ticking. But even though I share the server, my email is processed specifically for my domains and that gives me some say over things. I exercise that “say” through the use of two things: procmail and SpamAssassin.
First, procmail. Procmail is a program on the server that can process my email in a variety of ways. I use it to hand off the incoming email to a secondary system and then either deliver it to me or throw it out based on what that system says. The procmail settings are found in a .procmailrc text file in the root of my account. Here is a copy of the ~/.procmailrc file:
:0fw
| spamc
## to delete spam with a score of 10 or higher:
:0
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*
/dev/null
What it says is: pass the email to our spam cop and, when it comes back, check the spam level. If the level is 10 or higher, send it to the trash. Otherwise, deliver it.
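If the row of backslashed asterisks looks cryptic, here is a rough Python sketch of the same test. It is only an illustration of what the condition matches, not how procmail works internally, and the function name is made up for this example.

```python
import re

# Same idea as the procmail condition: the header name followed by ten
# literal asterisks. A header with more than ten stars also matches,
# because the pattern only looks at the start of the value.
# procmail matches case-insensitively by default, hence IGNORECASE.
TEN_STARS = re.compile(r"^X-Spam-Level: \*{10}", re.MULTILINE | re.IGNORECASE)


def is_discarded(headers: str) -> bool:
    """Return True if the recipe would send this message to /dev/null."""
    return TEN_STARS.search(headers) is not None


# Hypothetical headers, just to exercise the function.
print(is_discarded("Subject: hello\nX-Spam-Level: ****\n"))
print(is_discarded("Subject: hello\nX-Spam-Level: " + "*" * 12 + "\n"))
```

The first call has only four stars, so the message would be delivered; the second has twelve, so it would be thrown out.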
The spam cop, our secondary system named SpamAssassin, gives each email a rating that corresponds to the likelihood of that email being spam. Low numbers on the scale relate to regular email and high numbers relate to spam. The settings for SpamAssassin are held in the ~/.spamassassin/user_prefs file, and ours looks like this:
# SpamAssassin config file for version 2.5x
# How many hits before a message is considered spam.
required_hits 5.0
# Whether to change the subject of suspected spam
rewrite_subject 0
# Text to prepend to subject if rewrite_subject is used
subject_tag *****SPAM*****
# Encapsulate spam in an attachment
report_safe 1
# Use terse version of the spam report
use_terse_report 0
# Enable the Bayes system
use_bayes 1
# Enable Bayes auto-learning
auto_learn 1
# Enable or disable network checks
skip_rbl_checks 0
use_razor2 1
use_dcc 1
use_pyzor 1
# Mail using languages used in these country codes will not be marked
# as being possibly spam in a foreign language.
# - english
ok_languages en
# Mail using locales used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_locales en
The syntax of this file is not something I know by heart; I created it using the form at SpamAssassin Configuration Generator. Likewise, I’m probably not the best person in the world to discuss the different options, but I think they are fairly self-explanatory. What you might notice in my combination setup is that any email that receives 5 or more hits but fewer than 10 will still come through the system to my inbox. The reason for this is that I’m still unsure of where my cutoff should be (it is kind of a personal thing). So, by doing this I can look at those ‘tweeners and see if they are actually spam. And in every case so far, they have been; no false positives. When they come through, the spam email is an attachment, and the email looks like this (from a recent email advert for low mortgage rates):
Spam detection software, running on the system "vanadium.sabren.com", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or block similar future email. If you have any questions, see the administrator of that system for details.
Content preview: ------=_NextPart_001_0001_C56DC178.DBAF4E13
Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding:
7bit Let the banks compete for your mortgage or loan! [...]
Content analysis details: (16.2 points, 5.0 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.3 NO_OBLIGATION          BODY: There is no obligation
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.5 HTML_20_30             BODY: Message is 20% to 30% HTML
 0.1 HTML_TAG_EXISTS_TBODY  BODY: HTML has "tbody" tag
 3.3 MSGID_FROM_MTA_SHORT   Message-Id was added by a relay
 1.1 RCVD_IN_NJABL_PROXY    RBL: NJABL: sender is an open proxy
                            [203.240.147.203 listed in dnsbl.njabl.org]
 0.1 RCVD_IN_SORBS          RBL: SORBS: sender is listed in SORBS
                            [203.240.147.203 listed in dnsbl.sorbs.net]
 0.1 RCVD_IN_NJABL          RBL: Received via a relay in dnsbl.njabl.org
                            [203.240.147.203 listed in dnsbl.njabl.org]
 1.1 RCVD_IN_SORBS_SOCKS    RBL: SORBS: sender is open SOCKS proxy server
                            [203.240.147.203 listed in dnsbl.sorbs.net]
 1.1 RCVD_IN_DSBL           RBL: Received via a relay in list.dsbl.org
                            [<http://dsbl.org/listing?ip=203.240.147.203>]
 2.2 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
                            [Blocked - see <http://www.spamcop.net/bl.shtml?203.240.147.203>]
 2.5 RCVD_IN_DYNABLOCK      RBL: Sent directly from dynamic IP address
                            [203.240.147.203 listed in dnsbl.sorbs.net]
 0.1 RCVD_IN_RFCI           RBL: Sent via a relay in ipwhois.rfc-ignorant.org
                            [203.240.147.203 has inaccurate or missing WHOIS]
                            [data at the RIR]
 1.1 FORGED_OUTLOOK_TAGS    Outlook can't send HTML in this format
 1.6 FORGED_MUA_OUTLOOK     Forged mail pretending to be from MS Outlook
The original message was not completely plain text, and may be unsafe to open with some email clients; in particular, it may contain a virus, or confirm that your address can receive spam. If you wish to view it, it may be safer to save it to a file and open it with an editor.
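If you wanted to tally a report like the one above yourself, a small Python sketch along these lines might do it. The function name and the three sample lines (copied from the report) are just for illustration; this is my own rough parsing, not anything SpamAssassin provides.

```python
import re

# A rule line starts with a score, then an upper-case rule name, then a
# free-text description. The bracketed detail lines and the header row do
# not start with a number, so they are skipped.
RULE_LINE = re.compile(r"^\s*(-?\d+\.\d+)\s+([A-Z][A-Z0-9_]*)\s+(.*)$")


def parse_rules(report: str):
    """Return (points, rule_name) pairs for every rule line in the report."""
    rules = []
    for line in report.splitlines():
        match = RULE_LINE.match(line)
        if match:
            rules.append((float(match.group(1)), match.group(2)))
    return rules


sample = """\
1.3 NO_OBLIGATION BODY: There is no obligation
3.3 MSGID_FROM_MTA_SHORT Message-Id was added by a relay
2.5 RCVD_IN_DYNABLOCK RBL: Sent directly from dynamic IP address
[203.240.147.203 listed in dnsbl.sorbs.net]
"""

rules = parse_rules(sample)
# Biggest contributors first.
for points, name in sorted(rules, reverse=True):
    print(f"{points:4.1f}  {name}")
print("subtotal:", round(sum(points for points, _ in rules), 1))
```

Sorting by points makes it easy to see which rules did most of the work.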
I think it is interesting to see what tips the system off to spam, and the report lays that out rule by rule. One thing that you don’t see is the added headers, like the one that says X-Spam-Flag: YES. I set up a rule in Entourage that automatically tags these emails as junk and files them in a junk folder. That way I can check on them later and see if we have any false positives, and they don’t sit in my inbox. Jill has the same setup. Actually, she has an even more complete set of rules that automatically sort mailing-list adverts (Eddie Bauer, Old Navy, etc.) into another folder. These are emails that she wants to read when she has time, but they have no business taking up space in her inbox.
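Putting the pieces together, here is a rough Python sketch of where an email ends up for a given score. The two thresholds come from the files above; the function name and return strings are my own, and I am assuming X-Spam-Level carries one star per whole point of score.

```python
REQUIRED_HITS = 5.0  # required_hits in user_prefs
DELETE_AT = 10       # number of stars the procmail recipe looks for


def fate(score: float) -> str:
    """Describe what happens to a message with the given spam score."""
    # Assumption: one star per whole point, so 9.9 still shows nine stars.
    stars = max(int(score), 0)
    if stars >= DELETE_AT:
        return "deleted on the server"
    if score >= REQUIRED_HITS:
        # Tagged with X-Spam-Flag: YES, then filed by the mail client rule.
        return "junk folder"
    return "inbox"


# Hypothetical scores, one for each band.
for score in (1.2, 7.4, 12.0):
    print(score, "->", fate(score))
```

The middle band is the one I review by hand while I decide where the cutoff should be.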
So, that’s how we deal with spam… what do you do?
written by Kevin in web stuff