Review of SpamBayes and
the SpamBayes Outlook Plug-In v0.81

SpamBayes is a Bayesian email spam filter, one of the new generation of spam filters which attempt to intelligently distinguish spam from legitimate email based on the characteristics and content of the email.

SpamBayes is also an Open Source project, which means you can't beat the value.  It's absolutely free!  Even if you wanted to pay for it, all you can do is donate to the Python Software Foundation, or contribute your skills if you're a programmer.  You can download SpamBayes from http://spambayes.sourceforge.net, and you can read all about the software and its background there too.

On its own, SpamBayes is not much use to perhaps 90% of PC users.  OK, that number is contrived, but on its own the product is a set of Python scripts and proxies for POP3, IMAP and Procmail.  There's a fair amount of setup effort involved, and submitting email to be categorized is not the most seamless operation.  It's not exactly user friendly for most folks.  But don't despair; that's where the Outlook Plug-In comes in!

The Outlook Plug-In for SpamBayes neatly packages all the required run-time components along with extensive hooks into Microsoft Outlook, and provides it in a simple installer that gets everything into place without drama and obeys Windows' usual Add/Remove Programs applet if you later wish to remove it.  Once installed, a new toolbar appears in your Microsoft Outlook with two or three context-sensitive buttons used for managing SpamBayes and for submitting emails to the categorizer.

Installation

Installation is a snap.  Most importantly, remember this product is for the full-fledged Outlook product, not Outlook Express.  There is a warning about making sure Word is closed if you use it as your email editor, but frankly it's a good idea to make sure all your programs are closed.  From there on it's just a matter of hitting the [Next] button a few times.

The first time you run Outlook after installing SpamBayes, a configuration wizard appears.  Unless you've been collecting junk email recently, you will probably choose the first option - "I haven't prepared for SpamBayes at all."  The wizard will offer to create two folders, "Junk E-Mail" and "Junk Suspects".  You may choose your own folder names if you prefer.  These folders are used to store email that has been categorized by SpamBayes.

The installation will have created the desired program folder (C:\Program Files\SpamBayes Outlook Addin\ by default) and a folder for its database and configuration settings in your Windows user profile folder.  The installer creates no icons on your desktop or in your Programs menu.  The only evidence of its presence is its toolbar in Outlook and an entry in Windows' Add/Remove Programs control panel applet.

Usage

Although SpamBayes functions immediately based on some built-in assumptions, it really needs to be trained to be an effective, personalized tool.  Optionally you may also make adjustments to the balancing act it performs when categorizing your email.
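
As a rough illustration of that balancing act, here is a minimal Python sketch of a three-way split driven by two cutoffs.  The function name, the 0-to-1 score scale and the cutoff values are all assumptions made for this example; they are not taken from the plug-in's actual settings.

```python
def choose_folder(score, spam_cutoff=0.90, unsure_cutoff=0.20):
    """Map a spam score in 0..1 to a destination folder.

    Hypothetical helper: the cutoff values are illustrative only.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= spam_cutoff:
        return "Junk E-Mail"      # confident enough to call it spam
    if score >= unsure_cutoff:
        return "Junk Suspects"    # not sure; leave it for a human to review
    return "Inbox"                # treated as ham


if __name__ == "__main__":
    for s in (0.05, 0.50, 0.97):
        print(s, "->", choose_folder(s))
```

Raising the spam cutoff sends fewer messages straight to the junk folder and more to the suspects folder; lowering the unsure cutoff does the opposite for the inbox.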

Training amounts to collecting some "ham" (desirable, legitimate email) in one folder, collecting some "spam" (junk email) in another folder, and feeding it all to SpamBayes using the SpamBayes Manager tool.  Alternatively, you can train SpamBayes incrementally by using the [Delete As Spam] and [Recover From Spam] buttons on emails as they come in.
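
To give a feel for what training on ham and spam means, here is a much-simplified Python sketch of a word-counting classifier.  It is not SpamBayes' actual algorithm or API: the class name TinyFilter, its methods and the sample messages are invented for illustration, and it assumes equal prior odds of ham and spam.

```python
import math
from collections import Counter


class TinyFilter:
    """Toy word-count classifier (hypothetical; not the SpamBayes API)."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.messages = {"ham": 0, "spam": 0}

    def train(self, text, label):
        # Count each distinct word once per message.
        self.counts[label].update(set(text.lower().split()))
        self.messages[label] += 1

    def spam_score(self, text):
        # Add up per-word log-odds, with add-one smoothing so that
        # unseen words never cause a division by zero.
        log_odds = 0.0
        for word in set(text.lower().split()):
            p_spam = (self.counts["spam"][word] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][word] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Squash the total into a 0..1 score.
        return 1 / (1 + math.exp(-log_odds))


if __name__ == "__main__":
    f = TinyFilter()
    f.train("cheap pills buy now", "spam")
    f.train("meeting agenda for monday", "ham")
    print(f.spam_score("buy cheap pills"))   # leans towards spam
    print(f.spam_score("monday meeting"))    # leans towards ham
```

Every message you feed it shifts the word counts, which is why the same filter can end up with different opinions for different users.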

The training allows for some subjectivity in the classification of spam.  If you decide you really do want to see ads for penis enlargement, then by all means, you can!  And if you decide that all BCC'd email from your cousin Ralph is junk, SpamBayes can be trained to know that too.

Pros & Cons

SpamBayes installs and uninstalls without fanfare, and appears to do no harm.  If you over-train SpamBayes, it will slow down as the database becomes bloated.  Although you can't incrementally un-train it or pack the database, it's easy enough to start again from scratch.

Outlook 2000 or later is pretty much all this plug-in needs.  It does not matter whether you use POP or IMAP or connect to an Exchange server.

Outlook still makes its usual noises and plants the envelope icon in your Windows System Tray even when SpamBayes has classified an incoming email as junk and moved it to the specified folder.  So for those of you whose workflow revolves around your inbox activity, errant disruptions will continue.

There's no evidence of SpamBayes' database being designed for multiuser access.  So if you use Outlook from more than one place, you may have to train each installation individually, or share out the database folder, or come up with a creative way to replicate the database.

SpamBayes won't delete your email.  But that's probably a good thing since it's not foolproof.  If you're hell-bent on letting your email quietly disappear you can design a rule to do it, but I don't think it's wise.

SpamBayes does not work very well with languages other than English.

Effectiveness

I installed SpamBayes both at home and at the office.  A fairly large amount of spam comes to both email addresses, due mostly to my foolishly naive use of those addresses on Usenet many years ago.  The big difference between the two is the first line of defense.  At home, my email comes into an Exchange server running Vamsoft's Open Relay Filter, which checks email against six different DNSBLs (DNS-based blocklists) before letting it through.  At work, the email is subjected to a crude conglomeration of manually maintained IP, domain and text filters, which quietly discards anything it doesn't like.
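
For the curious, a DNSBL check is just a DNS lookup: reverse the octets of the sending server's IPv4 address, append the blocklist's zone, and see whether the name resolves.  The Python sketch below shows the general idea only.  It is not how Open Relay Filter is implemented, the function names are made up, and dnsbl.example.org is a placeholder zone rather than a real blocklist.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Return True if the lookup resolves, which DNSBLs use to mean 'listed'.

    Any resolution failure is treated as 'not listed' in this sketch.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Placeholder zone and a documentation-range address.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A mail server that consults several blocklists simply repeats the lookup with each zone and rejects the message if any of them answers.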

As first lines of defense go, the DNSBLs are truly the way to go.  I receive ten to twenty times the spam at work that I receive at home, after the respective filtering.  The interesting part here may actually boil down to the characteristics of the email that makes it through that first line of defense.

After two months of training, nearly all the spam coming to my personal email address is properly categorized.  Yet 35% of the spam coming into the office email address still lands in the "suspected spam" folder and every once in a while I still see legitimate email landing in the "spam" or "suspected spam" folders.  The training database at the office keeps getting fatter but accuracy does not seem to be improving.

One would think that with at least 10 times the volume of spam, the office setup would be more accurately trained than the one at home.  But - and this is truly just an assumption - it seems that the text filters used there as a front-line defense may actually be hobbling SpamBayes' categorizer, because it never gets to train on some of the more conspicuous, easily identifiable spam.

Or perhaps the distinction between work-related email and absolute garbage is too vague!

Conclusion

Bayesian filtering is still in its relative youth.  Likewise, SpamBayes makes no pretense to be a mature product.  It is still in flux, the authors experimenting with various theories and watching how the spammers respond.  But for a product with a fractional version number it seems to be as good as or better than many shareware and commercially available competitors.

As with any anti-spam measures, one needs to keep a sharp eye not to lose legitimate email.  SpamBayes does a terrific job of catching the few emails that make it through my gauntlet of DNSBL checks at home, and has turned my office inbox from an unusable littered wasteland back to a genuinely useful work tool.


Entire contents Copyright (C) 1994-2015 Brad Berson and Bytebrothers Internet Services
Page updated February 12, 2009.  See Terms and Conditions of use!