Setting up a Bogofilter Filter

Bogofilter is a mail filter that identifies spam. We currently have version 0.92.6 installed. If you want to use bogofilter to filter spam out of your EECG mail, follow these steps:

Important Note: bogofilter may sometimes flag legitimate mail as spam. From time to time, you should examine the mail that bogofilter saves in the caughtspam folder. You should never configure your mail filters to automatically discard mail that bogofilter detects as spam. See below for more details on how to deal with "false positives" and "false negatives".

Re-Training bogofilter: Inevitably, you'll get "false positives" (legitimate mail flagged as spam) and "false negatives" (spam that is not caught). You should examine your caughtspam folder from time to time to look for "false positives". "False negatives" will show up in your regular inbox. You can deal with both by periodically "re-training" bogofilter. Save the false negatives and the false positives into two separate files and do the training as follows from the Unix command line:
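
A minimal sketch of what such a retraining run can look like, not necessarily the exact commands used on our systems. It assumes that the two files are in mbox format (hence -M), that the messages were already registered when they were first filtered (for example because bogofilter runs with -u), and that the file names are placeholders; retrain_bogofilter is a hypothetical helper name. The option -Ns unregisters messages as non-spam and registers them as spam, and -Sn does the reverse.

```shell
# Hypothetical helper: re-train bogofilter from two mbox files.
#   $1 = false negatives (spam that reached the inbox)
#   $2 = false positives (legitimate mail found in caughtspam)
retrain_bogofilter() {
    false_negatives=$1
    false_positives=$2

    # -M: read the input as an mbox file holding several messages.
    # -Ns: unregister as non-spam, then register as spam.
    bogofilter -M -Ns < "$false_negatives" || return 1

    # -Sn: unregister as spam, then register as non-spam.
    bogofilter -M -Sn < "$false_positives" || return 1
}
```

It would be called as "retrain_bogofilter false_negatives false_positives". If the messages were never registered in the first place, plain -s (for the false negatives) and -n (for the false positives) would be the options to use instead.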

The above helps to retrain bogofilter to better distinguish legitimate email from spam. In my experience, it takes a few weeks of diligent retraining to get good results.

Tuning Bogofilter: I found that I got better results if I tuned bogofilter to be a bit more aggressive than it would be by default. Of course, this required looking for "false positives" a bit more frequently at first. You can tune bogofilter by explicitly setting a few parameters for its operation in the file ~/.bogofilter.cf. The following contents for ~/.bogofilter.cf worked for me:

            db_cachesize=4
            robx=0.375672
            min_dev=0.375
            robs=0.1778
            spam_cutoff=0.500
            ham_cutoff=0.450
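
As a rough illustration of what the two cutoffs mean, the sketch below mimics how a spam score is sorted into three classes, as I understand the documented behavior: a score at or above spam_cutoff is spam, a score at or below ham_cutoff is ham, and anything in between is unsure. classify_score is a hypothetical helper that does not call bogofilter at all; the thresholds are simply copied from the settings above.

```shell
# Hypothetical helper: classify a score using the cutoffs from ~/.bogofilter.cf above.
# This only illustrates the thresholds; it does not run bogofilter.
classify_score() {
    awk -v score="$1" -v spam_cutoff=0.500 -v ham_cutoff=0.450 'BEGIN {
        s = score + 0                      # force a numeric comparison
        if (s >= spam_cutoff)      print "Spam"
        else if (s <= ham_cutoff)  print "Ham"
        else                       print "Unsure"
    }'
}
```

With these settings the "unsure" band is narrow (between 0.450 and 0.500), which is part of what makes the filter more aggressive: most messages end up classified as either spam or ham.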

If you need clarification, let me know. If you need more details, a man page is installed (man bogofilter), or you can look up the FAQ.