bmf: Bayesian Mail Filter
March 29, 2008
I’ve been using bmf (Bayesian Mail Filter) lately to filter spam. Having previously used SpamAssassin, it’s obvious that bmf is a much smaller and more focused application. It’s written in C so it’s very fast and it’s also very easy to operate.
To give bmf some initial training, take two mbox files, one with spam and one with ham, and run the following:
cat spam.mbox | bmf -S
cat ham.mbox | bmf -N
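If you retrain often, the two commands above can be wrapped in a small shell function. This is only a sketch: the function name train_bmf and the BMF variable are my own inventions, not part of bmf, and the only bmf options used are the -S and -N shown above. It refuses to run if either mbox file is missing or empty, so a mistyped filename does not silently train on nothing.

```shell
#!/bin/sh
# Hypothetical helper: train bmf from a spam mbox and a ham mbox.
# BMF is an assumed override for the path to the bmf binary.
BMF="${BMF:-bmf}"

train_bmf() {
    spam_mbox="$1"
    ham_mbox="$2"

    # Refuse to continue if either file is missing or empty.
    for f in "$spam_mbox" "$ham_mbox"; do
        if [ ! -s "$f" ]; then
            echo "train_bmf: $f is missing or empty" >&2
            return 1
        fi
    done

    # Register the first file as spam, then the second as ham.
    "$BMF" -S < "$spam_mbox" || return 1
    "$BMF" -N < "$ham_mbox"
}
```

After sourcing this, train_bmf spam.mbox ham.mbox should behave like the two cat commands above.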
In the future, if bmf makes a mistake, just pipe the misclassified message to bmf -N (if it was really ham) or bmf -S (if it was really spam).
Next, bmf needs to process each message you receive. This is usually done using a .forward file to call procmail and a .procmailrc file which calls bmf. For example, here is my .forward file at SDF:
"|IFS=' '&&exec /usr/pkg/bin/procmail -f-||exit 75 #jrblevin"
If you have an SDF account, running nospam -e should set this up for you (but it will also set up other procmail rules which you might not want if you’re going to use bmf).
Now, once procmail gets the message, it should send it to bmf. Here’s my .procmailrc:
:0fw
| /usr/pkg/bin/bmf -p
:0:
* ^X-Spam-Status: Yes
/arpa/gm/j/jrblevin/mail/spam
This says to pipe the message to bmf and, if bmf declares that it’s spam (by setting the X-Spam-Status header), to save the message to ~/mail/spam. Adjust the bmf and mailbox paths accordingly.
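To check by hand what the second recipe would match, a rough shell equivalent of its condition is sketched below. The function name is_flagged_spam is hypothetical. It looks only at the header block (everything up to the first blank line) and matches case-insensitively, since procmail conditions ignore case by default.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the message on stdin has an
# X-Spam-Status header whose value starts with "Yes".
is_flagged_spam() {
    # sed stops at the first blank line, so only headers reach grep.
    sed '/^$/q' | grep -qi '^X-Spam-Status: Yes'
}
```

For example, with a single message saved in a file (message.eml is just a placeholder name), running bmf -p < message.eml | is_flagged_spam && echo spam should print "spam" only when bmf flags the message.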
If you use mutt, here are the keyboard shortcuts I use:
# Classify mail as spam or ham
macro index S "| bmf -S\n<save-message>=spam\n" "SPAM"
macro pager S "| bmf -S\n<save-message>=spam\n" "SPAM"
macro index H "| bmf -N\n" "HAM"
macro pager H "| bmf -N\n" "HAM"
Pressing S tells bmf that the selected message is spam and moves it to the spam folder. Pressing H classifies the message as ham.