This page is part of the EmailServer article.
SpamAssassin is a versatile and complete spam-fighting solution. It uses statistical techniques as well as external blacklists, and can be configured to use add-on tools to refine its detection.
SpamAssassin uses a rating system: each email goes through a list of tests, and every positive test adds to the number of spam points the message is allocated. Each test allocates a variable number of points, or a fraction of a point, depending on how useful and reliable it is at detecting spam.
Once all the tests have been performed, an action is taken based on the total number of points: if the score is high enough, we're sure that this is spam; if not, we can let the message through.
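As a rough illustration of the scoring logic described above, the following shell sketch sums per-test points and compares the total with a threshold. The test names and point values are made up for the example, and the threshold of 5.0 is an assumption to adjust to your own configuration; this is not how SpamAssassin is implemented internally.

```shell
#!/bin/sh
# Illustrative only: the test names and point values are hypothetical.

# Read "TEST_NAME points" lines on stdin and print the total score.
total_score() {
    awk '{ sum += $2 } END { printf "%.1f\n", sum }'
}

# $1 = total score, $2 = threshold; succeeds when score >= threshold.
is_spam() {
    awk -v s="$1" -v t="$2" 'BEGIN { exit !(s >= t) }'
}

score=$(printf '%s\n' \
    'HYPOTHETICAL_TEST_A 2.5' \
    'HYPOTHETICAL_TEST_B 0.8' \
    'HYPOTHETICAL_TEST_C 1.9' | total_score)

# 5.0 is an assumed threshold for this example.
if is_spam "$score" 5.0; then
    echo "score $score: spam"
else
    echo "score $score: ham"
fi
```

No single test decides the outcome here: each positive test only nudges the total, which is why one unreliable test rarely causes a false positive on its own.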
To install SpamAssassin, just use:
# yum -t install spamassassin
The last line tells your server which local network it can trust. This should be set to the IP range of your internal network.
Now, make sure SpamAssassin will run when we boot:
# chkconfig --levels 235 spamassassin on
# service spamassassin start
Initialise the Bayesian database:
# sa-learn --sync
Test our config by running:
# amavisd debug-sa
If the steps above were done properly, you should see debug: using "/var/amavis/.spamassassin/user_prefs" for user prefs file somewhere in the middle of the debug output spewed by Amavisd-New (scroll back or use the Shift+PageUp keys).
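Rather than scrolling back through the debug output, the relevant line can be filtered out. This is a minimal sketch: the helper name show_user_prefs is hypothetical, and the usage line assumes the debug output may arrive on either stdout or stderr.

```shell
#!/bin/sh
# Hypothetical helper: keep only the debug lines that mention the
# user prefs file, reading the debug output on stdin.
show_user_prefs() {
    grep 'user prefs file'
}

# Possible usage (assumption: debug output may be written to stderr,
# so both streams are merged before filtering):
#   amavisd debug-sa 2>&1 | show_user_prefs
```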
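To test the filter, just send an email with the following GTUBE string in the body:
XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X
This is a standard fake spam signature used to test antispam software. The mail log should then show an entry like this: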
Jan 23 15:23:12 white amavis: (28345-01) Blocked SPAM, MYNETS LOCAL [192.168.0.101] [192.168.0.101] <firstname.lastname@example.org> -> <email@example.com>, Message-ID: <4979704E.firstname.lastname@example.org>, mail_id: RKeqXrbI1RJJ, Hits: 998.56, size: 649, 273 ms
The email should be marked with such a high spam score that it never reaches its destination and is discarded.
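To pick the score out of log entries like the one above, a small filter can extract the value after "Hits:". This is a sketch only: the helper name extract_hits is hypothetical, the sample line is an abbreviated copy of the entry shown above, and the log path in the usage comment is an assumption that depends on your syslog setup.

```shell
#!/bin/sh
# Hypothetical helper: print the value following "Hits:" for each
# amavis log line read on stdin. Lines without a Hits field are skipped.
extract_hits() {
    sed -n 's/.*Hits: \([-0-9.]*\),.*/\1/p'
}

# Abbreviated copy of the log entry shown above.
sample='Jan 23 15:23:12 white amavis: (28345-01) Blocked SPAM, mail_id: RKeqXrbI1RJJ, Hits: 998.56, size: 649, 273 ms'
printf '%s\n' "$sample" | extract_hits

# Possible usage against a live log (the path is an assumption):
#   grep 'Blocked SPAM' /var/log/maillog | extract_hits
```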
SpamAssassin includes a powerful statistical (Bayesian) analysis that can help refine the score given to emails passing through it.
The only drawback of Bayesian analysis is that it needs a set of good (ham) and bad (spam) emails large enough to be accurate, and you need to sort these emails and train SpamAssassin manually.
To avoid any aggravation, I created a simple MissedSpam folder in one of the IMAP mail accounts that I use. I then simply have to move any spam that made its way to an inbox into that folder.
To teach SpamAssassin what is spam and what is ham, make sure you have enough sorted emails (between 150 and 3000) in each mailbox being trained, then issue the following:
# sa-learn --spam --sync /mail/postmaster/.MissedSpam/cur/
That trains SpamAssassin to recognise spam better. To train it on ham, run sa-learn against mailboxes that contain only legitimate mail:
# sa-learn --ham --sync /mail/emily/cur/
# sa-learn --ham --sync /mail/john/cur/
Make sure that the database files are still owned by the amavis user after training:
# chown amavis.amavis -R /var/amavis
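Before a training run, it can help to check that a folder actually holds enough messages. The sketch below counts the files in a Maildir cur directory (one file per message). The helper names are hypothetical, and the default minimum of 150 is simply the lower bound quoted in this article.

```shell
#!/bin/sh
# Hypothetical helpers for checking a Maildir folder before training.

# Count the regular files directly inside the given directory.
count_messages() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Succeed when the directory holds at least $2 messages (default 150).
has_enough_mail() {
    [ "$(count_messages "$1")" -ge "${2:-150}" ]
}

# Possible usage with the spam folder used in this article:
#   if has_enough_mail /mail/postmaster/.MissedSpam/cur/; then
#       sa-learn --spam --sync /mail/postmaster/.MissedSpam/cur/
#   fi
```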