
How to Integrate the Bayesian Classifier in SpamAssassin on CentOS Web Panel

The Bayesian classifier in SpamAssassin tries to identify spam by looking at tokens: words or short character sequences that are commonly found in spam or in ham (legitimate mail). If I hand sa-learn 100 messages that contain the phrase "penis enlargement" and tell it they are all spam, then when a 101st message arrives containing the words "penis" and "enlargement", the classifier will be fairly confident that the new message is spam and will raise its spam score.
For SpamAssassin to be accurate, you must train it on your own mail patterns. The sa-learn command is the interface for this: it lets you teach the Bayesian classifier which messages are good mail and which are junk.
You need to train with both spam and ham. Training on only one type of mail has no effect.

To collect mail for training:
  • Save spam into a new mail folder called Spam
  • Save non-spam (ham) into a new folder called Ham. You may also put messages that were marked as spam by mistake into this folder.
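
Once both folders contain mail, they can be passed to sa-learn. The following is a minimal sketch rather than a definitive procedure. It assumes the folders are stored as directories of message files (for example Maildir folders); the paths in the example call, the SA_LEARN variable, and the train_bayes function are illustrative names, not part of SpamAssassin.

```shell
#!/bin/sh
# Sketch: train the Bayes database from a spam folder and a ham folder.
# SA_LEARN and train_bayes are illustrative names (assumptions).
SA_LEARN="${SA_LEARN:-sa-learn}"

train_bayes() {
    spam_dir="$1"
    ham_dir="$2"

    # Both classes are needed, so refuse to train if either folder is missing.
    [ -d "$spam_dir" ] || { echo "missing spam folder: $spam_dir" >&2; return 1; }
    [ -d "$ham_dir" ]  || { echo "missing ham folder: $ham_dir" >&2; return 1; }

    # Learn the spam folder first, then the ham folder.
    "$SA_LEARN" --spam "$spam_dir" || return 1
    "$SA_LEARN" --ham "$ham_dir" || return 1
}

# Example call (hypothetical Maildir paths):
# train_bayes /home/user/Maildir/.Spam/cur /home/user/Maildir/.Ham/cur
```

Run it as the same user that SpamAssassin uses for its Bayes database, otherwise the messages are learned into a different database.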

If you're having trouble with Bayes, see BayesFaq for help.
SpamAssassin default configuration:
# vi /etc/mail/spamassassin/local.cf 
required_hits 5
report_safe 0
#required_score 5
rewrite_header Subject [SPAM]
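Explanation:
required_hits 5 is the score a message must reach to be marked as spam. A value of 5 suits a small mail server; you can raise it if you need fewer false positives.
report_safe is set to 0 here, but you can change it to 1 or 2 (see the SpamAssassin documentation).
rewrite_header Subject sets the tag added to the subject of spam messages. It is [SPAM] here, but you can use any text you like.
required_score is the current name for required_hits, so enable only one of the two. It is usually set somewhere between 0 and 5, depending on how aggressive you want the filter to be.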
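
Because local.cf above contains both required_hits and a commented-out required_score, it can help to check which value is actually active. The helper below is a small sketch: active_setting is a hypothetical function name, and it assumes that a later line for the same setting overrides an earlier one.

```shell
#!/bin/sh
# Sketch: print the active (uncommented) value of a setting in a config file.
# active_setting is a hypothetical helper, not a SpamAssassin command.
active_setting() {
    key="$1"
    file="$2"
    # Skip comment lines, match the key as the first word,
    # and keep the value from the last matching line.
    awk -v key="$key" '
        /^[[:space:]]*#/ { next }
        $1 == key { $1 = ""; sub(/^[[:space:]]+/, ""); value = $0 }
        END { if (value != "") print value }
    ' "$file"
}

# Example: active_setting required_hits /etc/mail/spamassassin/local.cf
```

The function prints nothing when the setting is missing or commented out, which is the case for required_score in the file shown above.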

Create the Bayes Directory
# cd /etc/mail/spamassassin/
# mkdir bayes
# chmod g+rws bayes/
# chmod -R 775 bayes/ [if needed]
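
The directory steps above can also be done in one command. This is a sketch under the assumption that GNU install is available (as on CentOS); make_bayes_dir is a hypothetical helper name, and the default path is the one used in this article. It does not change the group owner, which depends on the user your spamd runs as.

```shell
#!/bin/sh
# Sketch: create the Bayes directory with group-writable, setgid permissions.
# make_bayes_dir is a hypothetical helper name.
make_bayes_dir() {
    dir="${1:-/etc/mail/spamassassin/bayes}"
    # 2775 = setgid bit + rwx for owner and group, r-x for others.
    # This is roughly what mkdir, chmod g+rws and chmod 775 do together.
    install -d -m 2775 "$dir"
}

# Example (run as root): make_bayes_dir
```

Running it again on an existing directory is harmless; it only reapplies the mode.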
Configure the Bayesian classifier in SpamAssassin:
# vi /etc/mail/spamassassin/local.cf 
 use_bayes 1
 use_bayes_rules 1
 bayes_auto_learn 1
 bayes_path /etc/mail/spamassassin/bayes/
 bayes_file_mode 0660

 bayes_ignore_header X-Bogosity
 bayes_ignore_header X-Spam-Flag
 bayes_ignore_header X-Spam-Status
 include /usr/share/spamassassin/
Update the SpamAssassin rules and restart the service
# sa-update
# systemctl restart spamassassin
Check the SpamAssassin status
# systemctl status spamassassin
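If the status output shows an error related to the Bayes database path, comment out the bayes_path line in /etc/mail/spamassassin/local.cf so that SpamAssassin falls back to the default bayes_path. One likely cause is that bayes_path expects a filename prefix (for example /etc/mail/spamassassin/bayes/bayes) rather than a directory with a trailing slash.
#bayes_path /etc/mail/spamassassin/bayes/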
Then create new Bayes databases
# sa-learn --sync
# sa-learn -p /etc/mail/spamassassin/local.cf --sync
If you followed the earlier tutorials in this series, your configuration should now match the settings above.
Copy the existing bayes databases to Spamassassin Bayes directory:
# cp /root/.spamassassin/bayes_* /etc/mail/spamassassin/bayes/
More commands (optional)
# sa-learn -p /etc/mail/spamassassin/local.cf --rebuild --force-expire
# sa-learn -p /etc/mail/spamassassin/local.cf --sync --force-expire
Test that Bayes works
# spamassassin -D --lint
# spamassassin -D -p /etc/mail/spamassassin/local.cf --lint
Troubleshooting 
# sa-learn --dump magic 
# ss -tnlp | grep spamd
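
A common reason for Bayes appearing to do nothing is too little training: by default SpamAssassin only uses the classifier after it has learned at least 200 spam and 200 ham messages (bayes_min_spam_num and bayes_min_ham_num). The sketch below pulls those two counts out of the sa-learn --dump magic output. bayes_counts is a hypothetical helper name, and it assumes the usual dump layout in which the value is the third column of the "non-token data: nspam" and "non-token data: nham" lines.

```shell
#!/bin/sh
# Sketch: read the learned spam/ham message counts from `sa-learn --dump magic`.
# bayes_counts is a hypothetical helper; it reads the dump on stdin.
bayes_counts() {
    awk '
        /non-token data: nspam$/ { nspam = $3 }
        /non-token data: nham$/  { nham = $3 }
        END { printf "nspam=%d nham=%d\n", nspam, nham }
    '
}

# Example: sa-learn --dump magic | bayes_counts
```

If either number is below the minimum, keep training with sa-learn before expecting Bayes rules to affect scores.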
For more information, refer to the links below:
http://spamassassin.sourceforge.net/doc/sa-learn.html
https://sourceforge.net/p/webadmin/mailman/webadmin-list/thread/1088762157.18480@fudu.home/
