How to Integrate the Bayesian Classifier in SpamAssassin on CentOS Web Panel

The Bayesian classifier in SpamAssassin tries to identify spam by looking at tokens: words or short character sequences that are commonly found in spam or in ham (legitimate mail). If I hand 100 messages containing the phrase "penis enlargement" to sa-learn and tell it they are all spam, then when the 101st message arrives with the words "penis" and "enlargement", the Bayesian classifier will be fairly sure that the new message is spam and will increase its spam score.
Bayesian classifier

In order for SpamAssassin to be accurate, you must train it on your specific mail patterns. SpamAssassin has a Bayesian classifier that can be used to help refine the classification of spam mail. The sa-learn interface allows you to train SpamAssassin to recognize good mail and junk mail.
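You need to train with both spam and ham mail. Training with only one type has no effect: by default the Bayes rules are not applied until at least 200 spam and 200 ham messages have been learned.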

To filter for spam:
  • Save spam into a new mail folder called Spam
  • Save non-spam (ham) into a new folder called Ham. You may also put messages that were marked as spam by mistake into this folder.
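Once both folders hold mail, they can be fed to sa-learn. The following is a minimal sketch, assuming Maildir-style folders; the paths in SPAM_DIR and HAM_DIR are hypothetical and must be changed to wherever your mail client stores the Spam and Ham folders. Only the `--spam` and `--ham` options of sa-learn are used.

```shell
#!/bin/sh
# Hypothetical folder locations (assumption): adjust to your mailbox layout.
SPAM_DIR="${SPAM_DIR:-$HOME/Maildir/.Spam/cur}"
HAM_DIR="${HAM_DIR:-$HOME/Maildir/.Ham/cur}"
# Command used for learning; overridable so the function can be dry-run.
SA_LEARN="${SA_LEARN:-sa-learn}"

train_bayes() {
    # Both classes are needed, so stop early if either folder is missing.
    for d in "$SPAM_DIR" "$HAM_DIR"; do
        [ -d "$d" ] || { echo "missing folder: $d" >&2; return 1; }
    done
    # Learn the spam folder first, then the ham folder.
    "$SA_LEARN" --spam "$SPAM_DIR" &&
    "$SA_LEARN" --ham "$HAM_DIR"
}
```

Call `train_bayes` after sourcing the file. Setting `SA_LEARN=echo` first prints the commands instead of running them. For mbox files, sa-learn also accepts `--mbox` together with `--spam` or `--ham`.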
If you're having trouble with Bayes, see the BayesFaq page on the SpamAssassin wiki for help.
SpamAssassin default configuration:
# vi /etc/mail/spamassassin/local.cf 
required_hits 5
report_safe 0
#required_score 5
rewrite_header Subject [SPAM]
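Explanation:
required_hits 5 is the score a message must reach to be tagged as spam. 5 is the default and suits a small mail server; raise it for less aggressive filtering, or lower it toward 0 for more aggressive filtering.
report_safe is 0 here, so only the headers of a spam message are modified. Set it to 1 to deliver the original as a message/rfc822 attachment, or to 2 to attach it as text/plain (see the SpamAssassin guides).
rewrite_header Subject prepends a tag to the subject of spam. It is [SPAM] here; you can use any text you want.
required_score is the current name of required_hits (renamed in SpamAssassin 3.0), which is why it is commented out above. Set only one of the two.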
Create the Bayes Directory
# cd /etc/mail/spamassassin/
# mkdir bayes
# chmod g+rws bayes/
# chmod -R 775 bayes/ 

# chown -R root:nobody bayes
# chmod -R 777 bayes [if needed]
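The directory steps above can be collected into one repeatable function. This is a sketch under stated assumptions, not the author's exact procedure: it uses `chgrp` instead of `chown root:nobody` (so the owner stays whoever runs it, normally root) and sets mode 2775 instead of 777. The BAYES_DIR and BAYES_GROUP defaults mirror the article and can be overridden.

```shell
#!/bin/sh
# Defaults taken from the article; override via the environment if needed.
BAYES_DIR="${BAYES_DIR:-/etc/mail/spamassassin/bayes}"
BAYES_GROUP="${BAYES_GROUP:-nobody}"

make_bayes_dir() {
    # mkdir -p succeeds even when the directory already exists.
    mkdir -p "$BAYES_DIR" || return 1
    # Hand the directory to the scanner's group (root is needed for a group
    # the calling user does not belong to).
    chgrp "$BAYES_GROUP" "$BAYES_DIR" || return 1
    # 2775 = rwxrwsr-x: group-writable, with setgid so that new database
    # files inherit the directory's group.
    chmod 2775 "$BAYES_DIR"
}
```

Run `make_bayes_dir` as root, then check the result with `ls -ld /etc/mail/spamassassin/bayes`.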
I am actually using 777 permissions recursively on the bayes directory, with the user and group shown above, and it works with MailScanner. Note that 777 makes the directory world-writable, so try 775 first.
Configure the Bayesian classifier in SpamAssassin:
# vi /etc/mail/spamassassin/local.cf 
 use_bayes 1
 use_bayes_rules 1
 bayes_auto_learn 1
 bayes_path /etc/mail/spamassassin/bayes/
 bayes_file_mode 0660

 bayes_ignore_header X-Bogosity
 bayes_ignore_header X-Spam-Flag
 bayes_ignore_header X-Spam-Status
 include /usr/share/spamassassin/
Update SpamAssassin rules & restart
# sa-update
# systemctl restart spamassassin
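Rule updates only take effect once the daemon is restarted, so the two steps are often combined. Below is a minimal sketch of a wrapper that restarts only when sa-update actually installed new rules. It relies on the documented sa-update exit codes (0 = update installed, 1 = no fresh updates, higher values = error). The SA_UPDATE and RESTART_CMD variables are illustrative names, not SpamAssassin settings.

```shell
#!/bin/sh
# Overridable commands (illustrative variable names, not part of SpamAssassin).
SA_UPDATE="${SA_UPDATE:-sa-update}"
RESTART_CMD="${RESTART_CMD:-systemctl restart spamassassin}"

update_rules() {
    # Capture the exit status without tripping `set -e` in a calling script.
    "$SA_UPDATE" && rc=0 || rc=$?
    case "$rc" in
        0)  echo "rules updated, restarting"
            # Unquoted on purpose so the command string splits into words.
            $RESTART_CMD ;;
        1)  echo "no new rules" ;;
        *)  echo "sa-update failed with exit code $rc" >&2
            return "$rc" ;;
    esac
}
```

Calling `update_rules` from a daily cron job keeps the rules current without restarting the service when nothing changed.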
Check Spamassassin status
# systemctl status spamassassin
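If the status output reports an error related to bayes_path, comment out the bayes_path line in /etc/mail/spamassassin/local.cf so that SpamAssassin uses the default bayes_path. A likely cause is that bayes_path expects a directory plus a file-name prefix (the default is ~/.spamassassin/bayes, which yields bayes_toks, bayes_seen and so on) rather than a bare directory.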
#bayes_path /etc/mail/spamassassin/bayes/
Then create new Bayes databases:
# sa-learn --sync
# sa-learn -p /etc/mail/spamassassin/local.cf --sync
The complete configuration looks like this:
required_hits 5
report_safe 0
rewrite_header Subject [SPAM]

# DCC
use_dcc 1
#dcc_home /etc/dcc
dcc_home /var/dcc
dcc_path /usr/bin/dccproc
#dcc_path /usr/local/bin/dccproc
dcc_timeout     10
add_header all  DCC _DCCB_: _DCCR_
score DCC_CHECK 4.000


# Pyzor
use_pyzor 1
pyzor_options --homedir /etc/mail/spamassassin/.pyzor
pyzor_path /usr/bin/pyzor
add_header all Pyzor _PYZOR_
score PYZOR_CHECK 3.000

# Razor
use_razor2 1
razor_config /etc/mail/spamassassin/.razor/razor-agent.conf
score RAZOR2_CHECK 3.000

# Bayes
use_bayes 1
use_bayes_rules 1
bayes_auto_learn 1
#bayes_path /etc/mail/spamassassin/bayes/
bayes_file_mode 0660

bayes_ignore_header X-Bogosity
bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Status
include /usr/share/spamassassin/ 
Use bayes_file_mode 0665 if you use MailWatch with MailScanner.
Copy the existing Bayes databases to the SpamAssassin Bayes directory:
# cp /root/.spamassassin/bayes_* /etc/mail/spamassassin/bayes/
More Commands (Optional)
# sa-learn -p /etc/mail/spamassassin/local.cf --rebuild --force-expire
# sa-learn -p /etc/mail/spamassassin/local.cf --sync --force-expire
Test that Bayes works
# spamassassin -D --lint
# spamassassin -D -p /etc/mail/spamassassin/local.cf --lint
Troubleshooting 
# sa-learn --dump magic 
# spamassassin --lint
# sa-compile
# ss -tnlp | grep spamd
# sa-learn --sync -D
# sa-update -v
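A common reason for Bayes appearing to do nothing is that too few messages have been learned. As a hedged sketch, the helper below reads the output of `sa-learn --dump magic` on standard input and reports the learned counts. It assumes the usual layout of that output, where the count is the third column of the "non-token data: nspam" and "non-token data: nham" lines. MIN_MSGS is an illustrative variable set to 200, matching SpamAssassin's default bayes_min_spam_num and bayes_min_ham_num.

```shell
#!/bin/sh
# Illustrative threshold; 200 matches the default bayes_min_*_num settings.
MIN_MSGS="${MIN_MSGS:-200}"

bayes_ready() {
    # Expects `sa-learn --dump magic` output on stdin.
    awk -v min="$MIN_MSGS" '
        /non-token data: nspam$/ { nspam = $3 }
        /non-token data: nham$/  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            # Exit status 0 only when both classes reached the threshold.
            exit !(nspam >= min && nham >= min)
        }'
}
```

Usage: `sa-learn --dump magic | bayes_ready && echo "Bayes is active"`.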
For more commands, see: https://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html
For more information, refer to the links below:
http://spamassassin.sourceforge.net/doc/sa-learn.html
https://sourceforge.net/p/webadmin/mailman/webadmin-list/thread/1088762157.18480@fudu.home/
