AGAVA Antispam is trainable. Some users avoid trainable spam filters because they believe that training is a long and tiresome process that leads to filtering based on obscure and fuzzy criteria. These users prefer filters based on blacklists such as SpamCop, ORDB, DSBL, and SPEWS.
Myth One. A blacklist is an objective criterion. Here is, in a few words, how a typical blacklist operates. Say there is a web hosting company that sells email services, and one of its clients decides one day to send spam to a large number of addresses. Some of these messages get reported to a blacklist as spam. The IP address of the web host (usually a whole group of IP addresses) gets blacklisted. As a result, you stop receiving legitimate email from every other sender in that network neighborhood.
You might say: bad web hosting company, bad clients. Well, that might be true, but it's not the point. While the bad web hosting company investigates the problem with one bad client, a whole bunch of good, innocent clients suffer. Is that objective enough? We don’t think so.
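The collateral damage described above can be illustrated with a minimal sketch. It assumes a hypothetical blacklist that stores whole network ranges in a local list (real blacklists are usually queried over DNS), and the addresses are taken from ranges reserved for documentation, not from real hosts.

```python
import ipaddress

# Hypothetical blacklist entry: the whole /24 block of one hosting company
# was listed after a single client sent spam. 192.0.2.0/24 and
# 198.51.100.0/24 are documentation-only ranges, not real senders.
BLACKLISTED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_blacklisted(sender_ip: str) -> bool:
    """Return True if the sender's address falls inside any listed range."""
    address = ipaddress.ip_address(sender_ip)
    return any(address in network for network in BLACKLISTED_NETWORKS)


senders = {
    "the spammer": "192.0.2.10",
    "an innocent client of the same host": "192.0.2.77",
    "an unrelated host": "198.51.100.5",
}

for description, ip in senders.items():
    verdict = "rejected" if is_blacklisted(ip) else "accepted"
    print(f"{description} ({ip}): {verdict}")
```

The innocent client is rejected for exactly the same reason as the spammer: the check looks only at the address range and never at the message itself.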
Myth Two. A statistics-based criterion is not objective. Roughly, a statistics-based filter looks up each word of an email in its database, where every word has a spam coefficient. The word-based approach has had a number of poor implementations, which may have spoiled the perception of the method. Those filters came with a fixed set of pre-defined words and coefficients and simply incremented a message's spam score for every match. The items that increased the score could be non-white HTML backgrounds or words such as “free” or “spam”. These filters became known as “newsletter killers”. Newsletter publishers went mad and started using tricks to fool them, typing “sp*m” or “fre*e” instead. Well, that is subjective, but Antispam works very differently.
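To make the weakness of those early filters concrete, here is a rough sketch of such an additive scorer. The word list, the scores, and the threshold are invented for illustration and do not come from any particular product.

```python
import re

# Hypothetical fixed word scores and threshold, chosen only for this example.
FIXED_SCORES = {"free": 2, "spam": 3}
THRESHOLD = 4


def naive_score(text: str) -> int:
    """Add up the fixed score of every listed word found in the text."""
    # Keep '*' inside tokens so that "sp*m" stays one (unlisted) word.
    words = re.findall(r"[a-z*]+", text.lower())
    return sum(FIXED_SCORES.get(word, 0) for word in words)


def is_spam(text: str) -> bool:
    return naive_score(text) >= THRESHOLD


newsletter = "Our free newsletter: how to stop spam in your inbox"
disguised = "Get it fre*e, this is not sp*m"

print(naive_score(newsletter), is_spam(newsletter))
print(naive_score(disguised), is_spam(disguised))
```

The legitimate newsletter crosses the threshold just by mentioning the two listed words, while the disguised message scores nothing at all, because the list is fixed and matches only exact spellings.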
First, the formula that calculates the total spam score is not that simple (it is not linear). It does not just add up the coefficients; it also takes other criteria into account, such as the overall message size. If there are a dozen spam words in a long email about cooking, Antispam will not mark it as spam.
Second, Antispam is trainable. You cannot tune the coefficients manually; you can only submit a full message for training. This ensures objectivity by excluding human error, and it also means that your statistical coefficients are based on your own email.
Both of the above ensure that Antispam’s approach is objective and capable of producing good results.
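The two points above can be sketched in a few lines. This is not Antispam's actual formula, which is not disclosed here; the sketch swaps in a generic naive-Bayes-style combination. The class name `TinyTrainableFilter`, its methods, and the training messages are all hypothetical. Coefficients can be changed only by submitting whole messages, and the final score is a non-linear function of every word in the message, so a few spam words are outweighed in a long legitimate email.

```python
import math
import re
from collections import Counter


class TinyTrainableFilter:
    """Illustrative trainable filter; not the product's real algorithm."""

    def __init__(self):
        self.spam_words = Counter()  # number of spam messages containing a word
        self.ham_words = Counter()   # number of legitimate messages containing it
        self.spam_msgs = 0
        self.ham_msgs = 0

    @staticmethod
    def tokenize(text):
        # Distinct lowercase words; each word counts once per message.
        return set(re.findall(r"[a-z']+", text.lower()))

    def train(self, text, is_spam):
        """The only way to change coefficients: submit a full message."""
        words = self.tokenize(text)
        if is_spam:
            self.spam_words.update(words)
            self.spam_msgs += 1
        else:
            self.ham_words.update(words)
            self.ham_msgs += 1

    def word_log_odds(self, word):
        # Smoothed per-word evidence; unseen words are neutral when both
        # classes have received the same number of training messages.
        p_spam = (self.spam_words[word] + 1) / (self.spam_msgs + 2)
        p_ham = (self.ham_words[word] + 1) / (self.ham_msgs + 2)
        return math.log(p_spam / p_ham)

    def spam_probability(self, text):
        total = sum(self.word_log_odds(w) for w in self.tokenize(text))
        total = max(-50.0, min(50.0, total))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-total))  # non-linear squashing to 0..1


spam_filter = TinyTrainableFilter()
spam_filter.train("buy cheap pills now free offer", is_spam=True)
spam_filter.train("free offer click now win money", is_spam=True)
spam_filter.train("recipe for tomato soup with fresh basil and garlic", is_spam=False)
spam_filter.train("add butter flour and milk then bake the bread", is_spam=False)

short_spam = "free offer win money now"
cooking = ("free recipe: add garlic butter and fresh basil to the tomato "
           "soup then bake the bread with milk and flour")

print(round(spam_filter.spam_probability(short_spam), 3))
print(round(spam_filter.spam_probability(cooking), 3))
```

With this toy training set, the short message built from spam words scores close to 1, while the long cooking email scores close to 0 even though it contains the word "free", because the evidence from its many ordinary words outweighs that single match.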