eWeek Rates Mozilla's Bayesian Spam Filtering
Wednesday October 29th, 2003
Yacoubean writes: "eWeek reviewed spam filtering with various tools, including Mozilla 1.5 and Outlook 2003. Their test showed that Bayesian filtering (used in Mozilla) is more effective at fending off spam. Outlook 2003 and other products use the whitelist/blacklist approach, along with proprietary technology. The eWeek tests showed that these techniques flagged a lot of false positives, while Mozilla rarely flagged legitimate email. eWeek also points out, on the down side, that Bayesian filtering requires a lot of initial work from the user to train the filter; however, 'the investment can pay off.'"
#1 is it working in the long run?
Thursday October 30th, 2003 9:26 AM
Somehow the filter's performance has diminished in the last few months. I adopted it as soon as it came out, and early on it looked quite promising, but recently I keep getting false negatives (i.e. spam that Thunderbird thinks is not spam). Part of it is that spammers are getting s mar ter. But I can't help noticing somewhat poor performance. Also, the time lag from download to display looks like it's increasing, and I wonder whether the filter is taking more and more time to do its job.
#2 They don't need to pay for it....
Thursday October 30th, 2003 11:05 AM
I could never understand the motivations of spammers to do this sort of thing. Doesn't it become clear that people who filter email are NOT interested in buying herbal viagra, etc?
#3 Spams changing identity
Thursday October 30th, 2003 11:34 AM
Spammers are adapting their techniques; notice the gibberish at the top of a lot of spam nowadays. My (rough) understanding is that Mozilla's filter works by weighing how much each word in an e-mail contributes towards determining whether it's spam. By adding three lines of gibberish to a spam e-mail, the spammer gets a lower spam score, because the filter has never seen those "words" and thus gives them the benefit of the doubt.
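To make the dilution effect concrete, here is a rough sketch of a naive Bayes-style combination of per-token spam probabilities. It is only an illustration, not Mozilla's actual code: the token table, the 0.4 "benefit of the doubt" value for unseen tokens, and the function name `spam_score` are all assumptions made up for this example.

```python
import math

# Assumed probability given to tokens the filter has never seen.
# A value below 0.5 means "slightly more likely to be legitimate".
UNKNOWN_PROB = 0.4

# Hypothetical per-token spam probabilities, as if learned from training.
TOKEN_SPAM_PROB = {
    "viagra": 0.99,
    "cheap": 0.90,
    "sale": 0.85,
    "meeting": 0.05,
}


def spam_score(text, token_probs=TOKEN_SPAM_PROB, unknown=UNKNOWN_PROB):
    """Combine per-token probabilities into one score between 0 and 1.

    Computes prod(p) / (prod(p) + prod(1 - p)) in log space so that
    long messages do not underflow.
    """
    spam_log = 0.0
    ham_log = 0.0
    for token in text.lower().split():
        p = token_probs.get(token, unknown)
        spam_log += math.log(p)
        ham_log += math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(ham_log - spam_log))


plain = "cheap viagra sale"
# Twenty never-seen tokens prepended to the same message.
padded = "xqzv lkjw prtm bnvc " * 5 + plain

print("plain :", spam_score(plain))
print("padded:", spam_score(padded))
```

In this sketch every unseen token nudges the score towards "not spam", so the padded message scores noticeably lower than the plain one. A filter that only looks at the handful of most extreme tokens, or that treats unseen tokens as exactly neutral (0.5), would be much less affected by this kind of padding.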
An interesting idea I read about was creating filters that also process the content of the websites linked in the e-mail. The e-mail might only say "I was thinking about you when I saw this site" and be left alone by the filter, but if Mozilla reached out and checked the site, it would see the words "viagra", "cheap", and "for sale", add that score to the original e-mail, and flag the message.
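A minimal sketch of that idea follows, assuming a simple word-count score rather than a real Bayesian one. The word list, the function names, and the `fetch_page_text` callable are hypothetical; the example uses a fake in-memory "web" so it runs without any network access.

```python
import re

# Hypothetical list of words that count towards the spam score.
SPAMMY_WORDS = {"viagra", "cheap", "sale"}

# Deliberately simple URL pattern, good enough for this illustration.
URL_RE = re.compile(r"https?://\S+")


def count_spammy(text):
    """Count how many words in the text are on the spammy list."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for word in words if word in SPAMMY_WORDS)


def score_with_links(body, fetch_page_text):
    """Score the e-mail body plus the text of every page it links to.

    fetch_page_text is any callable that takes a URL and returns the
    page's text; it is passed in so the fetching strategy stays open.
    """
    score = count_spammy(body)
    for url in URL_RE.findall(body):
        score += count_spammy(fetch_page_text(url))
    return score


# Stand-in for real page fetching (assumption for the example only).
FAKE_WEB = {"http://example.com/offer": "Cheap viagra for sale"}


def fake_fetch(url):
    return FAKE_WEB.get(url, "")


email = "I was thinking about you when I saw this site http://example.com/offer"
print("body only :", count_spammy(email))
print("with links:", score_with_links(email, fake_fetch))
```

The body on its own contains none of the listed words, while the linked page contributes all of them, which is exactly the case the linked-site check is meant to catch.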
Spammers are shifting away from advertising their products in the actual spam message and moving towards mimicking legitimate e-mail instead. Tricking you into clicking the link is their main goal; the website is now the first line of salesmanship, and the e-mail itself is secondary.
#5 Re: Spams changing identity
Monday November 3rd, 2003 8:10 AM
Okay. So why not spell-check each of the first 25-30 words in the message? If no word is spelled right, it's spam.
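Roughly, the check could look like the sketch below. The tiny word set stands in for a real spelling dictionary, and the names `KNOWN_WORDS` and `looks_like_gibberish` are invented for this example.

```python
import re

# Hypothetical stand-in for a real English spelling dictionary.
KNOWN_WORDS = {
    "hello", "are", "you", "free", "for", "lunch", "the", "a", "is",
    "meeting", "tomorrow", "thanks", "please", "see", "attached",
}


def looks_like_gibberish(text, known_words=KNOWN_WORDS, window=30):
    """Return True if none of the first `window` words is a known word."""
    words = re.findall(r"[a-z']+", text.lower())[:window]
    if not words:
        # Nothing to check, so do not flag the message on this test alone.
        return False
    return not any(word in known_words for word in words)


print(looks_like_gibberish("xqzv lkjw prtm bnvc qwrt"))
print(looks_like_gibberish("Hello, are you free for lunch tomorrow?"))
print(looks_like_gibberish("Bonjour, comment allez-vous"))
```

With an English-only word list, the last call flags a perfectly ordinary French greeting, so the heuristic only works as well as its dictionaries cover the languages you actually receive mail in.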
#6 Re: Re: Spams changing identity
Tuesday November 4th, 2003 9:22 AM
What if the message is not in English but is perfectly legit? Do you spell-check in every language? :)
#8 Re: Spams changing identity
Wednesday November 5th, 2003 5:23 PM
"What if the message is not in English but is perfectly legit?"
We're Americans! Another language might as well be treated as spam!
#9 Re: Re: Spams changing identity
Thursday November 6th, 2003 9:44 AM
We have an official language now? Spanish has finally been adopted? :)
(Although, I do admit that there is some truth to that statement. I mean, even if a message is legit in another language, if you don't know the language in question [and have no interpreter], what use is it to you?)
#4 Anti-spam strategy article highlights Mozilla
Thursday October 30th, 2003 11:48 AM
I gave a generally favorable review to Mozilla Thunderbird -- without, admittedly, mentioning the Bayesian stuff -- in this article, published yesterday, which is mostly being read by trade unionists (the target audience):
It has already been commented on by half a dozen trade union activists and read by some 500 more, so maybe some of these people will soon discover the joys of Mozilla.
#7 Great job but could do better
Tuesday November 4th, 2003 10:03 AM
It works well as far as it goes but would be better if they implemented http://bugzilla.mozilla.org/show_bug.cgi?id=181534 (Update Junk mail filtering to use latest Bayesian techniques from spambayes.sf.net), so please vote for this bug if you haven't already.