New techniques of spamming…
For quite some time, the naive Bayesian classifier based SpamBayes filtered my email very accurately, with very few false positives.
Recently, however, I have noticed a few alarming trends in spamming.
- Database poisoning: Using otherwise innocuous words (ham words) in a spam message, thereby effectively poisoning the database in the long run.
- Junk Tags: Hiding spam words by inserting invalid HTML tags in between words. HTML renderers generally ignore tags they don’t understand, so the document still displays properly.
- Invalid Words: Spam words like mortgage are masked by inserting special characters or junk characters in between the letters.
Solutions I could think of:
- Database poisoning: Most of these emails tend to be classified in the Not Sure category. I suggest that you delete them instead of classifying them as spam. However, this still requires spending some time on them, which I don’t like.
- Junk Tags: Add a filter in front of the Bayesian classifier to eliminate junk tags.
- Invalid Words: Inexact (fuzzy) matching algorithms, such as those in Lucene, should help.
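The last two ideas can be sketched as a small preprocessing step that runs before the classifier sees the message. This is only a rough sketch under several assumptions: it drops every tag rather than only the invalid ones, it assumes junk tags and junk characters are inserted inside a word (for example `mort<xyz>gage` or `m0rtgage`), it uses Python's standard `difflib` as a stand-in for Lucene's fuzzy matching, and the `SPAM_WORDS` list and all function names are made up for illustration.

```python
import difflib
import re
from html.parser import HTMLParser

# Hypothetical watch list; a real filter would learn its tokens from training mail.
SPAM_WORDS = ["mortgage", "refinance", "viagra"]


class _TextOnly(HTMLParser):
    """Collects text nodes and discards every tag, valid or junk."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Drop all markup so that 'mort<xyz>gage' becomes 'mortgage'."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def normalize_token(token):
    """Lowercase and remove everything except a-z (ASCII only, a simplification)."""
    return re.sub(r"[^a-z]", "", token.lower())


def match_spam_word(token, cutoff=0.8):
    """Return the closest watch-list word, or None if nothing is similar enough."""
    cleaned = normalize_token(token)
    if not cleaned:
        return None
    hits = difflib.get_close_matches(cleaned, SPAM_WORDS, n=1, cutoff=cutoff)
    return hits[0] if hits else None


def preprocess(html):
    """Strip tags, then replace disguised spam words with their canonical form."""
    tokens = strip_tags(html).split()
    return [match_spam_word(t) or t for t in tokens]


if __name__ == "__main__":
    print(preprocess("Low m.o.r.t<zz>g.a.g.e rates"))
```

The tokens returned by `preprocess` would then be handed to the Bayesian classifier in place of the raw message text. The `cutoff` value is a guess; set it too low and ordinary words start matching the watch list, which would create new false positives.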
I have recently noticed a significant increase in mortgage spam. It should be easy to tackle it by legal means.
Overall, the game is becoming tougher for spam prevention. A combination of existing techniques is required for any spam filter to remain effective.
Looking forward to hearing your thoughts.
Filed under Spam Watch, Web
July 29th, 2004 at 2:49 am
I have tried all the software solutions to thwarting spam. I have yet to see one that works as well as simply owning a domain and creating many email addresses, one for each site I visit, like the one I used here. If I start getting spam from that address, I simply forward it to null@null.net and that’s that. I have about 30 email addresses generating well over 250 spams a day. They are all being forwarded to null@null.net (sure hope no one ever gets that address).
I *NEVER* give out my main email address to anyone! All the non-spam addresses get forwarded to my real email account so I can read the messages and respond to them. Sure, at that point my real address gets sent out. However, it’s not accidentally published on the web, at least not by posting it on a blog or a web store.
October 15th, 2004 at 5:31 am
I am facing the same problem. The new genre of spam that I noticed has a bunch of unrelated words pushed in at the end of the e-mail. These are really rare words gathered from different contexts.
Do you have any suggestions for it?