New techniques of spamming…
For quite some time, the naive Bayesian classifier in SpamBayes filtered my email very accurately, with very few false positives.
Recently, however, I have noticed a few alarming trends in spamming.
- Database poisoning: Using otherwise innocuous words (ham words) in spam, thereby poisoning the filter's database in the long run.
- Junk tags: Hiding spam words by inserting invalid HTML tags into them. HTML parsers ignore tags they don't understand, so the message still renders as a properly viewable document.
- Invalid words: Spam words such as "mortgage" are masked by inserting special or junk characters between the letters.
Solutions I could think of:
- Database poisoning: Most database-poisoning emails tend to be classified in the Not Sure category. I suggest that you delete them instead of classifying them as spam, so that the ham words they carry are not learned as spam indicators. However, this still requires us to spend some time on them, which I don't like.
- Junk tags: Add a filter in front of the Bayesian classifier to eliminate junk tags.
- Invalid words: Inexact (fuzzy) matching algorithms, such as those in Lucene, should help.
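The last two ideas can be sketched in a few lines of Python. This is only a rough illustration, not how SpamBayes or Lucene work internally: the `KNOWN_TAGS` whitelist, the `SPAM_WORDS` list, the 0.8 threshold and both function names are assumptions made up for the example, and the fuzzy matching uses Python's standard `difflib` module rather than Lucene.

```python
import re
from difflib import SequenceMatcher

# Hypothetical whitelist of real HTML tag names; anything else counts as junk.
KNOWN_TAGS = {"a", "b", "br", "div", "font", "i", "p", "span", "table", "td", "tr"}

# Hypothetical list of spam words we want to recognise despite obfuscation.
SPAM_WORDS = ["mortgage", "viagra"]

# Matches an opening or closing tag and captures its name.
TAG_RE = re.compile(r"<\s*/?\s*([^\s<>/]*)[^<>]*>")


def strip_junk_tags(html):
    """Drop tags whose name is not in KNOWN_TAGS, leaving real tags alone."""
    def keep_or_drop(match):
        name = match.group(1).lower()
        return match.group(0) if name in KNOWN_TAGS else ""
    return TAG_RE.sub(keep_or_drop, html)


def normalize_word(token, threshold=0.8):
    """Map an obfuscated token to a known spam word if it is similar enough."""
    # Remove digits, punctuation and other junk characters before comparing.
    letters = re.sub(r"[^a-z]", "", token.lower())
    for word in SPAM_WORDS:
        if SequenceMatcher(None, letters, word).ratio() >= threshold:
            return word
    return token


if __name__ == "__main__":
    print(strip_junk_tags("mort<xyz>gage rates<br>"))
    print(normalize_word("m.o.r.t.g.a.g.e"))
```

Both functions would run before the text reaches the Bayesian classifier, so the classifier sees the word the spammer tried to hide. A threshold that is too low will start mapping legitimate words onto spam words, so it would need tuning against real mail.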
I have recently noticed a significant increase in mortgage spam. It should be easy to tackle it by legal means.
Overall, the game is becoming tougher for spam prevention. A combination of existing techniques is required for any spam filter to remain effective.
Looking forward to hearing your thoughts.

July 29th, 2004 at 2:49 am
I have tried all the software solutions for thwarting spam. I have yet to see one that works as well as simply owning a domain and creating many email addresses, one for each site I visit, like the one I used here. If I start getting spam at that address, I simply forward it to null@null.net and that's that. I have about 30 email addresses generating well over 250 spams a day. They are all being forwarded to null@null.net (sure hope no one ever gets that address).
I *NEVER* give out my main email address to anyone! All the non-spam addresses get forwarded to my real email account so I can read the messages and respond to them. Sure, at that point my real address gets sent out. However, it's not accidentally published on the web, at least not by posting it on a blog or a web store.
October 15th, 2004 at 5:31 am
I am facing the same problem. The new genre of spam that I noticed has a bunch of unrelated words pushed in at the end of the email. These are really rare words gathered from different contexts.
Do you have any suggestions for it?