What Matt Mullenweg (WordPress Author) Knows About You (WordPress & Akismet Plugin User)
I took a look at the data sent to Akismet, a WordPress plugin for comment spam protection, for each comment submitted on your blog. I recently started using Akismet, which comes from WordPress author Matt Mullenweg. I have to say I was surprised at the copious amount of data, some of it sensitive, being sent to Matt’s server to handle every single comment.
Tons of information that is useless for spam protection is sent with every comment, and most of it rarely, if ever, changes on a server.
Here is the data that was sent to the Akismet server for a single test comment on my blog. I have commented on it inline.
comment_post_ID=1128 // Why does he need this?
comment_author=Angsuman+Chakraborty
comment_author_email=angsuman%40taragana.com
comment_author_url=http%3A%2F%2Fblog.taragana.com%2F
comment_content=[Actual comment]
comment_type=
user_ID=1 // Why does he need this?
user_ip=59.93.245.60
user_agent=[Truncated]
referrer=[Truncated - Post url]
blog=http%3A%2F%2Fblog.taragana.com
CONTENT_LENGTH=98
// Isn’t it obvious? Why send it? Does it ever change?
CONTENT_TYPE=application%2Fx-www-form-urlencoded
// What is he doing with it? This information is useless for spam protection.
DOCUMENT_ROOT=[File system path]
// Why does he need this? Yet more useless junk.
HTTP_ACCEPT=[Truncated]
// Why does he need this?
HTTP_ACCEPT_CHARSET=[Truncated]
HTTP_ACCEPT_LANGUAGE=en-us%2Cen%3Bq%3D0.5
// Why does he need this?
HTTP_CONNECTION=keep-alive
HTTP_HOST=blog.taragana.com
// Why does he need this?
HTTP_KEEP_ALIVE=300
HTTP_REFERER=[Truncated]
HTTP_USER_AGENT=[Truncated]
// Why does he have to have my PATH information?
PATH=[PATH environment variable]
REMOTE_ADDR=59.93.245.60
REMOTE_PORT=1567
// How many times does it change on a server? Why does he need it?
// It contains file system information
SCRIPT_FILENAME=[Truncated]
// How many times does it change on a server?
SERVER_ADDR=69.36.187.98
// How many times does it change on a server? Why does he need it?
SERVER_ADMIN=Postmaster%40taragana.com
SERVER_NAME=blog.taragana.com
// How many times does it change on a server? What does he need it for?
SERVER_PORT=80
// How many times does it change on a server? What does he need it for?
SERVER_SIGNATURE=[Truncated]
// How many times does it change on a server? What does he need it for?
SERVER_SOFTWARE=[Truncated]
// How many times does it change on a server? What does he need it for?
GATEWAY_INTERFACE=CGI%2F1.1
// How many times does it change on a server? What does he need it for?
SERVER_PROTOCOL=HTTP%2F1.1
// How many times does it change on a server? What does he need it for?
// This is always POST!
REQUEST_METHOD=POST
// How many times does it change on a server? What does he need it for?
QUERY_STRING=
// How many times does it change on a server? What does he need it for?
REQUEST_URI=%2Fwp-comments-post.php
// How many times does it change on a server? What does he need it for?
SCRIPT_NAME=%2Fwp-comments-post.php
// Why does he need to know where I installed WordPress on my server?
PATH_TRANSLATED=[Truncated]
// How many times does it change on a server? What does he need it for?
PHP_SELF=%2Fwp-comments-post.php
// This is inane
argv=Array
// This is inane
argc=0
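The values in the dump are percent-encoded, as expected for a body of type application/x-www-form-urlencoded. For readers who want to see them in plain form, here is a minimal Python sketch that decodes a few of the fields listed above. It assumes the fields are joined with "&" in the request body, which is the usual convention for this content type; the variable names are only for illustration.

```python
from urllib.parse import parse_qsl

# A few fields copied from the dump, joined with "&" (an assumption about
# how they appear in the form-encoded request body).
payload = (
    "comment_author=Angsuman+Chakraborty"
    "&comment_author_email=angsuman%40taragana.com"
    "&comment_author_url=http%3A%2F%2Fblog.taragana.com%2F"
    "&HTTP_ACCEPT_LANGUAGE=en-us%2Cen%3Bq%3D0.5"
    "&SERVER_PROTOCOL=HTTP%2F1.1"
)

# parse_qsl turns "+" into a space and "%XX" escapes into characters;
# keep_blank_values=True would also keep empty fields such as comment_type=
decoded = dict(parse_qsl(payload, keep_blank_values=True))

for name, value in decoded.items():
    print(f"{name} = {value}")
```

Decoded this way, the email address, blog URL and language preferences are all plainly readable.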
This huge amount of data (considering it is sent for every comment) can consume a not-insignificant portion of your bandwidth quota if you get lots of spam.
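To put a rough number on that, the following Python sketch measures the size of a form-encoded body and scales it by a comment volume. The sample fields are only a handful taken from the dump, and the figure of 1,000 spam comments per day is an assumption chosen purely for illustration, so substitute your own values.

```python
from urllib.parse import urlencode


def payload_bytes(fields: dict[str, str]) -> int:
    """Size in bytes of the form-encoded request body for one comment."""
    return len(urlencode(fields).encode("ascii"))


# Hypothetical sample: a few fields from the dump, not the full set.
sample = {
    "comment_author": "Angsuman Chakraborty",
    "comment_author_email": "angsuman@taragana.com",
    "SERVER_PROTOCOL": "HTTP/1.1",
    "REQUEST_METHOD": "POST",
    "SERVER_PORT": "80",
}

per_comment = payload_bytes(sample)
spam_per_day = 1000  # assumed volume, purely illustrative

print(f"{per_comment} bytes per comment")
print(f"{per_comment * spam_per_day * 30 / 1_000_000:.2f} MB per 30 days")
```

The real total would be higher, since the actual request carries every field in the dump plus the comment text and HTTP headers.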
It is clear that Matt & Co. haven’t made the effort to filter out the unnecessary information, even though they could easily do so.
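Filtering of this kind could be as simple as an allowlist. The Python sketch below is not Akismet's code, and the choice of which fields count as useful for spam detection is my own assumption; it only shows how little work it takes to drop the server details before sending.

```python
# Hypothetical allowlist: which fields matter for spam detection is an
# assumption made for this sketch, not Akismet's actual list.
ALLOWED_FIELDS = {
    "comment_author", "comment_author_email", "comment_author_url",
    "comment_content", "comment_type", "user_ip", "user_agent",
    "referrer", "blog",
}


def filter_fields(fields: dict[str, str]) -> dict[str, str]:
    """Return only the allowlisted fields, dropping server details."""
    return {k: v for k, v in fields.items() if k in ALLOWED_FIELDS}


# A few entries from the dump, mixing comment data with server data.
submitted = {
    "comment_author": "Angsuman Chakraborty",
    "user_ip": "59.93.245.60",
    "SERVER_PORT": "80",
    "REQUEST_METHOD": "POST",
    "DOCUMENT_ROOT": "[File system path]",
}

print(filter_fields(submitted))
```

An allowlist is the safer design here: any new server variable is excluded by default instead of being sent until someone notices it.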
Some of this information could also be used by hackers (the bad kind). Remember that all of it is submitted over the internet in cleartext.
Kind of makes you feel warm and fuzzy, doesn’t it?

April 8th, 2006 at 11:03 pm
Akismet’s privacy policy is available to the public here (legal translation coming soon):
http://akismet.com/privacy/
Matt would [probably] be glad if you were to contact him with your privacy/security concerns. If you send your inquiry through the Akismet contact form, he’ll usually respond within the week.
April 9th, 2006 at 6:00 pm
We do strip out potentially sensitive data, like your login cookie. The rest is entirely harmless, and actually quite useful in identifying spam. You can exclude it, but the effectiveness of Akismet will go down.
April 10th, 2006 at 9:36 am
Matt,
Thanks for the clarifications. However, I couldn’t understand why you need data that never changes for any user, like:
CONTENT_TYPE=application%2Fx-www-form-urlencoded
REQUEST_METHOD=POST
SERVER_PORT=80 // May very rarely change
SERVER_PROTOCOL=HTTP%2F1.1
GATEWAY_INTERFACE=CGI%2F1.1
etc.
Also, there are several pieces of data, like my server’s SCRIPT_FILENAME or PATH_TRANSLATED, for which I cannot see how they can help in analysing spam (irrespective of the algorithm you are using, which I personally think is a variant of naive Bayes with manual blacklisting).
I could see you have a provision in the code to filter out certain data from the list. Why not use it to get only the data that you need?
Looking forward to your response.
Best,
Angsuman
April 10th, 2006 at 9:37 am
James,
I guess I reached him faster this way.
Thanks for your suggestions.
Best,
Angsuman
April 11th, 2006 at 12:14 pm
[...] In addition, over at Simple Thoughts, Angsuman Chakraborty wrote an interesting post entitled, “What Matt Mullenweg (WordPress Author) Knows About You (WordPress & Akismet Plugin User).” There, he figured out what kind of info Akismet sends back to interpret comments as spam / not spam. All this was very interesting, but it got me no further toward my goal of getting out of Akismet jail. My identity had been taken by a black box for unknown reasons, and there was no way to get it back. Granted, on the net it is very easy to change your identity, but I had been writing as myself for quite a while. Why would I want to give up what little, if any, reputation I have? Especially to the black box? [...]
January 16th, 2007 at 8:47 am
In my (maybe simple) view, this information is required for analyzing spam:
comment_content # Yeah, sure…
comment_author* # All three together
blog_url (a splogger can easily remove that URL, so you still have his server’s IP number. But what about a splog like spammer-blog.wordpress.com? Got it? The IP is useless, too!)
And even the client’s IP and user-agent string are useless because of open proxies. Yeah, you can blacklist those IP numbers, but how many open proxies exist in the wide world? 100,000???
Well, I’ll remove all information which you really don’t need to know from my blog (like absolute paths and such). Only I need to know where my scripts are installed, not you.
I know you can blacklist my ID number, so I will move on. I have other anti-spam plug-ins to replace Akismet with.
And Akismet isn’t the ultimate death for spam comments, either.
I’m not against Matt and all the other people behind Akismet, but I really need to know why, why, why you need so much useless information from my blog. Why the comment ID? Why the absolute path of my script installation?
So long and all the best,
Roland
January 16th, 2007 at 8:50 am
An addition to my previous post: I’m saying this to Matt, not to Angsuman.
August 1st, 2007 at 5:53 pm
Don’t forget that Akismet is integrated into other tools too, such as the CakePHP framework, so some of that info will be relevant there.
I’m with you on the server path type of thing, but the actual calling script is probably important for identifying the weak points (or high-traffic points) on a site. More for future development than current spam detection.
I wouldn’t be blogging today if it wasn’t for Akismet and Bad Behaviour. As it is, I have all comments on moderation anyway… it’s that bad!