Clean Room Implementation of Google Page Rank Algorithm
August 17th, 2006
Finally, a clean-room implementation of Google's Page Rank algorithm in Java, reverse-engineered from their numerous comments on Page Rank (or is it Pigeon Rank?).
public static int getPageRank(String url) {
// start off with a random low PR
int pageRank = rand.getInt(0, 3);
if ( isHostedOn('google.com', url) ) {
pageRank++;
} else if ( isHostedOn('microsoft.com', url) ) {
pageRank--;
}
// Support valid pages
if (isValidPage(url) ) {
pageRank += 1;
}
tag_value['b'] = 1;
tag_value['h2'] = 2;
tag_value['h1'] = 3;
tag_value['strong'] = -1; // W3C sux!
pageRank = calculateTagsPR(tag_value, pageRank);
// Sergey said good news sites have
// lots of nested tables
tablesOnPage = getTagCount('table');
if (tablesOnPage >= 50) {
pageRank += 2;
}
if (pageRank >= 5) {
pageRank = 4; // helps selling AdWords
}
if (linksFrom('mattcutts.com', url) >= 4) {
// I link to "clean" sites only
// -- Matt, Feb 2006
pageRank += 2;
}
pageRank += countBacklinks(url) / 10000;
blacklist1 = getList('c:\chinese-government-censored.txt');
blacklist2 = getList('c:\larry-page-hatelist.txt');
if ( inArray(blacklist1, url) || inArray(blacklist2, url) ) {
pageRank = 0;
}
d = dashesInUrl(url);
pageRank = (d >= 3) ? pageRank -1 : pageRank + 1;
if (inString(url, "how to build a bomb")) {
// added on request.
Google is hiring; Location Moon
July 2nd, 2005
An excellent job opportunity for working at Google on the Moon (yes, The Moon). Google is interviewing candidates for engineering positions at our lunar hosting and research center, opening late in the spring of 2007.
Is Google punishing weblogs aka blogs?
April 13th, 2005
Lately I have noticed that Google apparently confers pagerank only on the front page of a blog. All other pages appear to have a rank of zero! I have noticed it across a wide spectrum of blogs, even blogs with very high pageranks.
I Said No To Text-Link-Ads
July 5th, 2007
When I signed up for text-link-ads several months earlier, I meticulously read the terms and conditions. Nowhere was it mentioned that you couldn't use nofollow on your links.
Interesting Articles & Announcements...
February 15th, 2008
Interesting articles and announcements emailed to me -
1. Compilation of BlackBerry Tips, Guides & Resources - link.
Copy the Exact Text with Link URL, Quote, Page Title etc through QuoteURLText
November 25th, 2008
Suppose you want to copy and paste text from a web page. Though the text can easily be copied as plain text (unless that particular website has copy protection) and pasted into yours, it's painful to take all the metadata along with it.
YouTube Sued Over 1992 Los Angeles Riot Video
July 18th, 2006
L.A. News Service, a Los Angeles video news service, sued YouTube Inc. on Friday in federal court for allowing its users to upload copyrighted video footage to YouTube, including the beating of trucker Reginald Denny during the 1992 riots.
Google Proxy Hacking - How Your Page Rank Can Be Stolen & Pages Removed from SERP
February 23rd, 2008
I recently came across an instance of Google Proxy hacking with one of my clients, which removed his index page and other pages from the SERPs (search engine results pages) and cost him his page rank (it went down to zero). We were asked to protect his site against Google Proxy hacking, a really dangerous technique which can not only cause you to lose page rank but also remove your pages from the SERPs, all because Google cannot properly distinguish original pages from duplicates.
Top 10 Firefox Add-ons for Any Blogger
March 9th, 2009
Certainly, there are a number of web browsers available for bloggers. Why does Firefox still rule the roost? Well the answer is simple - the add-ons.
Free Online Store Setup Using Google Base & Google Checkout in 6 Steps
August 17th, 2006
6 steps to set up your free online store using Google Base & Google Checkout. Click on the "Sign In" link on the Google Base homepage.
Google AdSense Changes Its Own Rule for IMDb?
September 3rd, 2008
I don't get Google AdSense. Last year, with great fanfare, they announced making the text of the ad non-linkable while keeping the title and URL linkable.
Mega Challenge for Google Search Engine - Text-Link-Ads
July 3rd, 2007
Google and other major search engines (Yahoo, MSN Search etc.) which use back-links to calculate the importance of a web page suffer from a fundamental flaw. Back-links can be manipulated and even purchased.
Google Opens Up Search Engine Secrets
June 28th, 2006
Google is pledging to demystify the hidden workings of its search engine while refocussing on its core business. Google admitted that it has come to be perceived as overly secretive in the way it collects and orders information.
Google Goodies for Firefox Users
July 11th, 2005
Assorted list of latest products from Google targeted at Firefox users.
Report on Ethics in Blogging Survey
July 21st, 2005
Report Summary:
As the prevalence and social influence of weblogs continue to increase, the issue of the ethics of bloggers is relevant not only to the blogging community, but also to people outside it. This study explored ethical beliefs and practices of two distinct groups of bloggers--personal and non-personal--through a worldwide web survey.
November 22nd, 2007 at 5:12 pm
I recently received one of those “published updates” from TLA and have contacted their support department to close my TLA account because of it. Unfortunately, no one has responded for three days now.
The first point simply shows the typical business sense of any advertiser. It basically says, “We don’t care about your site. We just want people to click on our ads.” While I normally wouldn’t be bothered by this, TLA was founded on the principle that advertisements should be unobtrusive, but this recommendation seems to go against that principle. Scott Yang recently received a request to “change the ‘Sponsored Links’ text to something like ‘Friends’ or something along those lines,” which further betrays their company’s original ideals and crosses a line by requesting that the publisher intentionally deceive his audience. These links don’t represent friends, they represent sponsors.
The last point is very puzzling. TLA’s referral system allows you to create “Advertise Here” (and similar) links which link back to your listing at TLA, and even though this option still exists, they are very to-the-point about removing such links. Furthermore, these links help to direct interested parties towards TLA and eventually result in more revenue for both the publisher and TLA. This request does not seem practical, unless they are planning to close TLA, or are desperate to decrease ad sales.
The online advertising industry is slowly becoming confusing and obtrusive again. In the long run, my blog doesn’t need commercial sponsors to stay alive. I left AdSense two months ago, and now it’s time to leave TLA before things get any worse.
November 25th, 2007 at 10:14 am
Angsuman, thanks for the couple of posts about TLA. I had just been looking at them as a way to boost the PR of my blog. However, I do not want to venture into anything remotely grey or black, in terms of gaining traffic and an audience.
It is sad that advertisers expect bloggers to sell out, just like any other company. But that is ignoring the basis of blogging: honesty from your fellow person.
Thanks again!
November 26th, 2007 at 11:37 pm
After not receiving a response for about a week (see first comment), I decided to re-send my request and got a reply in only 40 minutes (go figure). Unfortunately, since closing your account now constitutes a “breach of contract”, I will not be compensated for the past month of ad display. You dodged the bullet on that one, Angsuman!
November 27th, 2007 at 10:05 am
Nice info. Thanx for sharing and giving the good tips. I just started my blog, and am planning to put up Google ads when I have more visitors. Perhaps you could give me some support. Thanx a bunch
December 6th, 2007 at 3:56 pm
I just got into a bit of a tiff with TLA myself about this… I would prefer not to have to resort to TLA and would like more obvious forms of sponsorship, but I make very clear through my use of these google-identifiable disclaimers just what is going on.
December 12th, 2007 at 9:13 am
You can cancel - just call the publisher support line and tell them to cancel it. They did it for a friend of mine.
June 1st, 2008 at 1:47 pm
[...] policy changes. The first was down-right unethical, and the second was down-right confusing. Angsuman discussed these policy changes a few days ago, but now that I’m free of their grasp, I think it’s time to share my [...]
November 2nd, 2008 at 3:10 pm
No matter what somebody does to avoid the Google penalty, Google will always find a way to penalise all these sites, sooner or later.
March 7th, 2009 at 10:15 pm
I was looking for any tips about TLA and I found yours. I guess I’ll think twice before putting sponsored ads on my blog, thanks for sharing
April 8th, 2009 at 4:00 am
balls with the google, they don’t want other advertising company to come up, they just want to rule the web, bullshit, this is what monopoly game play, it is bloggers or publisher and other advertiser should really ignore this page rank thing, instead they should look at the traffic values, and judge a site or a blog by its values, why page rank, points to ponder, does PR pays your hosting bills? damn it, adsense hardly pays, and now they are trying to dump paid links company, bullshit, whats wrong with the google?