Re: Question about web spiders...

From: Peter Kupfer <peter.kupfer@sbcglobal.net>
Date: Sat, 25 Jun 2005 23:47:38 -0500
Message-ID: <42BE336A.6040007@sbcglobal.net>
To: Lachlan Hunt <lachlan.hunt@lachy.id.au>
CC: Jasper Bryant-Greene <jasper@bryant-greene.name>, www-html@w3.org

Lachlan Hunt wrote:
> Jasper Bryant-Greene wrote:
>>
>> I'm not sure about your specific spider, but the commonly accepted way
>> to do what you describe is something like:
>>
>> <a href="http://www.example.org/" rel="nofollow">Link</a>
> 
> 
> That actually does not do what its name suggests; the spider is free to 
> follow the link.  It was actually designed to indicate that the link 
> should not be counted in the page rank algorithm.
> 
> The correct way to control the way a spider indexes your site is to use 
> robots.txt, assuming the spider in question implements it.

In a robots.txt file, can you control specifically which links a spider 
will follow on a certain page, or only that it won't go to a certain 
page? I want the spider to eventually hit each subdomain, just not from 
the home page; I have it start at each subdomain's index.

Or can each subdomain have its own robots.txt?
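
As a rough illustration of the scoping in question: robots.txt rules match 
URL paths per user-agent, not individual links on a page, and a crawler 
requests /robots.txt separately from each host, so a subdomain can serve 
its own file. The sketch below uses Python's standard urllib.robotparser. 
The host names, the "ExampleBot" user-agent, and both robots.txt bodies 
are made up for the example. The parser itself only compares paths, so the 
per-host behaviour is modelled here by building one parser per host.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical file served at http://www.example.org/robots.txt
HOME_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical file served at http://sub.example.org/robots.txt
# (an empty Disallow value means nothing is disallowed)
SUB_ROBOTS = """\
User-agent: *
Disallow:
"""

home = build_parser(HOME_ROBOTS)
sub = build_parser(SUB_ROBOTS)

# Rules are path prefixes: everything under /private/ is blocked on the
# home host, no matter which page linked to it.
home_private_ok = home.can_fetch(
    "ExampleBot", "http://www.example.org/private/page.html")
home_index_ok = home.can_fetch(
    "ExampleBot", "http://www.example.org/index.html")

# The subdomain's own file applies to the subdomain's URLs.
sub_private_ok = sub.can_fetch(
    "ExampleBot", "http://sub.example.org/private/page.html")

print(home_private_ok, home_index_ok, sub_private_ok)
```

Nothing in this format says "do not follow links found on page X"; it can 
only say which paths on a given host a spider should stay out of.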

>> That's perfectly standards compliant, and Googlebot obeys that, as well
>> as several other major spiders AFAIK.
> 
> 
> It is not standards compliant at all.  It's a proprietary extension that 
> just happens to pass DTD based validation.  nofollow was discussed quite 
> extensively on this list when Google introduced it and the vast majority 
> of this community rejected it.

I tried to search the archive but didn't see it there. Why was nofollow 
rejected?

Thanks again. Please cc peschtra@yahoo.com, as I do not know how to 
subscribe to the list.

-- 
Peter Kupfer
peschtra@yahoo.com
Received on Sunday, 26 June 2005 04:47:45 GMT