New robots.txt tool

The Sitemaps team just introduced a new robots.txt tool into Sitemaps. The robots.txt file is one of the easiest things for a webmaster to get wrong. Brett Tabke’s Search Engine World has a great robots.txt tutorial and even a robots.txt validator.

Despite good info on the web, even experts can have a hard time knowing with 100% confidence what a certain robots.txt will do. When Danny Sullivan recently asked a question about prefix matching, I had to go ask the crawl team to be completely sure. Part of the problem is that mucking around with robots.txt files is pretty rare; once you get it right, you usually never have to think about the file again. Another issue is that if you get the file wrong, it can have a large impact on your site, so most people are reluctant to touch it. Finally, each search engine supports slightly different extra options. For example, Google permits wildcards (*) and the “Allow:” directive.
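To make that concrete, here’s a small hypothetical robots.txt (the paths are made up for illustration) showing a wildcard Disallow plus an Allow that carves an exception out of a broader Disallow, both of which Googlebot understands:

    User-agent: Googlebot
    # Wildcard: block any URL that contains a question mark
    Disallow: /*?
    # Block a whole directory...
    Disallow: /images/
    # ...but let one subdirectory back in; for Googlebot the more specific Allow wins
    Allow: /images/public/

Crawlers that only implement the original robots.txt standard may ignore the wildcard and Allow lines entirely, which is exactly why it helps to test against the engine you care about.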

The nice thing about the robots.txt checker from the Sitemaps team is that it lets you take a robots.txt file out for a test drive and see how the real Googlebot would handle it. Want to play with wildcards to allow all files except for ‘*.gif’? Go for it. Want to experiment with upper vs. lower case? Now you can test that yourself and see exactly what Googlebot will do.
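For the ‘*.gif’ case, a minimal sketch (assuming Googlebot’s wildcard and $ end-of-URL extensions) might look like this; you could paste it into the checker and try a few sample .gif URLs against it:

    User-agent: Googlebot
    # Block any URL ending in .gif; the $ anchors the match to the end of the URL
    Disallow: /*.gif$

Everything else stays crawlable, since no other Disallow line matches.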
