Sitemaps Autodiscovery With Robots Text File

Ever since the beginning of the internet and search engines, the robots.txt file has been how website owners and their webmasters could tell search engine crawlers like GoogleBot which pages and content should be ignored and left out of search results. This was the situation for many years until Google created Google Sitemaps. (This was later renamed the XML Sitemaps Protocol as other search engines joined.) New functionality called Sitemaps Autodiscovery was then added to the robots.txt file that makes it possible to point search engines to your XML sitemaps. Search engine bots can then, once they have downloaded and read the robots.txt file, automatically discover and retrieve the XML sitemap files located on a website.
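As an illustration, here is a minimal Python sketch of how a crawler can discover the sitemaps declared in a robots.txt file. It uses only the Python standard library, and www.example.com is a placeholder domain:

from urllib.robotparser import RobotFileParser

# Point the parser at the robots.txt file of the site
# (www.example.com is a placeholder domain).
parser = RobotFileParser("http://www.example.com/robots.txt")
parser.read()  # download and parse the file

# site_maps() returns the URLs from all "Sitemap:" lines,
# or None if the file declares no sitemaps (Python 3.8+).
print(parser.site_maps())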
Note: In this tutorial we are creating sitemaps with our own tool, A1 Sitemap Generator (http://www.microsystools.com/products/sitemap-generator/)
Submitting Your XML Sitemap

When Google first announced their sitemaps, it was necessary to create and verify a Google Webmaster Tools account associated with the website containing the sitemap file. In addition, you had to submit the sitemap files manually through their web interface. Now, instead of doing it manually for each search engine, you can essentially submit your sitemaps simply by updating them.
When it has finished scanning your website and building the XML sitemap, the sitemapper software (http://www.microsystools.com/products/sitemap-generator/) can also generate the robots.txt file with the correct and fully qualified path to your XML sitemap.
To include support for XML sitemaps autodiscovery in the robots.txt file, all you need to do is add the fully qualified XML sitemap file path like this:

Sitemap: http://www.example.com/sitemap.xml
Sample robots.txt File for XML Sitemaps Autodiscovery

If you have created a standard sitemap file:

User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemap.xml
If you have created a sitemap index file, you can also reference that:

User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemap-index.xml
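For reference, a sitemap index file is itself a small XML file that lists your individual sitemap files. A minimal example following the sitemaps.org protocol (the file names are placeholders) looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap-1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap-2.xml</loc>
  </sitemap>
</sitemapindex>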
Manual XML Sitemaps Submission

There are still valid reasons why submitting your sitemaps manually the first time can be a good idea. One such reason is to get started using the different webmaster tools provided by the search engines:
- Bing Webmaster Tools (MSN) (http://www.bing.com/toolbox/webmaster)
- Google Webmaster Tools (http://www.google.com/webmasters/tools/)
Cross Submit Sitemaps for Multiple Websites

In the beginning, and for a long time after that, it was not possible to submit a sitemap for a website unless the sitemap was hosted on the same domain as the website. However, some search engines now support new ways of managing sitemaps across multiple sites and domains. The requirement is that you verify ownership of all the websites in Google Webmaster Tools or the equivalent tool, depending on the search engine:
- Sitemaps protocol (http://www.sitemaps.org/protocol.php#location): Cross-submitting and managing sitemaps using robots.txt (see the sample after this list).
- Google (http://www.google.com/support/webmasters/bin/answer.py?answer=75712): Supports more website verification methods than the sitemaps protocol defines.
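To illustrate cross-submission, a robots.txt file served from www.example.com could reference a sitemap hosted on an entirely different domain you control, provided you have verified ownership as described above (both domains here are placeholders):

User-agent: *
Disallow:

Sitemap: http://sitemaps.example.org/example-com-sitemap.xml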
Ping Search Engines When Updating XML Sitemaps

After the initial submission of your website XML sitemap file, here are the steps you need to take whenever you update your website and sitemap (the ping step is sketched in code after the list):
- Build new sitemap (http://www.microsystools.com/products/sitemap-generator/help/xml-sitemap-generator-tutorial/)
- Upload created sitemap (http://www.microsystools.com/products/sitemap-generator/help/sitemap-generator-ftp-upload/)
- Ping search engines (http://www.microsystools.com/products/sitemap-generator/help/xml-sitemap-updated-ping/)
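The ping step can also be automated. Below is a minimal Python sketch that notifies the ping endpoints Google and Bing have published for this purpose; the sitemap URL is a placeholder, and the endpoint addresses are assumptions based on the search engines' documentation at the time of writing:

import urllib.parse
import urllib.request

# Fully qualified URL of the sitemap that was just uploaded (placeholder).
SITEMAP_URL = "http://www.example.com/sitemap.xml"

# Ping endpoints published by Google and Bing; each accepts the sitemap
# URL as a query parameter and answers with HTTP 200 on success.
PING_ENDPOINTS = [
    "http://www.google.com/webmasters/tools/ping?sitemap=",
    "http://www.bing.com/ping?sitemap=",
]

for endpoint in PING_ENDPOINTS:
    ping_url = endpoint + urllib.parse.quote(SITEMAP_URL, safe="")
    with urllib.request.urlopen(ping_url) as response:
        print(ping_url, "->", response.getcode())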