How to create a robots.txt file for a website?

Started by Peter97, August 29, 2016, 03:59:20 AM

Peter97

Hi Friends,

How do I create a robots.txt file for a website?

Mentorshouse

You must apply the following saving conventions so that Googlebot and other web crawlers can find and identify your robots.txt file:

    You must save your robots.txt code as a text file,
    You must place the file in the highest-level directory of your site (or the root of your domain), and
    The robots.txt file must be named robots.txt.

For example, a robots.txt file saved at the root of example.com, at the URL http://www.example.com/robots.txt, can be discovered by web crawlers, but a robots.txt file at http://www.example.com/not_root/robots.txt will not be looked up by any compliant crawler.
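As a starting point, a minimal robots.txt that allows all crawling looks like this (the empty Disallow line means nothing is blocked):

```
User-agent: *
Disallow:
```

Save this as plain text, name it robots.txt, and upload it to the root of your domain, per the conventions above.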

RH-Calvin

Robots.txt is a text file placed at the root of a website that contains instructions for search engine robots. It lists which pages and directories crawlers are allowed or disallowed to crawl.

pablohunt2812

How to Create a Robots.txt file
You can use a robots.txt file to control which directories and files on your web server a Robots Exclusion Protocol (REP)-compliant search engine crawler (also known as a robot or bot) is not permitted to visit, that is, sections that should not be crawled. It is important to understand that this does not necessarily mean that a page that is not crawled will also be excluded from the index. To see how to prevent a page from being indexed, see this topic.
STEPS
1. Identify which directories and files on your web server you want to block from being crawled
2. Identify whether you need to specify additional instructions for a particular search engine bot beyond a generic set of crawling directives
3. Use a text editor to create the robots.txt file and the directives to block content
4. Optional: Add a reference to your sitemap file (if you have one)
5. Check for errors by validating your robots.txt file
6. Upload the robots.txt file to the root directory of your site
STEP DETAIL
Identify which directories and files on your web server you want to block from the crawler
Examine your web server for published content that you do not want to be visited by search engines.
Create a list of the accessible files and directories on your web server that you want to disallow. Example: you might want bots to skip crawling site directories such as /cgi-bin, /scripts, and /tmp (or their equivalents, if they exist in your server architecture).
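Translated into directives, blocking those example directories would look like this (the directory names and sitemap URL are illustrative, not required):

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /scripts/
Disallow: /tmp/

Sitemap: http://www.example.com/sitemap.xml
```

The Sitemap line covers the optional sitemap step; drop it if you do not have a sitemap file.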
Identify whether or not you need to specify additional instructions for a particular search engine bot beyond a generic set of crawling directives
Examine your web server's referrer logs to see if there are bots crawling your site that you want to block beyond the generic directives that apply to all bots.
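For the validation step, one quick local check is Python's standard-library robots.txt parser. This is a sketch using hypothetical rules mirroring the /cgi-bin and /tmp examples above; parse() takes the file's lines directly, so no network fetch or upload is needed:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as a list of lines.
rules = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
    "Disallow: /tmp/",
]

rp = RobotFileParser()
rp.parse(rules)

# Ask whether a generic bot ("*") may fetch a given URL.
print(rp.can_fetch("*", "https://www.example.com/cgi-bin/test"))  # False (blocked)
print(rp.can_fetch("*", "https://www.example.com/index.html"))    # True (allowed)
```

If a URL you meant to block comes back as allowed (or the reverse), fix the directives before uploading the file to your site root.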
