
What is robots.txt?

Discussion in 'Search Engine Optimization (SEO)' started by mark joshef, Oct 13, 2011.

Thread Status:
Not open for further replies.
  1. mark joshef

    mark joshef Banned

    What is robots.txt?
     
  2. neeraj_77

    neeraj_77 New Member

    Robots.txt is a plain text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Search engines are not required to honor robots.txt, but the major ones generally obey what it asks them not to do. It is important to understand that robots.txt cannot actually prevent anything from crawling your site (it is not a firewall or a kind of password protection). Putting up a robots.txt file is like putting a "Please, do not enter" note on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why, if you have really sensitive data, it is naive to rely on robots.txt to keep it from being indexed and displayed in search results.
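
To make the "note on an unlocked door" point concrete, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard `urllib.robotparser`. The rules, the `ExampleBot` name, and the URLs are made up for illustration. Note that the check happens entirely on the crawler's side; nothing on the server enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not taken from any real site.
RULES = """\
User-agent: *
Disallow: /private/
"""

def polite_fetch_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    # "ExampleBot" is an assumed crawler name, not a real one.
    for url in ("http://www.example.com/private/report.html",
                "http://www.example.com/index.html"):
        print(url, polite_fetch_allowed(RULES, "ExampleBot", url))
```

A crawler that never calls a check like this can still request `/private/report.html`, which is why sensitive pages need real access control.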
     
  3. shabbir

    shabbir Administrator Staff Member

    Pretty much meaningless posts just for signature spam. I had to ban you once again.
     
  4. harrysom

    harrysom New Member

    Robots.txt is simply a text file. Its purpose is to tell search engines not to crawl certain pages. Put simply, a search engine will not visit a page of your site if you disallow that page in your robots.txt file. The rules go in the robots.txt file itself, not in the page.
     
  5. benivolentsoft

    benivolentsoft New Member

    "Robots.txt is a text file that has a special meaning to the majority of "honorable" robots on the web. By defining a few rules in this text file, you can instruct robots not to crawl and index certain files or directories within your site, or the site as a whole. For example, you may not want Google to crawl the /images directory of your site, as it's both meaningless to you and a waste of your site's bandwidth. "Robots.txt" lets you tell Google just that."
     
  6. bobwarner01

    bobwarner01 New Member

    Robots.txt is a text file, as we can see from its extension. Robots simply act on the instructions they are given, so if a website has private parts that we do not want crawled, we can use this file to tell the robots to stay out of them.
     
  7. TM-Ali

    TM-Ali New Member

    Re: what is robots.txt?

    Robots.txt is a file used to give instructions to search engine robots. We can allow or disallow robots on a particular folder or page.

    Example:

    User-agent: *
    Disallow: /folder-name/
    Disallow: /page-name.html
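
If you want to see what rules like these actually block, one way is to feed them to Python's standard `urllib.robotparser` and ask about a few paths. This is only a sketch: the folder and page names are the same placeholders as in the example, and the site address and `ExampleBot` name are assumptions.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules: one folder and one page are disallowed for every robot.
RULES = """\
User-agent: *
Disallow: /folder-name/
Disallow: /page-name.html
"""

def check_paths(rules, paths, user_agent="ExampleBot",
                site="http://www.example.com"):
    """Map each path to True (may be crawled) or False (disallowed)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return {path: parser.can_fetch(user_agent, site + path) for path in paths}

if __name__ == "__main__":
    paths = ["/folder-name/a.html", "/page-name.html", "/other.html"]
    for path, allowed in check_paths(RULES, paths).items():
        print(path, "allowed" if allowed else "disallowed")
```

Everything under the folder is disallowed along with the single named page, while any other path stays crawlable.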
     
  8. ozsubasi

    ozsubasi New Member

    The clue to where this is copied from is in the reference to "cmsbuffet"
    It originally comes from:
    http://www.cmsbuffet.com/robots-txt-check.php
     
  9. ozsubasi

    ozsubasi New Member

    The OP was banned for asking this (and other) silly questions, and I think it has been fully answered.
    So to anyone else who visits this thread, please read it first and only post to it if you have something new and relevant to say.
     
    Last edited: Apr 13, 2012
  10. ozsubasi

    ozsubasi New Member

    Just to clarify this, these are the instructions from Google:

    Generate a robots.txt file using the Create robots.txt tool:
    1. On the Webmaster Tools Home page, click the site you want.
    2. Under Site configuration, click Crawler access.
    3. Click the Create robots.txt tab.
    4. Choose your default robot access. We recommend that you allow all robots, and use the next step to exclude any specific bots you don't want accessing your site. This will help prevent problems with accidentally blocking crucial crawlers from your site.
    5. Specify any additional rules. For example, to block Googlebot from all files and directories on your site:
       a. In the Action list, select Disallow.
       b. In the Robot list, click Googlebot.
       c. In the Files or directories box, type /.
       d. Click Add. The code for your robots.txt file will be automatically generated.
    6. Save your robots.txt file by downloading the file or copying the contents to a text file and saving as robots.txt. Save the file to the highest-level directory of your site. The robots.txt file must reside in the root of the domain and must be named "robots.txt". A robots.txt file located in a subdirectory isn't valid, as bots only check for this file in the root of the domain. For instance, http://www.example.com/robots.txt is a valid location, but http://www.example.com/mysite/robots.txt is not.

    (Source: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449)
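
For anyone who wants to check the two points in those instructions (the file lives at the root of the domain, and a "Disallow: /" rule for Googlebot blocks the whole site for that bot only), here is a rough Python sketch. The rules text is my assumption of what such a file could look like, not the tool's actual output, and `robots_url` and `OtherBot` are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# Assumed file contents: allow everyone by default, block Googlebot entirely.
RULES = """\
User-agent: *
Allow: /

User-agent: Googlebot
Disallow: /
"""

def robots_url(page_url: str) -> str:
    """Build the only location bots check: /robots.txt at the domain root."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

def may_crawl(user_agent: str, url: str) -> bool:
    """Return True if the assumed rules let this robot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    page = "http://www.example.com/mysite/page.html"
    print(robots_url(page))
    print("Googlebot:", may_crawl("Googlebot", page))
    print("OtherBot:", may_crawl("OtherBot", page))
```

Whatever subdirectory the page is in, the robots.txt address always resolves to the root of the domain, which matches the rule quoted above.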
     
  11. ozsubasi

    ozsubasi New Member

    Yes, for example, my site has an admin section that is for my use only and is disallowed by robots.txt.
     
  12. ozsubasi

    ozsubasi New Member

    I think we have exhausted the possibilities on this subject. Thread closed.
     