Go4Expert

Go4Expert (http://www.go4expert.com/)
-   Search Engine Optimization (SEO) (http://www.go4expert.com/forums/seo-forum/)
-   -   what is robot.txt?? (http://www.go4expert.com/forums/what-is-robottxt-t26926/)

mark joshef 13Oct2011 15:51

what is robot.txt??
 
what is robot.txt??

neeraj_77 13Oct2011 15:57

Re: what is robot.txt??
 
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.
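
To make the "unlocked door" point concrete, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules, the URLs, the bot name "ExampleBot" and the helper plan_crawl are all made up for illustration. The thing to notice is that the check runs inside the crawler: a crawler that skips it fetches everything, and nothing on the server side stops it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for an imaginary site (not from any real robots.txt).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def plan_crawl(urls, robots_txt, user_agent, obey=True):
    """Return the URLs a crawler would fetch.

    With obey=False the robots.txt rules are never consulted,
    which is exactly what a badly behaved crawler does.
    """
    if not obey:
        return list(urls)
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if parser.can_fetch(user_agent, u)]

urls = [
    "http://www.example.com/index.html",
    "http://www.example.com/private/report.html",
]

polite = plan_crawl(urls, ROBOTS_TXT, "ExampleBot")            # honours the note
rude = plan_crawl(urls, ROBOTS_TXT, "ExampleBot", obey=False)  # ignores it

print(polite)
print(rude)
```

The polite crawler skips the /private/ page; the rude one does not, which is why really sensitive data needs real access control.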

shabbir 13Oct2011 16:09

Re: what is robot.txt??
 
Pretty much meaningless posts just for signature spam. I had to ban you once again.

harrysom 15Oct2011 12:55

Re: what is robot.txt??
 
The robots.txt file is simply a text file. Its purpose is to tell search engines not to crawl particular pages. Put simply, a search engine will not visit a page of your site if you disallow that page in your robots.txt file.

benivolentsoft 18Oct2011 11:31

Re: what is robot.txt??
 
"Robots.txt is a text file that has a special meaning to the majority of "honorable" robots on the web. By defining a few rules in this text file, you can instruct robots to not crawl and index certain files, directories within your site, or at all. For example, you may not want Google to crawl the /images directory of your site, as it's both meaningless to you and a waste of your site's bandwidth. "Robots.txt" lets you tell Google just that."

bobwarner01 18Oct2011 12:02

Re: what is robot.txt??
 
Robots.txt is a text file, as we can see from its extension. The robots themselves simply react according to the instructions they are given. Some parts of a website are private, and we do not want those parts to be crawled by robots.

TM-Ali 19Oct2011 18:46

Re: what is robots.txt?
 
Robots.txt is a file which is used to give instructions to search engine robots. We can allow or disallow search engine robots for a particular folder or page.

Example:-

User-agent: *
Disallow: /folder-name/
Disallow: /page-name.html
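
A quick way to see what rules like these do is to feed them to Python's urllib.robotparser. This is only a sketch: /folder-name/ and /page-name.html are placeholder paths standing in for the folder and page in the example, and "ExampleBot" and www.example.com are invented for the test.

```python
from urllib.robotparser import RobotFileParser

# Placeholder paths standing in for the folder and page in the example above.
rules = """\
User-agent: *
Disallow: /folder-name/
Disallow: /page-name.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

def allowed(path, agent="ExampleBot"):
    """True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, "http://www.example.com" + path)

paths = ["/folder-name/photo.jpg", "/page-name.html", "/about.html"]
checks = {p: allowed(p) for p in paths}
print(checks)
```

Everything under the disallowed folder and the single disallowed page come back as not allowed; any other page is still open to crawling.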

ozsubasi 2Apr2012 18:08

Re: what is robot.txt??
 
Quote:

Originally Posted by sandrajolly (Post 93930)
Robots.txt file does not improve your search engine positioning.
It provides robots with information concerning which files you will not allow to be crawled and indexed in the search engines.
When the search engine robot crawls your site it looks for the robots.txt file.
If it doesn't find one it assumes automatically that it may crawl and index the entire site.

This allows all robots to crawl all files.
User-agent: *
Disallow:

This disallows all robots from crawling a folder called /cmsbuffet/.
User-agent: *
Disallow: /cmsbuffet/

The clue to where this is copied from is in the reference to "cmsbuffet".
It originally comes from:
http://www.cmsbuffet.com/robots-txt-check.php
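
For anyone who wants to check the two rule sets quoted above, here is a small Python sketch with urllib.robotparser. The helper can_crawl, the bot name and the test paths are my own. The empty rule set is also my addition, used as a rough stand-in for a site with no rules at all; it is not part of the quoted post.

```python
from urllib.robotparser import RobotFileParser

def can_crawl(robots_txt, path, agent="ExampleBot"):
    """Parse the given robots.txt text and test one path (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)

ALLOW_ALL = "User-agent: *\nDisallow:\n"               # empty Disallow: nothing blocked
BLOCK_FOLDER = "User-agent: *\nDisallow: /cmsbuffet/\n"
NO_RULES = ""                                          # assumption: stand-in for "no rules"

results = {
    "allow_all": can_crawl(ALLOW_ALL, "/cmsbuffet/page.html"),
    "block_folder_inside": can_crawl(BLOCK_FOLDER, "/cmsbuffet/page.html"),
    "block_folder_outside": can_crawl(BLOCK_FOLDER, "/index.html"),
    "no_rules": can_crawl(NO_RULES, "/cmsbuffet/page.html"),
}
print(results)
```

Only the path inside /cmsbuffet/ under the second rule set is refused; an empty Disallow line and an empty rule set both leave the whole site crawlable.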

ozsubasi 10Apr2012 17:20

Re: what is robot.txt??
 
The OP was banned for asking this (and other) silly questions, and I think it has been fully answered.
So to anyone else who visits this thread, please read it first and only post to it if you have something new and relevant to say.

ozsubasi 11Apr2012 16:16

Re: what is robot.txt??
 
Quote:

Originally Posted by sachinseo (Post 94143)
robots.txt is the file which doesn't allow crawlers onto a site. For that you need to specify the disallow function in Webmaster Tools, generate the txt file, and upload it to the root directory of your server.

Just to clarify this, these are the instructions from Google:

Generate a robots.txt file using the Create robots.txt tool
1. On the Webmaster Tools Home page, click the site you want.
2. Under Site configuration, click Crawler access.
3. Click the Create robots.txt tab.
4. Choose your default robot access. We recommend that you allow all robots, and use the next step to exclude any specific bots you don't want accessing your site. This will help prevent problems with accidentally blocking crucial crawlers from your site.
5. Specify any additional rules. For example, to block Googlebot from all files and directories on your site:
   - In the Action list, select Disallow.
   - In the Robot list, click Googlebot.
   - In the Files or directories box, type /.
   - Click Add. The code for your robots.txt file will be automatically generated.
6. Save your robots.txt file by downloading the file or copying the contents to a text file and saving as robots.txt. Save the file to the highest-level directory of your site. The robots.txt file must reside in the root of the domain and must be named "robots.txt". A robots.txt file located in a subdirectory isn't valid, as bots only check for this file in the root of the domain. For instance, http://www.example.com/robots.txt is a valid location, but http://www.example.com/mysite/robots.txt is not.

(Source: http://support.google.com/webmasters...&answer=156449)
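
Two points from those instructions can be sketched in Python: where a robot looks for the file, and what a "block Googlebot from everything" rule does. The tool's actual output is not shown in this thread, so the rule text below is only what I would expect such a rule to look like; robots_url and "OtherBot" are names I made up.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

def robots_url(page_url):
    """Return the root-level robots.txt URL for the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

# Assumed form of a rule blocking Googlebot from all files and directories.
GENERATED = "User-agent: Googlebot\nDisallow: /\n"

parser = RobotFileParser()
parser.parse(GENERATED.splitlines())

location = robots_url("http://www.example.com/mysite/page.html")
googlebot_ok = parser.can_fetch("Googlebot", "http://www.example.com/index.html")
otherbot_ok = parser.can_fetch("OtherBot", "http://www.example.com/index.html")

print(location)
print(googlebot_ok, otherbot_ok)
```

Whatever page the robot starts from, the file it checks is the one at the root of the domain, and a rule addressed only to Googlebot leaves other robots unaffected.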


All times are GMT +5.5.