Go4Expert


CircuitX 3Feb2009 03:10

Robots.txt Files
 

Introduction



This is a very basic tutorial about robots.txt files. A lot of people on hackthis.co.uk have trouble with them, so I'm going to do a tutorial here for entry-level hackers all over the web.

So let's get started.

Background



Search engines such as Google or Yahoo discover the websites they show you in their results using a program called a "search bot" or "web crawler". A crawler basically follows links from page to page across the web, and the search engine uses what it finds to build its index.
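As a rough illustration of what a crawler does, here is a toy sketch in Python. It never touches the network: `FAKE_WEB` is a made-up dictionary standing in for a tiny website, and the page names are invented for the example. Real crawlers are far more involved, so treat this as a picture of the idea only.

```python
from collections import deque

# Hypothetical in-memory "web": each page maps to the pages it links to.
FAKE_WEB = {
    "/": ["/about", "/articles"],
    "/about": ["/"],
    "/articles": ["/articles/robots", "/about"],
    "/articles/robots": [],
    "/orphan": [],  # nothing links here, so the crawler never reaches it
}

def crawl(start):
    """Visit every page reachable from start by following links, breadth first."""
    seen = [start]
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in FAKE_WEB.get(page, []):
            if link not in seen:
                seen.append(link)
                queue.append(link)
    return seen

index = crawl("/")
print(index)
```

The page "/orphan" never shows up in the result because no other page links to it. A crawler can only index what it can reach.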

How This Is Related to Hacking



If a website has pages which it doesn't want search engines to find, then it can list the paths that search bots should stay out of in a plain ".txt" file. This file is called "robots.txt".

The pages listed in a "robots.txt" file could potentially contain information such as usernames, passwords, personal details etc. (the information would probably be encrypted, but this isn't a decryption tutorial :p). Keep in mind that "robots.txt" only asks well-behaved robots to stay away; it does not stop anyone from opening the listed pages. So basically, if we find the robots.txt file, then we find a list of secret webpages for a particular site.

The best bit is that there can only be one "robots.txt" file for each website, and it always sits at the top level of the site. So say it was for go4expert. The URL would be: http://www.go4expert.com/robots.txt
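Since the file always lives at the top level of the site, its address can be worked out from any page URL. Here is a minimal sketch using Python's standard `urllib.parse`; the function name `robots_url` is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt URL for the site that page_url belongs to."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, swap in /robots.txt, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://www.go4expert.com/articles/robotstxt-files-t16034/"))
# prints http://www.go4expert.com/robots.txt
```

Strictly speaking the file is per host name, so a subdomain would have its own "robots.txt" at its own top level.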

The "robots.txt" would look a bit like this:
Code:

User-agent: *
Disallow: /

The "*" means the rule applies to every robot, and the "/" tells the search robots to ignore any page on this website.
However, it could look like this:
Code:

User-agent: *
Disallow: /usernamespasswords.txt

This would mean that search engines ignore just one page...
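To see how a well-behaved robot would read the two examples above, here is a small sketch using Python's standard `urllib.robotparser`. The helper name `allowed`, the robot name "ExampleBot" and the page "/index.html" are made up for the illustration. The second file is written here as "/usernamespasswords.txt" with a leading slash, which is an assumption about what the example is meant to say, since Disallow paths are matched from the root of the site.

```python
from urllib.robotparser import RobotFileParser

# The two example files from the tutorial, as plain strings.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
# Assumed spelling and leading slash for the second example.
BLOCK_ONE = "User-agent: *\nDisallow: /usernamespasswords.txt\n"

def allowed(robots_txt, path, agent="ExampleBot"):
    """Return True if a polite robot called agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)

print(allowed(BLOCK_ALL, "/index.html"))              # False: everything is off limits
print(allowed(BLOCK_ONE, "/index.html"))              # True: only one file is listed
print(allowed(BLOCK_ONE, "/usernamespasswords.txt"))  # False: this is the listed file
```

Note that this only tells you what a polite robot would skip. Nothing in "robots.txt" actually blocks a browser or a rude robot from requesting the page.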

Summary and Further Reading



robotstxt.org - all about "robots.txt" files.
hackthis.co.uk - a great website to learn hacking in a legal, user-friendly environment. Main Level 7 is all about "robots.txt".

Remember guys: KEEP IT LEGAL

DISCLAIMER - I WILL NOT BE HELD RESPONSIBLE FOR THE ACTIONS OF ANYONE WHO READS THIS TUTORIAL. IT IS FOR EDUCATIONAL PURPOSES ONLY.

shabbir 3Feb2009 07:37

Re: Robots.txt Files
 
Good to see your first article, but I would have preferred this to be in Search Engines, so I have moved it there and left a permanent redirect in the Hacking forum as well.

CircuitX 3Feb2009 13:20

Re: Robots.txt Files
 
Quote:

Originally Posted by shabbir (Post 42339)
Good to see your first article, but I would have preferred this to be in Search Engines, so I have moved it there and left a permanent redirect in the Hacking forum as well.

Ok, sorry about that :nice:.

shabbir 3Feb2009 14:01

Re: Robots.txt Files
 
Quote:

Originally Posted by CircuitX (Post 42350)
Ok, sorry about that :nice:.

I guess it is just a matter of more or less relevancy. There is nothing to feel sorry about.

stephen186 21Feb2009 12:48

Re: Robots.txt Files
 
Anyway, even if it is not supposed to be posted here, I think some newbie webmasters should also know this. Maybe they can at least prevent their secure pages from being hacked this way.

CircuitX 21Feb2009 15:40

Re: Robots.txt Files
 
Quote:

Originally Posted by stephen186 (Post 43255)
Anyway, even if it is not supposed to be posted here, I think some newbie webmasters should also know this. Maybe they can at least prevent their secure pages from being hacked this way.

Ha, I'm a bit of a newbie webmaster myself. I haven't bothered to sort out security yet. I think I'll wait till it's finished.

So if you guys want to hack someone - hack me!!! :p

stephen186 21Feb2009 16:10

Re: Robots.txt Files
 
Well, that's good for you. When I started on the internet, I did not know this. I only came to know of all this stuff after reading and participating in forums like this. Even now, I am still learning day by day.

shabbir 4Mar2009 09:56

Re: Robots.txt Files
 
Nominate this article for Article of the month for February 2009

imrantechi 17Mar2009 10:27

Re: Robots.txt Files
 
Will use it in the right way...

shabbir 17Mar2009 12:16

Re: Robots.txt Files
 
Vote for this article for Article of the month February 2009


All times are GMT +5.5.