Finding all pages within a site?

da6shar's Avatar, Join Date: Nov 2006
Newbie Member
How would I go about finding all "/_____.htm" pages within a site, where '_____' would be replaced by the page's filename?

For example, say all I knew was "google.com". How would I find every ".htm" page that existed on Google.com, such as "google.com/index.htm", "google.com/games.htm", etc.?

Is there a program that can do this for me? Thanks!
jamieplucinski's Avatar
Newbie Member
There are several web spider tools and site backup tools that follow the links on each page to build a site map. Beyond that, I don't see any other option short of: A) finding a site whose host has turned directory listing on and not set a default document (always fun); B) trying common names for site map files, which could be filled with goodies; or C) looking at the source of a web page, where you'll often find includes (CSS, IFRAMEs, JS) or HTML comments that can lead you to potential treasure troves. You could also try a Google search for inurl:whatever.com (or site:whatever.com, which restricts results to that domain). That brings up a list of everything Google has found and indexed there in its extensive site crawling.
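For anyone who would rather script the link-following approach than install a tool, here is a rough sketch of what such a spider does, using only the Python standard library. The names `LinkParser`, `fetch_page` and `find_htm_pages` are made up for this example, and the `max_pages` cap is an arbitrary assumption. It only finds pages that are reachable by links from the start page, so orphaned files will not show up.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_page(url):
    """Download a page and return its text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def find_htm_pages(start_url, fetch=fetch_page, max_pages=200):
    """Breadth-first crawl of one host; returns the sorted .htm URLs that loaded."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    htm_pages = set()
    fetched = 0

    while queue and fetched < max_pages:
        url = queue.popleft()
        fetched += 1
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load (404s, timeouts, ...)

        # Only record a page once it has actually been retrieved.
        if urlparse(url).path.lower().endswith(".htm"):
            htm_pages.add(url)

        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment.
            link, _fragment = urldefrag(urljoin(url, href))
            # Stay on the same host and never queue a URL twice.
            if urlparse(link).netloc != host or link in seen:
                continue
            seen.add(link)
            queue.append(link)

    return sorted(htm_pages)
```

Calling `find_htm_pages("http://www.example.com/")` would return a sorted list of the `.htm` URLs it managed to load on that host. The `fetch` argument is there so a different downloader (or a fake one for testing) can be swapped in. A real crawl should also pause between requests and honor the site's robots.txt, which this sketch does not do.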
carastas's Avatar, Join Date: Nov 2006
Newbie Member
You can use something like Xenu link checker (http://home.snafu.de/tilman/xenulink.html). It will create a sitemap for that site and check for 404 errors, and it's free.

George
evileye's Avatar, Join Date: Jan 2007
Contributor
inurl:www.site.com

will do the trick!

Here www.site.com is the site you want to search. This only works if the site has been indexed by Googlebot.
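If you want to build that search as a link rather than typing it, something like the following should do. It is only a small sketch: `google_query_url` is a made-up helper name, and it assumes the usual `https://www.google.com/search?q=...` address format.

```python
from urllib.parse import urlencode


def google_query_url(domain, operator="inurl"):
    """Build a Google search URL for a query such as inurl:www.site.com."""
    # No space between the operator's colon and the domain.
    query = f"{operator}:{domain}"
    # urlencode percent-escapes the colon and any other special characters.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(google_query_url("www.site.com"))
```

Passing `operator="site"` builds the equivalent site: query instead.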