Go4Expert

Go4Expert (http://www.go4expert.com/)

noviceprogrammer 15Jun2007 09:05

How to build a search bot
 
Can someone please tell me how to build a search bot like the one used by www.pricegrabber.com? It has to be able to data-mine selected websites.

Thank you,
novice

shabbir 15Jun2007 09:17

Re: How to build a search bot
 
What is special about that search bot? It just crawls websites and keeps some data from the crawl. I'd suggest you start with crawling the websites and then move on to parsing what you fetched.

noviceprogrammer 16Jun2007 01:37

Re: How to build a search bot
 
Quote:

Originally Posted by shabbir
What is special about that search bot? It just crawls websites and keeps some data from the crawl. I'd suggest you start with crawling the websites and then move on to parsing what you fetched.

The bot would search different websites and pull specific information that it would then store in a database. Is this possible? For example, if I typed in Panasonic 50" TVs, it would search Circuit City and store the prices in a database.

shabbir 16Jun2007 08:51

Re: How to build a search bot
 
So to start with, concentrate on getting the website pages into your database; after that you can parse them.
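
As a rough illustration of "getting the pages into your database", here is a minimal sketch in Python. It is only an assumption of how this could look: it uses SQLite from the standard library as a stand-in for whatever database you actually run, and the table name `pages`, its columns, and the helper names `open_store` and `save_page` are made up for the example. It also stores a hash of each page so a later step can tell whether the page changed since the last fetch.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the page store. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " url TEXT PRIMARY KEY,"
        " html TEXT NOT NULL,"
        " content_hash TEXT NOT NULL,"
        " fetched_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def save_page(conn, url, html):
    """Insert or update a page; return True if the content is new or changed."""
    new_hash = hashlib.sha256(html.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT content_hash FROM pages WHERE url = ?", (url,)
    ).fetchone()
    if row is not None and row[0] == new_hash:
        return False  # same content as last time, nothing to re-parse
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, html, content_hash) VALUES (?, ?, ?)",
        (url, html, new_hash),
    )
    conn.commit()
    return True
```

The return value is there so that the parsing step only has to run for pages that actually changed.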

noviceprogrammer 16Jun2007 18:36

Re: How to build a search bot
 
Quote:

Originally Posted by shabbir
So to start with, concentrate on getting the website pages into your database; after that you can parse them.


Alright, thanks shabbir! One problem though, how do I do that? Do you have any links or good examples in mind?

Thanks again, you're a great help.

shabbir 17Jun2007 08:50

Re: How to build a search bot
 
You should know the basics of the language, i.e. opening a file on a web server and getting the HTML of that file ...
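
To make "getting the HTML of the file" concrete, here is a minimal sketch in Python using only the standard library. The function name `fetch_html`, the User-Agent string, and the 10 second timeout are assumptions chosen for the example, not anything specific to a particular site. Check a site's terms and robots.txt before pointing a bot at it.

```python
import urllib.request


def fetch_html(url, timeout=10.0):
    """Download one page and return its HTML as text."""
    request = urllib.request.Request(
        url,
        # Hypothetical bot name; identify your own bot honestly.
        headers={"User-Agent": "example-price-bot/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_html("http://www.example.com/")` would return the page source as one string, which is what you would then store or parse.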

noviceprogrammer 19Jun2007 06:21

Re: How to build a search bot
 
Shabbir, I appreciate your help with this topic. I know how to take HTML and put it into a MySQL database, and I know how to parse databases with SQL commands, but I need a way to constantly keep this information up to date. I need something that will automatically parse the data on a daily basis, because I will not be able to manually upload the files into my database.

shabbir 19Jun2007 08:24

Re: How to build a search bot
 
If you are good at HTML parsing, then your job is half done. What I mean by HTML parsing is not putting the whole page source into the database; only the visible content should be stored. It is like what you get when you do a select-all and copy in the browser.
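
A minimal sketch of that "visible content only" idea, in Python with the standard-library `html.parser` module. The class and function names are made up for the example, and the list of skipped tags is an assumption; it only approximates what a browser select-all would copy (it does not know about CSS-hidden elements, for instance).

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping tags whose content is never displayed."""

    SKIPPED_TAGS = {"script", "style", "noscript", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def visible_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The result is plain text with the markup, scripts, and styles stripped, which is much easier to search for prices than the raw source.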

You will need bots that cache the data at regular intervals using cron jobs. When a change is found in the database, another program or cron job parses the data. The parser should be written generically rather than around the content of one particular website. For example, if you are looking for a price, find the word "price" and then look for a $ sign.
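
The "find the word price and then look for $" rule could be sketched like this in Python. The regular expression, the 40-character window between the word and the dollar sign, and the function name `find_prices` are all assumptions for illustration; real pages will need tuning.

```python
import re
from decimal import Decimal

# The word "price", then up to 40 non-digit characters, then a dollar
# amount such as $499, $499.99 or $1,299.99.
PRICE_PATTERN = re.compile(
    r"price\D{0,40}?\$\s*((?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d{1,2})?)",
    re.IGNORECASE,
)


def find_prices(text):
    """Return every dollar amount that follows the word 'price' in text."""
    return [
        Decimal(match.group(1).replace(",", ""))
        for match in PRICE_PATTERN.finditer(text)
    ]
```

Run on the visible text of a page, this returns a list of amounts and does not depend on any one site's markup. For the daily run, a crontab entry along the lines of `0 3 * * * python3 /path/to/price_bot.py` (a hypothetical script path) would start the bot at 03:00 every day.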


All times are GMT +5.5.