How to build a search bot

Discussion in 'PHP' started by noviceprogrammer, Jun 15, 2007.

  1. noviceprogrammer

    noviceprogrammer New Member

    Joined:
    Jun 15, 2007
    Messages:
    4
    Likes Received:
    0
    Trophy Points:
    0
    Can someone please tell me how to build a search bot like the one used by www.pricegrabber.com? It has to be able to data-mine selected websites.

    Thank you,
    novice
     
  2. shabbir

    shabbir Administrator Staff Member

    Joined:
    Jul 12, 2004
    Messages:
    15,375
    Likes Received:
    388
    Trophy Points:
    83
    What is special about that search bot? It just crawls websites and builds its data from what it crawls. I suggest you start with crawling the websites and then move on to parsing the pages.
     
  3. noviceprogrammer

    The bot would search different websites and pull specific information that it would then store in a database. Is this possible? For example, if I typed in Panasonic 50" TVs, it would search Circuit City and store the prices in a database.
     
  4. shabbir

    So to start with, concentrate on getting the website pages into your database; then you can parse them.
     
  5. noviceprogrammer

    Alright, thanks shabbir! One problem though: how do I do that? Do you have any links or good examples in mind?

    Thanks again, you're a great help.
     
  6. shabbir

    You should know the basics of the language, i.e. opening a file on a web server and getting its HTML ...
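
    As a rough illustration of that first step, here is a minimal sketch that downloads a page and returns its HTML as text. It is written in Python rather than PHP only to keep it short; the function name `fetch_html` and the User-Agent string are made up for this example, and a real bot would also need error handling and polite rate limiting.

```python
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10.0):
    """Download `url` and return its body decoded as text.

    The User-Agent value is a placeholder chosen for this example.
    """
    request = Request(url, headers={"User-Agent": "example-price-bot/0.1"})
    with urlopen(request, timeout=timeout) as response:
        # Use the charset declared by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

    Calling `fetch_html` with the address of a product page gives you the raw markup, which you can then store or hand to a parser.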
     
  7. noviceprogrammer

    Shabbir, I appreciate your help with this topic. I know how to take HTML and put it into a MySQL database, and I know how to query databases with SQL commands, but I need a way to constantly keep this information up to date. I need something that will automatically parse the data on a daily basis, since I will not be able to manually upload the files into my database.
     
  8. shabbir

    If you are good at HTML parsing, then your job is half done. By HTML parsing I do not mean putting the whole markup into the database; only the visible content should be stored, similar to what you get when you select all and copy in the browser.
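
    To make "visible content only" concrete, here is a small sketch using Python's standard `html.parser` module (Python instead of PHP just for brevity). The class and function names are invented for this example, and skipping only `script`, `style`, and `title` is a simplifying assumption; it will not catch text hidden with CSS.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, ignoring tags whose content is never displayed."""

    SKIPPED_TAGS = {"script", "style", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return " ".join(self._chunks)


def visible_text(html):
    """Return the text a reader would see, joined by single spaces."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()
```

    The string returned by `visible_text` is what would go into the database instead of the full markup.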

    You will need bots that cache the data at regular intervals using cron jobs. When a change is found in the database, another program or cron job parses the data. The parser should be written generically rather than tailored to the content of one website. For example, if you are looking for a price, find the word "price" and then look for a $ sign.
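
    Here is a rough sketch of both ideas, again in Python for brevity. `find_prices` applies the "find the word price, then look for $" heuristic to already extracted text, and `has_changed` compares a hash of the page text with the one stored from the previous run so the parser only does work when something changed. All names and the 40-character window between "price" and "$" are assumptions made for this example, not a tested rule for any particular shop.

```python
import hashlib
import re
from decimal import Decimal

# The word "price", then at most 40 characters that are not "$",
# then a dollar amount such as $999 or $1,299.00.
PRICE_PATTERN = re.compile(
    r"price[^$]{0,40}?\$\s*([0-9][0-9,]*(?:\.[0-9]{2})?)",
    re.IGNORECASE,
)


def find_prices(text):
    """Return every dollar amount that follows the word 'price'."""
    return [
        Decimal(match.group(1).replace(",", ""))
        for match in PRICE_PATTERN.finditer(text)
    ]


def content_fingerprint(text):
    """Hash the page text so two crawls can be compared cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, text):
    """True when the text differs from what the stored fingerprint describes."""
    return previous_fingerprint != content_fingerprint(text)
```

    A daily run could then be scheduled with a crontab entry such as `0 3 * * * python3 /path/to/update_prices.py`, where the script path is hypothetical and the script would fetch each page, check `has_changed`, and store the output of `find_prices`.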
     
