Parsing HTML in PHP
Parsing HTML has always been a tough nut to crack, even for seasoned programmers, but it is now widely used for scraping websites, crawling, error detection on websites, and many other useful purposes. In this article we'll look at parsing HTML using PHP. For this purpose I have selected Simple HTML DOM Parser, which I found easier to use than PHP's own DOMDocument parser. Simple HTML DOM Parser lets you work in an object-oriented manner, and its code is much easier to follow and implement.
Download the Simple HTML DOM Parser class PHP file from http://sourceforge.net/projects/simplehtmldom/files/ and save it to any directory of your choice. That's all you need to do.
In a small example, we'll include the class and get all the hyperlinks on the go4expert.com homepage.
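A minimal sketch of that example follows. It assumes the downloaded file is named simple_html_dom.php and sits in the same directory as the script, and the exact homepage URL is also an assumption, so adjust both to your setup. It uses the library's file_get_html() helper to load the page and find() to collect the anchor elements.

```php
<?php
// Assumption: simple_html_dom.php was saved next to this script.
require_once __DIR__ . '/simple_html_dom.php';

// Assumption: the exact form of the homepage URL; change it as needed.
$url = 'http://www.go4expert.com/';

// file_get_html() downloads the page and builds a DOM object from it.
// Recent versions of the library return false when loading fails.
$html = file_get_html($url);
if (!$html) {
    exit("Could not load $url\n");
}

// find('a') returns an array containing every <a> element in the document.
foreach ($html->find('a') as $link) {
    // Attributes are exposed as properties; a missing href evaluates as false.
    if ($link->href) {
        echo $link->href, "\n";
    }
}

// Release the memory held by the DOM object once we are done with it.
$html->clear();
unset($html);
```

Run it from the command line with `php links.php` (the script name is only an example) to print one link per line.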
You can see how easy this was; now you can explore your own ideas.
Next, we'll look at using selectors to find specific elements and at traversing the DOM tree.
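The sketch below illustrates both ideas on a small inline document, so it does not depend on any live website. The markup, element ids, and class names are made up for illustration, and the include path is again an assumption. It uses str_get_html() to parse a string, CSS-style selectors with find(), and the traversal methods parent(), children(), first_child(), next_sibling(), and prev_sibling().

```php
<?php
// Assumption: simple_html_dom.php was saved next to this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical sample markup, used only for this illustration.
$markup = <<<HTML
<div id="menu">
  <ul class="nav">
    <li><a href="/home">Home</a></li>
    <li><a href="/forums" class="active">Forums</a></li>
  </ul>
</div>
<div id="content"><p>First paragraph</p><p>Second paragraph</p></div>
HTML;

// str_get_html() builds the DOM object from a string instead of a URL.
$html = str_get_html($markup);

// Descendant selector: every link inside the list with class "nav".
foreach ($html->find('ul.nav li a') as $a) {
    echo $a->plaintext, ' => ', $a->href, "\n";
}

// Attribute selector; passing an index as the second argument returns
// a single element (or null when nothing matches) instead of an array.
$active = $html->find('a[class=active]', 0);
if ($active !== null) {
    // Traversal: step up to the enclosing <li>, then across to the <li> before it.
    $li       = $active->parent();
    $previous = $li->prev_sibling();
    echo 'Link before the active one: ', $previous->first_child()->href, "\n";
}

// Id selector combined with traversal of the element's children.
$content = $html->find('div#content', 0);
foreach ($content->children() as $child) {
    echo $child->tag, ': ', $child->plaintext, "\n";
}

// Method chaining: the second paragraph inside the content block.
echo $html->find('div#content', 0)->first_child()->next_sibling()->plaintext, "\n";

// Free the memory held by the DOM object.
$html->clear();
```

Because find() with an index can return null, it is worth checking the result before chaining further calls on markup you do not control.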
Well, this should be enough to get you started; you can adapt the method chaining to suit your needs. Enjoy!