Nextri 23Jun2007 15:33

Need regex help
I'm making a script that takes a URL, grabs the page, and then searches the page for links to a specific site. I want to capture the keywords used on those links.

Regex is not my strong suit. Is anyone able to help me out here?

First, find links where the href equals 'domain.com',

then find the keyword(s) each of those links is anchored with,

using preg_match_all.

pradeep 24Jun2007 18:30

Re: Need regex help
Keywords are found in a meta tag, like this one:

HTML Code:

<meta name="keywords" content="php,perl,javascript">
You can get this using the following regex:

PHP Code:

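$content = 'the html page content';
// Lazy (.*?) stops at the first closing quote; \s+ allows any whitespace between attributes
preg_match('/<meta\s+name=["\']?keywords["\']?\s+content=["\'](.*?)["\']\s*\/?>/i', $content, $matches);
// On success, $matches[1] holds the keyword string, e.g. "php,perl,javascript"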

Nextri 26Jun2007 19:13

Re: Need regex help
Not exactly what I had in mind.

I don't want to find the meta tag keywords.
I want to extract all <a> links on a page that link to a given URL.
Then I want to know what keyword is between the <a href="http://domain.com"> and the </a>,

regardless of whether the link has other attributes like target, class or id, and whether it uses " or ' around the values.

pradeep 26Jun2007 19:21

Re: Need regex help
Well, then use this: http://php-html.sourceforge.net/
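
If a full parser is more than you need, here is a rough preg_match_all sketch of the request in this thread: find every <a> whose href points at a given domain, whatever other attributes it carries and whichever quote style it uses, and return the text between the opening tag and </a>. The function name linkTextsForDomain and the sample HTML are made up for illustration. It assumes absolute links (http://, https:// or //), treats an optional "www." prefix as the same site, and will not cope with unquoted href values or badly broken markup.

```php
<?php
// Hypothetical helper: anchor text of every <a> whose href points at $domain.
function linkTextsForDomain(string $html, string $domain): array
{
    // Escape the domain so the dot in "domain.com" is matched literally.
    $host = preg_quote($domain, '~');

    $pattern = '~<a\s[^>]*?\shref\s*=\s*'   // <a ... href= (other attributes may come first)
        . '(["\'])'                          // group 1: opening quote, " or '
        . '(?:https?:)?//(?:www\.)?' . $host // scheme, optional www., then the domain
        . '(?:[/?#][^"\']*)?'                // optional path, query string or fragment
        . '\1'                               // the same quote closes the value
        . '[^>]*>'                           // any attributes after href
        . '(.*?)</a>~is';                    // group 2: everything up to </a>

    // Prepend a space so "<a href=" and "<a  href=" both satisfy \s...\shref.
    $html = preg_replace('~<a\s+~i', '<a  ', $html);

    preg_match_all($pattern, $html, $matches);

    // Drop nested tags such as <b> and trim whitespace from each link text.
    return array_map(function ($text) {
        return trim(strip_tags($text));
    }, $matches[2]);
}

$html = <<<'HTML'
<a href="http://domain.com">php</a>
<a class='x' target="_blank" href='http://www.domain.com/page'>regex <b>help</b></a>
<a href="http://other.com">not this one</a>
HTML;

print_r(linkTextsForDomain($html, 'domain.com'));
```

With the sample above, only the first two links should come back; the link to other.com is skipped because the domain does not match.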

