I'm making a script that fetches a URL, grabs the page, and looks through it for links to a specific site. I want to capture the keywords on those links. Regex is not my strong suit. Is anyone able to help me out here? First find the links whose href equals 'domain.com', then find the keyword(s) each of those links is attached to, using preg_match_all.
Keywords are found in a meta tag, like this one:
HTML: <meta name="keywords" content="php,perl,javascript">
You can get them using the following regex:
PHP: $content = 'the html page content'; preg_match('/<meta\s+name=["\']?keywords["\']?\s+content=["\'](.*?)["\']\s*\/?>/i', $content, $matches); // $matches[1] contains the keyword list
Not exactly what I had in mind. I don't want to find the meta tag keywords. I want to extract all <a> links on a page that link to a given URL. Then I want to know what keyword is between the <a href="http://domain.com"> and </a>, regardless of whether the link has other attributes like target, class or id, and whether it uses " or ' around them.
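Something along these lines might work as a starting point for that. It is only a sketch: the function name find_link_keywords is made up for illustration, and it assumes the href is always quoted, that the link points at the bare domain or its www. variant (other subdomains are ignored), and that the anchor markup is reasonably well formed.

```php
<?php
// Hypothetical helper (name is illustrative): returns the anchor text of
// every <a> element whose href points at $domain.
function find_link_keywords(string $html, string $domain): array
{
    // Escape the domain so the dot is matched literally inside the pattern.
    $host = preg_quote($domain, '~');

    $pattern = '~<a\s+(?:[^>]*?\s)?'          // <a plus any attributes before href
        . 'href\s*=\s*(["\'])'                // href= with " or ' captured as group 1
        . '(?:https?:)?//(?:www\.)?' . $host  // optional scheme and www., then the domain
        . '(?:[/?#][^"\'>]*)?'                // optional path, query string or fragment
        . '\1[^>]*>'                          // same closing quote, any trailing attributes
        . '(.*?)</a>~is';                     // group 2: everything up to the closing </a>

    preg_match_all($pattern, $html, $matches);

    // Drop nested tags such as <b> or <img> and trim surrounding whitespace.
    return array_map(function ($text) {
        return trim(strip_tags($text));
    }, $matches[2]);
}

// Example with a literal page; in a real script $html could come from
// file_get_contents($url) or cURL.
$html = '<a href="http://domain.com">php</a> '
      . '<a class="x" href=\'http://domain.com/page\' target="_blank"><b>perl</b> scripts</a> '
      . '<a href="http://other.com">ignored</a>';

print_r(find_link_keywords($html, 'domain.com')); // php, perl scripts
```

Regexes tend to break on unusual HTML (unquoted attributes, a > inside an attribute value, and so on), so if the pages are messy, loading them with DOMDocument::loadHTML and looping over getElementsByTagName('a') is likely the more robust route.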