HTML tags and attributes

Posted: Thu Feb 01, 2007 4:20 pm
by alex.barylski
I have some regex:

Code: Select all

// Update the meta keywords
$matches = array();
preg_match('%<meta.*?content="(.*?)".+?name="keywords".*?>%si', $page_contents, $matches);
if (isset($matches[1])) {
	$orig_file_contents = preg_replace('%(<meta.*?content=").*?(".+?name="keywords".*?>)%si', '\\1'.$matches[1].'\\2', $orig_file_contents);
}

$matches = array();
preg_match('%<meta.*?name="keywords".+?content="(.*?)".*?>%si', $page_contents, $matches);
if (isset($matches[1])) {
	$orig_file_contents = preg_replace('%(<meta.*?name="keywords".+?content=").*?(".*?>)%si', '\\1'.$matches[1].'\\2', $orig_file_contents);
}
I assume there are two of basically the same thing because it's considering both:

<meta name="keywords" content="Some keywords" />
<meta content="Some keywords" name="keywords" />

Is there a way to join these two into a single statement? I think the doubling of the above two is causing some weird problems with the meta tags, with the keywords replacing the descriptions, etc.

Any help appreciated :)
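
One possible source of the description mix-up: with the `s` modifier, `.` also matches newlines, so `.+?` is free to run past the end of one tag and into the next. A minimal sketch of that effect, assuming a page where a description tag precedes the keywords tag (the sample `$page_contents` below is made up for illustration):

```php
<?php
// Hypothetical sample input: description tag first, keywords tag second.
$page_contents = '<meta name="description" content="A description" />' . "\n"
               . '<meta name="keywords" content="one, two" />';

// First pattern from the post (content attribute before name attribute).
$matches = array();
preg_match('%<meta.*?content="(.*?)".+?name="keywords".*?>%si', $page_contents, $matches);

// The match starts at the description tag and stretches into the keywords tag,
// so the captured value is the description text, not the keywords.
var_dump($matches[1]);
```

If that is what is happening, restricting the wildcards to a single tag (for example `[^>]*` instead of `.*?`) should keep each match inside one `<meta ...>` element.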

Posted: Thu Feb 01, 2007 5:24 pm
by John Cartwright
I actually spent about 30 minutes trying to solve this one using lookaheads but failed miserably. You can always be lazy and do something like:

Code: Select all

#meta\s+(\w+)="([^"]+)"\s+(\w+)="([^"]+)"#i
Someone can definitely come up with something better, of that I am sure.
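
A rough sketch of how the "lazy" pattern above might be used: it captures two attribute/value pairs in whatever order they appear, and the PHP code then decides which one is `name` and which is `content`. The function name `find_keywords` is hypothetical, and the sketch assumes exactly two adjacent, double-quoted, non-empty attributes per tag:

```php
<?php
// Hypothetical helper (not from the thread): returns the keywords string,
// or null when no keywords meta tag is found.
function find_keywords($html) {
    $pattern = '#meta\s+(\w+)="([^"]+)"\s+(\w+)="([^"]+)"#i';
    $tags = array();
    if (!preg_match_all($pattern, $html, $tags, PREG_SET_ORDER)) {
        return null;
    }
    foreach ($tags as $tag) {
        // Map the two captured attribute names to their values.
        $attrs = array(
            strtolower($tag[1]) => $tag[2],
            strtolower($tag[3]) => $tag[4],
        );
        if (isset($attrs['name'], $attrs['content'])
            && strtolower($attrs['name']) === 'keywords') {
            return $attrs['content'];
        }
    }
    return null;
}
```

Because the order check happens in PHP rather than in the regex, a single pattern covers both `name ... content` and `content ... name`.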

Posted: Sat Feb 03, 2007 6:04 pm
by Ciprian
A not-so-nice match could be:

Code: Select all

'/<meta.*(name="keywords")?.*content="([^"]+)".*(name="keywords")?/i'
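
Note that both `(name="keywords")?` groups in the pattern above are optional, so it also matches meta tags that have no keywords name at all. A lookahead can make the name mandatory while still allowing either attribute order. The following is only a sketch under some assumptions: attributes are double-quoted, no `>` appears inside attribute values, and PHP 5.3+ is available for the closure. The helper names `get_meta_keywords` and `set_meta_keywords` are made up for illustration:

```php
<?php
// Hypothetical helper: read the keywords from a page, or null if absent.
function get_meta_keywords($html) {
    // The lookahead requires name="keywords" somewhere inside this one tag.
    $pattern = '%<meta\b(?=[^>]*\bname="keywords")[^>]*?\bcontent="([^"]*)"[^>]*>%i';
    $m = array();
    return preg_match($pattern, $html, $m) ? $m[1] : null;
}

// Hypothetical helper: replace the keywords content in a page.
function set_meta_keywords($html, $keywords) {
    $pattern = '%(<meta\b(?=[^>]*\bname="keywords")[^>]*?\bcontent=")[^"]*("[^>]*>)%i';
    // A callback keeps "\1" or "$1" sequences inside $keywords from being
    // interpreted as backreferences in the replacement.
    return preg_replace_callback($pattern, function ($m) use ($keywords) {
        return $m[1] . $keywords . $m[2];
    }, $html);
}
```

Usage would then be a single read and a single write, e.g. `$orig_file_contents = set_meta_keywords($orig_file_contents, get_meta_keywords($page_contents));` after checking that the read did not return null.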