HTML tags and attributes

Any questions involving matching text strings to patterns - the pattern is called a "regular expression."

alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Post by alex.barylski »

I have some regex:

Code:

// Update the meta keywords
$matches = array();
preg_match('%<meta.*?content="(.*?)".+?name="keywords".*?>%si', $page_contents, $matches);
if (isset($matches[1])) {
	$orig_file_contents = preg_replace('%(<meta.*?content=").*?(".+?name="keywords".*?>)%si', '\\1'.$matches[1].'\\2', $orig_file_contents);
}

$matches = array();
preg_match('%<meta.*?name="keywords".+?content="(.*?)".*?>%si', $page_contents, $matches);
if (isset($matches[1])) {
	$orig_file_contents = preg_replace('%(<meta.*?name="keywords".+?content=").*?(".*?>)%si', '\\1'.$matches[1].'\\2', $orig_file_contents);
}

I assume there are two nearly identical blocks because the code has to handle both attribute orders:

<meta name="keywords" content="Some keywords" />
<meta content="Some keywords" name="keywords" />

Is there a way to join these two into a single statement? I think having the two near-duplicate blocks is causing some weird problems with the keywords meta tags, such as replacing the ones for descriptions, etc.

Any help appreciated :)
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Post by John Cartwright »

I actually spent about 30 minutes trying to solve this one using lookaheads but failed miserably. You can always be lazy and do something like

Code:

#meta\s+(\w+)="([^"]+)"\s+(\w+)="([^"]+)"#i

Someone will definitely come up with something better, of that I am sure.
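
To show how the "lazy" pattern above could be put to use, here is a minimal sketch that captures both attribute/value pairs and then checks which one is the name and which is the content, so the attribute order stops mattering. The function name find_meta_keywords is hypothetical, the pattern has a leading "<" added, and the sketch assumes PHP 7.1+ and tags that carry exactly two double-quoted attributes.

```php
<?php
// Hypothetical helper (not from the original posts): returns the content of
// the keywords meta tag regardless of attribute order, or null if not found.
function find_meta_keywords(string $html): ?string
{
    // The two-attribute pattern from the post above, with a leading "<" added.
    $pattern = '#<meta\s+(\w+)="([^"]+)"\s+(\w+)="([^"]+)"#i';

    if (!preg_match_all($pattern, $html, $tags, PREG_SET_ORDER)) {
        return null; // no two-attribute meta tags at all
    }

    foreach ($tags as $tag) {
        // Map attribute names to values so the order no longer matters.
        $attrs = array(
            strtolower($tag[1]) => $tag[2],
            strtolower($tag[3]) => $tag[4],
        );
        if (isset($attrs['name'], $attrs['content'])
            && strtolower($attrs['name']) === 'keywords') {
            return $attrs['content'];
        }
    }

    return null;
}
```

Because \w+ does not match a hyphen, tags such as http-equiv ones are simply skipped by this pattern, and tags with a third attribute between the two are not handled.
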
Ciprian
Forum Newbie
Posts: 4
Joined: Fri Feb 02, 2007 10:15 pm

Post by Ciprian »

A not-so-nice match could be:

Code:

'/<meta.*(name="keywords")?.*content="([^"]+)".*(name="keywords")?/i'
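
One way the two original statements might be merged is with a lookahead, sketched below. Part of the "weird problems" likely comes from `.*?` combined with the `s` flag: it can run past the end of one tag, so a pattern starting at the description tag can still reach name="keywords" in the next tag. The sketch uses [^>]* to stay inside a single tag and a lookahead to require name="keywords" anywhere in it. The function name copy_meta_keywords is hypothetical, and double-quoted attributes with no ">" inside values are assumed.

```php
<?php
// Hypothetical helper (not from the original posts): copies the keywords
// content found in $source into the keywords meta tag(s) of $target.
function copy_meta_keywords(string $source, string $target): string
{
    // The lookahead requires name="keywords" somewhere inside this tag;
    // [^>]* keeps every part of the match within a single tag.
    $pattern = '%(<meta\b(?=[^>]*\bname="keywords")[^>]*?\bcontent=")([^"]*)("[^>]*>)%i';

    if (!preg_match($pattern, $source, $m)) {
        return $target; // no keywords tag in the source, nothing to copy
    }
    $keywords = $m[2];

    // A callback avoids "$" or "\" in the keywords being read as backreferences.
    return preg_replace_callback($pattern, function ($t) use ($keywords) {
        return $t[1] . $keywords . $t[3];
    }, $target);
}
```

Called as copy_meta_keywords($page_contents, $orig_file_contents), this would stand in for both preg_match/preg_replace pairs from the first post, since the same pattern accepts either attribute order.
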