Trying to remove any HTML head and body information

hismightiness
Forum Newbie
Posts: 4
Joined: Sun Aug 17, 2003 10:36 pm
Location: Florida

Post by hismightiness »

I do not develop in PHP as often as most of you, so I am hoping that someone might help me with this, as I am not sure that this is the best way to go about it. I am trying to remove ALL information in the HEAD of the HTML page, as well as the BODY tag and closing BODY & HTML tags. Here is what I am attempting to do:

Code: Select all

# $FileText is the HTML content loaded into a variable

	# remove any HTML body and header information for rewrite
	# (preg_match is used because ereg was removed in PHP 7)
	if(preg_match('/(<body[^>]*>)/',$FileText,$BodyMatch)){
		# split once on the full opening BODY tag; [1] is everything after it
		$arrFileText = explode($BodyMatch[1],$FileText,2);
		$FileText = str_replace('</body>','',$arrFileText[1]);
		$FileText = str_replace('</html>','',$FileText);
	}

	# here I would use $FileText to do the rest of my needed work
Is this a reliable and viable approach? It has not been tested extensively yet, but it seems to be working so far.
guanxin
Forum Newbie
Posts: 4
Joined: Wed Jul 26, 2006 6:20 am

Post by guanxin »

Does this code do the same job?

Code: Select all

$str = "<html> 
<head><title>Hello, world!</title></head> 
<body> 
This is the text to save. 
</body> 
</html>";
$str = preg_replace("/.*?<body>(.*?)<\/body>.*/si", "\\1", $str);
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Almost. But what if the body tag has attributes? ;-)
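
To illustrate the point about attributes, here is a minimal sketch using the pattern from the post above. The input string is a made-up example (not from the thread) whose body tag carries a class attribute. Because the pattern only matches a bare `<body>`, nothing matches and `preg_replace` hands back the subject unchanged.

```php
<?php
// Pattern from the post above: matches only a bare <body> tag.
$pattern = "/.*?<body>(.*?)<\/body>.*/si";

// Hypothetical input (an assumption for illustration): body tag with an attribute.
$str = "<html><head><title>t</title></head>"
     . "<body class=\"main\">Keep me.</body></html>";

$result = preg_replace($pattern, "\\1", $str);

// No match, so preg_replace returns the subject untouched.
var_dump($result === $str); // bool(true)
```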

Post by guanxin »

Ambush Commander wrote: Almost. But what if the body tag has attributes? ;-)
?

Code: Select all

/.*?<body[^>]*?>(.*)<\/body>.*/si

Post by Ambush Commander »

Yep. Although personally speaking, I'd use preg_match.
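
A rough sketch of what the `preg_match` route might look like, under a few assumptions: `extractBody` is a hypothetical helper name (not from the thread), the input contains a single `<body>...</body>` pair, and trimming the surrounding whitespace is acceptable. Capturing the body directly avoids rewriting the whole string, and a failed match can be told apart from an empty body.

```php
<?php
// Hypothetical helper (name chosen for illustration): returns the contents
// of the <body> element, or null when no <body>...</body> pair is found.
function extractBody(string $html): ?string
{
    // <body\b[^>]*> allows attributes on the opening tag;
    // the s flag lets . span newlines, the i flag ignores case.
    if (preg_match('/<body\b[^>]*>(.*)<\/body\s*>/si', $html, $match) === 1) {
        return trim($match[1]);
    }
    return null;
}

$html = "<html>\n<head><title>Hello, world!</title></head>\n"
      . "<body class=\"main\">\nThis is the text to save.\n</body>\n</html>";

echo extractBody($html), "\n"; // This is the text to save.
```

For markup that is not well formed, a regular expression like this can still misfire; a real HTML parser such as PHP's DOMDocument is likely the safer choice there.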