Removing tags from HTML code with PHP

cctrax
Forum Commoner
Posts: 36
Joined: Thu Jul 11, 2002 9:05 pm
Location: United States
Contact:

Post by cctrax »

Question for all you smart people out there.

I want to write a script that I can pass a large amount of HTML through and that will remove all of the <a> tags within the file. Is that possible?
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto
Contact:

Post by John Cartwright »

You'll have to look at regex in the PHP manual, which is fairly complicated... I still haven't figured it out :)

Post by cctrax »

Problem is, all of the <a> tags are different. I want to remove the whole tag, no matter what it links to.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Regex can do that. Regex is about the only thing that can do that without a LOT of pain on your part.
foeggy
Forum Newbie
Posts: 5
Joined: Wed Aug 04, 2004 12:49 am

Post by foeggy »

I've been working on a script like this for a site I am doing, gathering information from another webpage, and here is what I've come up with. I don't know how efficient it is, but it works for the load I have. Maybe you can get some ideas from this.

This first part strips all the opening <a> tags, no matter what attributes they contain.

Code: Select all

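<?php
// eregi_replace() was removed in PHP 7, so use preg_replace() with the
// case-insensitive "i" flag; \b keeps <abbr> and similar tags from matching.
$value = preg_replace('/<a\b[^>]*>/i', '', $value);
?>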
And to remove the closing </a> tags:

Code: Select all

<?php
// str_ireplace() is case-insensitive, so </A> is removed as well as </a>.
$value = str_ireplace("</a>", "", $value);
?>
You can do that for any of the HTML tags; just swap in the tag name you need.
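
As a rough sketch of that idea, the helper below takes the tag name as a parameter and removes both the opening and closing forms in one pass. The name remove_tag() is made up for illustration, and the sketch assumes preg_replace() (PCRE) is available. It will not cope with a literal ">" inside an attribute value.

```php
<?php
// Hypothetical helper (not from the original post): removes the opening and
// closing forms of one tag, leaving the text between them in place.
function remove_tag($html, $tag)
{
    // preg_quote() escapes the tag name so it is matched literally.
    $name = preg_quote($tag, '/');
    // <\/? matches "<" or "</"; \b stops "a" from also matching "abbr";
    // [^>]* consumes any attributes; the i flag ignores case.
    return preg_replace('/<\/?' . $name . '\b[^>]*>/i', '', $html);
}

$sample = 'See <a href="http://example.com/">this <b>page</b></a> and <abbr>etc</abbr>.';
echo remove_tag($sample, 'a'); // See this <b>page</b> and <abbr>etc</abbr>.
?>
```

Passing 'b' instead of 'a' would strip the <b> tags and leave the link alone.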

$value is a string holding everything that was gathered from the webpage I grabbed. These functions search out the <a blah blah> tags and replace each one with "", which in effect deletes it. The text between the tags is left in place.
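
To make that concrete, here is a minimal worked example of the two steps on a made-up input string. The sample HTML is invented for illustration. Using preg_replace() and str_ireplace() is an assumption that suits current PHP versions.

```php
<?php
// Sample input invented for illustration; any fetched HTML string would do.
$value = '<p>Visit <A HREF="http://example.com/">our site</A> today.</p>';

// Step 1: drop opening <a ...> tags, whatever attributes they carry.
$value = preg_replace('/<a\b[^>]*>/i', '', $value);
// Step 2: drop closing </a> tags, ignoring case.
$value = str_ireplace('</a>', '', $value);

echo $value; // <p>Visit our site today.</p>
?>
```

The link text "our site" survives; only the markup around it is gone.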

Hope I could help, good luck!

geoff