Replace text between HTML tags
Posted: Wed Mar 03, 2010 7:26 am
by roice
Hello everyone,
My name is Roi and I need your help.
I hope you will be able to help me.
I wrote a script that takes a variable containing HTML code and replaces the exact word "php" (regardless of capitalization) with the word "asp".
For example, if the variable contains:
Code: Select all
<a href='php.com'>myphp</a> best <b>php</b> website <h1>PhP!!!</h1> php and myphp or phpme - <u>php!</u>!
the result will be:
Code: Select all
<a href='asp.com'>myphp</a> best <b>asp</b> website <h1>asp!!!</h1> asp and myphp or phpme - <u>asp!</u>!
Well, the problem is that it also replaces the letters inside the HTML tags, and because of that the links get changed...
Here is my code:
Code: Select all
function keepcase($word, $replace) {
$replace[0] = (ctype_upper($word[0]) ? strtoupper($replace[0]) : $replace[0]);
return $replace;
}
$text = strtolower(file_get_contents($folder.$file));
$replace = "asp";
$word = "php";
$output = preg_replace('/\b' . preg_quote($word) . '\b/ei', "keepcase('\\0', '$replace')", $text);
echo $output;
What should I change if I want the replacements to be only on the text between the HTML tags?
Thank you in advance,
Roi.
Re: Replace text between HTML tags
Posted: Wed Mar 03, 2010 1:22 pm
by ridgerunner
Unfortunately this cannot be (easily) solved with a single regex. You need to first divide the content into HTMLTAG and non-HTMLTAG chunks, and then apply your replace to only the non-HTMLTAG chunks. I would use the handy preg_replace_callback function like so:
Code: Select all
<?php
function keepcase($word, $replace) {
    $replace[0] = (ctype_upper($word[0]) ? strtoupper($replace[0]) : $replace[0]);
    return $replace;
}
// regex - match the contents grouping into HTMLTAG and non-HTMLTAG chunks
$re = '%(</?\w++[^<>]*+>) # grab HTML open or close TAG into group 1
| # or...
([^<]*+(?:(?!</?\w++[^<>]*+>)<[^<]*+)*+) # grab non-HTMLTAG text into group 2
%x';
$contents = "<a href='php.com'>myphp</a> best <b>php</b> website <h1>PhP!!!" .
"</h1> php and myphp or phpme - <u>php!</u>!";
// walk through the content, chunk by chunk, replacing words in non-HTMLTAG chunks only
$contents = preg_replace_callback($re, 'callback_func', $contents);
function callback_func($matches) { // here's the callback function
    if ($matches[1]) { // Case 1: this is a HTMLTAG
        return $matches[1]; // return HTMLTAG unmodified
    }
    elseif (isset($matches[2])) { // Case 2: a non-HTMLTAG chunk.
        $replace = "asp"; // declare these here
        $word = "php"; // or use as global vars?
        return preg_replace('/\b' . preg_quote($word) . '\b/ei', "keepcase('\\0', '$replace')",
            $matches[2]);
    }
    exit("Error!"); // never get here
}
echo ($contents);
?>
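Note for newer PHP versions: the /e modifier used above was deprecated in PHP 5.5 and removed in PHP 7.0, so the inner preg_replace call no longer runs there. Below is a rough sketch of the same chunking idea using closures instead. The helper name replace_outside_tags() is made up for illustration, and the regex is the same one as above written without the x-mode comments. Treat it as one possible adaptation, not a drop-in replacement.

```php
<?php
// Sketch for PHP 7+: same tag / non-tag chunking, closures instead of /e.

// Copy the case of the first letter of $word onto $replace.
function keepcase(string $word, string $replace): string {
    if ($word !== '' && $replace !== '' && ctype_upper($word[0])) {
        $replace[0] = strtoupper($replace[0]);
    }
    return $replace;
}

// replace_outside_tags() is a hypothetical helper name, not part of the code above.
function replace_outside_tags(string $html, string $word, string $replace): string {
    // Group 1: an HTML open or close tag. Group 2: a run of non-tag text.
    $re = '%(</?\w++[^<>]*+>)|([^<]*+(?:(?!</?\w++[^<>]*+>)<[^<]*+)*+)%';
    $wordRe = '/\b' . preg_quote($word, '/') . '\b/i';

    return preg_replace_callback($re, function (array $m) use ($wordRe, $replace) {
        if ($m[1] !== '') {
            return $m[1]; // tag: return unmodified
        }
        // text chunk: replace whole-word matches, keeping the first letter's case
        return preg_replace_callback($wordRe, function (array $w) use ($replace) {
            return keepcase($w[0], $replace);
        }, $m[0]);
    }, $html);
}

$contents = "<a href='php.com'>myphp</a> best <b>php</b> website <h1>PhP!!!" .
            "</h1> php and myphp or phpme - <u>php!</u>!";
echo replace_outside_tags($contents, 'php', 'asp'), "\n";
```

With this sketch the href stays 'php.com', "myphp" and "phpme" are left alone, and "PhP!!!" becomes "Asp!!!" because keepcase() only looks at the first letter.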
Hope this helps!

Re: Replace text between HTML tags
Posted: Thu Mar 04, 2010 2:08 am
by roice
hi ridgerunner,
THANK YOU!!!
I have been working on it for the last 3 days... It seems to be working fine!
Can you please send me your messenger email (here or in PM)?
Re: Replace text between HTML tags
Posted: Thu Mar 04, 2010 2:18 am
by roice
By the way - here is some code that someone else wrote for me. It does the same thing as your code:
Code: Select all
$string = "<a href='http://www.phP.com'>pHp</a> best <b>PHProtem phP MYPHP</b> website <h1>ever!!!</h1> php and myphp or phpme - <u>php!</u>!";
function mamak($matches){
return $matches[1].'asp'.$matches[2];
}
echo preg_replace_callback('/("|>|<|\s)php("|>|<|\s|\!)/i',"mamak",$string);
Much shorter than yours

(but who cares...)
Can you give me your opinion about it?
Re: Replace text between HTML tags
Posted: Thu Mar 04, 2010 6:54 am
by roice
ridgerunner - I changed your code a little so that it can receive a list of keywords from the DB, which your script then needs to find and replace.
The list is in a table called "keywords".
Every keyword has a few synonyms in a different table called "synonyms".
So I changed your script to search for all the keywords and, when a keyword is found, replace it with a random synonym. I think I am missing something, because it's not working well.
Here is the code:
Code: Select all
$content = file_get_contents($folder.$file);
// regex - match the contents grouping into HTMLTAG and non-HTMLTAG chunks
$re = '%(</?\w++[^<>]*+>) # grab HTML open or close TAG into group 1
| # or...
([^<]*+(?:(?!</?\w++[^<>]*+>)<[^<]*+)*+) # grab non-HTMLTAG text into group 2
%x';
$query = mysql_query("SELECT * FROM `keywords` WHERE team_id = $team_id ");
while($index = mysql_fetch_array($query)) {
$keyword = $index['name'];
$key_id = $index['id'];
// Isolate the keyword from marks - !, ?, .
function keepcase($keyword, $replace) {
$replace[0] = (ctype_upper($keyword[0]) ? strtoupper($replace[0]) : $replace[0]);
return $replace;
}
function callback_func($matches) { // here's the callback function
if ($matches[1]) { // Case 1: this is a HTMLTAG
return $matches[1]; // return HTMLTAG unmodified
}
elseif (isset($matches[2])) { // Case 2: a non-HTMLTAG chunk.
global $key_id;
$query2 = mysql_query("SELECT name FROM `synonyms` WHERE key_id = $key_id ORDER BY RAND() LIMIT 1");
$index = mysql_fetch_array($query2);
$replace = $index['name'];
return preg_replace('/\b' . preg_quote($keyword) . '\b/ei', "keepcase('\\0', '$replace')", $matches[2]);
}
exit("Error!"); // never get here
}
// walk through the content, chunk by chunk, replacing keywords in non-HTMLTAG chunks only
$content = preg_replace_callback($re, 'callback_func', $content);
echo ($content);
} //End of while
Can you please tell me what I did wrong?
Re: Replace text between HTML tags
Posted: Thu Mar 04, 2010 9:40 am
by ridgerunner
Glad it helped.
But beware - the crafting of regex solutions can become addicting!

Re: Replace text between HTML tags
Posted: Thu Mar 04, 2010 4:10 pm
by ridgerunner
roice wrote: By the way - here is some code that someone else wrote for me. It does the same thing as your code:
Code: Select all
$string = "<a href='http://www.phP.com'>pHp</a> best <b>PHProtem phP MYPHP</b> website <h1>ever!!!</h1> php and myphp or phpme - <u>php!</u>!";
function mamak($matches){
return $matches[1].'asp'.$matches[2];
}
echo preg_replace_callback('/("|>|<|\s)php("|>|<|\s|\!)/i',"mamak",$string);
Much shorter than yours

(but who cares...)
Can you give me your opinion about it?
Ok. A couple points...
First, the regex is inefficient - it uses this expression: '("|>|<|\s)', which would be much better specified as a character class like so: '(["><\s])'. Also, there is no need to use a callback function. The above code can be simplified as follows:
Code: Select all
$string = "<a href='http://www.phP.com'>pHp</a> best <b>PHProtem phP MYPHP</b> website <h1>ever!!!</h1> php and myphp or phpme - <u>php!</u>!";
echo preg_replace('/(["><\s])php(["><\s!])/i', '$1asp$2', $string);
Second, this regex works for the specific example string you provided, but does NOT work in the general case. For example, it fails to match any of the 'PHP' words in the following string:
Code: Select all
$string = "This one should match: php, but doesn't. Same with this: PHP. and what about php? or {php} or
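To make the second point concrete, here is a small check along those lines. The test sentence is made up for illustration (it only mirrors the kind of punctuation mentioned above), and $delimited and $bounded are names chosen just for this sketch. It counts how many times each pattern matches.

```php
<?php
// Hypothetical test sentence: every "php" is followed or preceded by
// punctuation that the short regex does not list as a delimiter.
$string = "This one should match: php, but doesn't. Same with this: PHP. " .
          "And what about php? or {php}";

$delimited = '/(["><\s])php(["><\s!])/i'; // the short regex, cleaned up
$bounded   = '/\bphp\b/i';                 // word-boundary version

echo preg_match_all($delimited, $string), "\n"; // number of matches found
echo preg_match_all($bounded, $string), "\n";   // number of matches found
```

The delimiter-based pattern finds none of the four words, while the \b version finds all of them. Note that \b on its own would still match inside tags (for example in an href), so it only makes sense combined with the tag / non-tag chunking shown earlier in the thread.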
Re: Replace text between HTML tags
Posted: Thu Mar 04, 2010 4:44 pm
by ridgerunner
roice wrote: ... Can you please tell me what I did wrong?
I'm sorry but there are too many problems with your code for me to help you! But here is the major one...
- One DB query loop is nested inside another DB query loop. Can't do this!
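A rough sketch of one way to untangle this, under some loud assumptions: plain arrays stand in for the `keywords` and `synonyms` tables, and the function name replace_keywords() and the sample data are invented for illustration. The idea is to load the keywords and their synonyms first, define the functions once outside any loop (a function declared inside a while loop is declared again on the second pass, which PHP rejects), and then make a single pass over the content.

```php
<?php
// Sketch only: arrays stand in for the DB tables; the data is made up.

// Defined once, outside any loop.
function keepcase(string $word, string $replace): string {
    if ($word !== '' && $replace !== '' && ctype_upper($word[0])) {
        $replace[0] = strtoupper($replace[0]);
    }
    return $replace;
}

// $synonyms maps each keyword to a list of possible replacements.
// replace_keywords() is a hypothetical name used only in this sketch.
function replace_keywords(string $html, array $synonyms): string {
    // Group 1: an HTML tag. Group 2: a run of non-tag text.
    $re = '%(</?\w++[^<>]*+>)|([^<]*+(?:(?!</?\w++[^<>]*+>)<[^<]*+)*+)%';

    return preg_replace_callback($re, function (array $m) use ($synonyms) {
        if ($m[1] !== '') {
            return $m[1]; // HTML tag: leave untouched
        }
        $text = $m[0];
        foreach ($synonyms as $keyword => $choices) {
            if (!$choices) {
                continue; // keyword without synonyms: nothing to do
            }
            $wordRe = '/\b' . preg_quote((string) $keyword, '/') . '\b/i';
            $text = preg_replace_callback($wordRe, function (array $w) use ($choices) {
                // pick a random synonym for every occurrence
                return keepcase($w[0], $choices[array_rand($choices)]);
            }, $text);
        }
        return $text;
    }, $html);
}

// Step 1: load everything up front (this is where the DB queries would run).
$synonyms = [
    'php'     => ['asp'],
    'website' => ['site', 'portal'],
];

// Step 2: one pass over the content.
echo replace_keywords("<a href='php.com'>Php</a> is a <b>website</b> language", $synonyms), "\n";
```

One thing to watch with this approach: keywords are applied one after another, so a synonym inserted for one keyword could itself be matched by a later keyword.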
Re: Replace text between HTML tags
Posted: Sun Mar 07, 2010 1:49 am
by roice
First of all - thanks for your reply!
Second - do the changes you made to the other guy's code fix the problem it has?
Third - may I please have your messenger email? I want to ask you something and it's a little hard in here...
Re: Replace text between HTML tags
Posted: Sun Mar 07, 2010 10:16 am
by ridgerunner
roice wrote: ... Second - do the changes you made to the other guy's code fix the problem it has?
I made no functional changes to the code the other guy wrote. I merely cleaned it up a bit. Both his version and my cleaned-up version suffer from the same problem I explained in my second point above.
roice wrote: ... Third - may I please have your messenger email? I want to ask you something and it's a little hard in here...
I sent you a private message. Feel free to do the same!
