search html but ignore tags

patnet2004
Forum Newbie
Posts: 14
Joined: Sat Jul 19, 2003 3:26 am
Location: Computer Desk

Post by patnet2004 »

I was wondering if there is a function that will let me search through an HTML document without affecting anything inside the HTML tags...

ex.

Code:

<?php
$string = "a link <a href=\"blah.htm\">click here</a>";
$string = str_replace("a","",$string);
echo($string);
?>
Something like str_replace, but one that won't replace the a's inside the tags of:
<a href="blah.htm">click here</a>
patnet2004

Post by patnet2004 »

What I'm trying to do is change the color of the words I find in a search without messing up the HTML code...
DuFF
Forum Contributor
Posts: 495
Joined: Tue Jun 24, 2003 7:49 pm
Location: USA

Post by DuFF »

Sounds like you need to use a regular expression. I don't know too much about them, so I can't really help you there. I'm sure someone else in the forum will be of some assistance.

I have found some good tutorials for getting started with them, though:
http://www.phpbuilder.com/columns/dario19990616.php3
http://www.amk.ca/python/howto/regex/

You will probably need to use preg_replace; here is a tutorial:
http://www.zend.com/manual/function.preg-replace.php
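
As a rough sketch of the preg_replace route: a negative lookahead can reject any match that sits inside a tag, because inside a tag the next angle bracket after the match is a ">". The function name highlight_outside_tags and the default color are made up for illustration, and the sketch assumes the page text itself contains no stray "<" or ">" characters.

```php
<?php
// Hypothetical helper (not from the thread or any library).
// Wraps every occurrence of $word that is outside an HTML tag in a colored span.
function highlight_outside_tags($html, $word, $color = 'yellow')
{
    // preg_quote() escapes regex metacharacters in the search word.
    // (?![^<>]*>) fails when the next angle bracket after the match is '>',
    // which is the case when the match is inside a tag.
    $pattern = '/' . preg_quote($word, '/') . '(?![^<>]*>)/i';

    // $0 is the matched text, so the original upper/lower case is kept.
    return preg_replace(
        $pattern,
        '<span style="background-color: ' . $color . ';">$0</span>',
        $html
    );
}

// The example string from the first post: only the leading "a" is wrapped,
// while the a's in <a href="blah.htm"> and </a> are left alone.
echo highlight_outside_tags('a link <a href="blah.htm">click here</a>', 'a');
```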

Good luck.
patnet2004

Post by patnet2004 »

I found what I was looking for, and I knew I was going to need a preg function...

Code:

<?php
    /**
	 * Copyright (C) 2003 Eric Bodden
	 * This program is free software; you can redistribute it and/or
	 * modify it under the terms of the GNU General Public License
	 * as published by the Free Software Foundation; either version 2
	 * of the License, or (at your option) any later version.
	 * This program is distributed in the hope that it will be useful,
	 * but WITHOUT ANY WARRANTY; without even the implied warranty of
	 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
	 * GNU General Public License for more details.
	 * You should have received a copy of the GNU General Public License
	 * along with this program; if not, write to the Free Software
	 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
     *
     * Highlights the search words in the string with the given color.
     * Also takes care that no replacement inside HTML tags is made.
     * @param $search_words array of words to highlight
     * @param $string String to parse
     * @param $bgcolor color or array of colors to highlight with
     *
     * Changelog:
     * V1.0 Initial Version
     * V1.1 Added capability of highlighting in multiple colors.
     *      Thanks to Richard Danby [rdanby@cinoan.com].
     *
     * Example one: applying to a file with default color:
	 *  $fd = fopen ($page, "r");
     *  $size = filesize($page);
     *  $html_text = fread ($fd, $size);
     *  $keywords = explode(" ",$query);
     *  print highlight_search($keywords,$html_text);
     *
     * Example two: applying to a string with array of colors:
	 *  $query="this test what";
	 *  $keywords = explode(" ",$query);
	 *  $colors[0]='#FF66CC';
	 *  $colors[1]='#FFFF00';
	 *  $colors[2]='#66FFFF';
	 *  $colors[3]='#99CC00';
	 *  $colors[4]='#9999FF';
	 *  $html_text='<html><body>Lets see if this little test does what I expect this to do.</body></html>';
	 *  print highlight_search($keywords,$html_text,$colors);
     */
	function highlight_search($search_words,$string,$bgcolors='yellow')
	{
		if (is_array($bgcolors)) {
			$no_colors=count($bgcolors);
		} else {
			$temp=$bgcolors;
			unset($bgcolors);
			$bgcolors[0]=$temp;
			$no_colors=1;
		}
		$word_no=0;
		foreach($search_words as $search_word)
		{
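		    // Escape regex metacharacters so the search word is matched literally.
		    $quoted_word = preg_quote($search_word, "/");
		    // Match a text run between ">" and "<" that contains the word.
		    $regex1 = ">[^<]*(";
		    $regex2 = ")[^<]*<";
		    preg_match_all("/".$regex1.$quoted_word.$regex2."/i", $string, $matches, PREG_PATTERN_ORDER);
			foreach($matches[0] as $match)
			{
			  preg_match("/$quoted_word/i", $match, $out);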
              $case_sensitive_search_word = $out[0];
			  $newtext = str_replace($case_sensitive_search_word,"<span style=\"background-color: ".$bgcolors[($word_no % $no_colors)].";\">$case_sensitive_search_word</span>", $match);
			  $string = str_replace($match, $newtext, $string);
			}
			$word_no++;
		}
		return $string;
	}

?>
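
One thing to watch with the function above: it only looks at text that sits between a ">" and a "<", so words before the first tag or after the last one are skipped. A possible variant, sketched here under my own assumptions (the name highlight_text_nodes is made up, and it takes a single color rather than an array), splits the page into tags and text with preg_split and only touches the text pieces:

```php
<?php
// Hypothetical variant, not part of the original highlight_search() code.
// Highlights the given words in the text parts of $html and leaves tags untouched.
function highlight_text_nodes(array $words, $html, $color = 'yellow')
{
    // Escape each word so characters like "." or "+" are matched literally.
    $quoted = array();
    foreach ($words as $word) {
        if ($word !== '') {
            $quoted[] = preg_quote($word, '/');
        }
    }
    if (count($quoted) === 0) {
        return $html;
    }
    $pattern = '/(' . implode('|', $quoted) . ')/i';
    $span = '<span style="background-color: ' . $color . ';">$1</span>';

    // Split on tags; PREG_SPLIT_DELIM_CAPTURE keeps the tags in the result.
    // With one capture group, even indexes are text and odd indexes are tags.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = preg_replace($pattern, $span, $part);
        }
    }
    return implode('', $parts);
}

// "test" inside the class attribute is left alone; the one in the text is wrapped.
echo highlight_text_nodes(array('test'), '<p class="test">A Test here</p>');
```

Like the original, this treats anything between "<" and ">" as a tag, so it will still get confused by a literal ">" inside an attribute value or by script blocks.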