Regex to extract the text and highlight keyword...HELP!!!

dibyendrah
Forum Contributor
Posts: 491
Joined: Wed Oct 19, 2005 5:14 am
Location: Nepal


Post by dibyendrah »

Dear All,
I'm trying to extract a limited number of words, say 10, before and after the search keyword.

I tried it the following way but didn't get the desired result.

Code: Select all

<?php
$str ="PHP This reference covers PHP 4.3’s Perl-style regular expression support contained within the preg routines. PHP4 also provides POSIX-style regular expressions, but these do not offer additional benefit in power or speed. The preg routines use a Traditional NFA match engine. For an explanation of the rules behind an NFA engine, see “Introduction to Regexes and Pattern Matching.”";

$search = "PHP";
$reg_exp = "/[\w{,10}\.\s]*$search(\s)*[\w{,10}\'\"\.\s]*/im";
preg_match_all($reg_exp, $str, $match);
print_r($match);

?>
The above code outputs the following result:

Code: Select all

Array
(
    [0] => Array
        (
            [0] => PHP This reference covers PHP 4.3
            [1] => style regular expression support contained within the preg routines. PHP4 also provides POSIX
        )

    [1] => Array
        (
            [0] => 
            [1] => 
        )

)
I want to match commas, dashes, and other characters as well.

The pattern I want should extract the 10 words before and after the search keyword, with the keyword highlighted.

Any help will be appreciated.

Thank you all.

Dibyendra
Last edited by dibyendrah on Thu Nov 10, 2005 11:32 pm, edited 2 times in total.
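
A minimal sketch of one way to approach the request above, assuming PHP 7 or later. The function name extract_context and its parameters are hypothetical, not taken from the post. It treats a "word" as any run of non-whitespace characters, so commas, dashes, and quotes attached to a word are kept, and it uses preg_quote so the keyword is matched literally.

```php
<?php
// Hypothetical helper: returns every excerpt holding up to $words words
// before and after each occurrence of $keyword (case-insensitive).
function extract_context(string $text, string $keyword, int $words = 10): array
{
    // Escape regex metacharacters (and the "/" delimiter) in the keyword.
    $kw = preg_quote($keyword, '/');

    // (?:\S+\s+){0,N}  up to N words before the keyword
    // \S*KEYWORD\S*    the keyword plus anything glued to it (e.g. "PHP4")
    // (?:\s+\S+){0,N}  up to N words after the keyword
    $pattern = '/(?:\S+\s+){0,' . $words . '}\S*' . $kw
             . '\S*(?:\s+\S+){0,' . $words . '}/i';

    preg_match_all($pattern, $text, $matches);

    return $matches[0];
}

print_r(extract_context('one two three four five six seven', 'four', 2));
```

With the sample call, the single excerpt is "two three four five six". Matches do not overlap, so two occurrences that sit close together end up in the same excerpt rather than in two separate ones.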

Got some ideas after working for a few hours

Post by dibyendrah »

I have modified the code a little more; it now matches the string and replaces the search string.

Code: Select all

<?php

$str ="PHP This reference covers PHP 4.3’s Perl-style regular expression support contained within the preg routines. PHP4 also provides POSIX-style regular expressions, but these do not offer additional benefit in power or speed. The preg routines use a Traditional NFA match engine. For an explanation of the rules behind an NFA engine, see \"Introduction to Regexes and Pattern Matching.\"";

$search = "PHP";
$reg_exp = "/(\b[\w\W]{0,20}(\s)*)($search)((\s)*[\w\W]{0,20}\b)/i";

$replace_reg_exp = "\${1} <b>$search</b> ${3}";

//preg_match($reg_exp, $str, $matches);
//print_r($matches);

$str = preg_replace($reg_exp, $replace_reg_exp, $str);

print($str);
?>
But the output is as follows:

PHP covers PHP regular expression support contained within the preg routines. PHP POSIX-style regular expressions, but these do not offer additional benefit in power or speed. The preg routines use a Traditional NFA match engine. For an explanation of the rules behind an NFA engine, see "Introduction to Regexes and Pattern Matching."

After replacing, the output has lost some words before and after the search string.
Any ideas?

Thank you
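
A possible explanation, offered as a hedged reading of the code above rather than a confirmed diagnosis: in that pattern the text after the keyword is captured by group 4 (group 3 is the keyword itself), and the unescaped ${3} sits inside a double-quoted PHP string, where PHP may try to interpolate it before preg_replace ever sees it. Either way the trailing context is not written back. The sketch below sidesteps capture groups entirely by wrapping only the matched keyword and leaving the surrounding text untouched. The name highlight_keyword is hypothetical, and PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper: wraps each occurrence of $keyword in <b> tags.
function highlight_keyword(string $text, string $keyword): string
{
    // Match the keyword literally, ignoring case.
    $pattern = '/' . preg_quote($keyword, '/') . '/i';

    // Single quotes keep PHP from interpolating "$0"; preg_replace
    // substitutes it with the matched text, preserving its original case.
    return preg_replace($pattern, '<b>$0</b>', $text);
}

print(highlight_keyword('PHP and php4', 'php'));
```

The sample call prints "<b>PHP</b> and <b>php</b>4". Because nothing outside the keyword is matched, no neighbouring words can be dropped by the replacement.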

Made a small change

Post by dibyendrah »

Code: Select all

<?php

$str ="PHP This reference covers PHP 4.3’s Perl-style regular expression support contained within the preg routines. PHP4 also provides POSIX-style regular expressions, but these do not offer additional benefit in power or speed. The preg routines use a Traditional NFA match engine. For an explanation of the rules behind an NFA engine, see \"Introduction to Regexes and Pattern Matching.\"";

$search = "reference";

$reg_exp = "/(\b[\w\W]{0,50}(\s)*)($search)((\s)*[\w\W]{0,50}\b)/i";

//$replace_reg_exp = "\${1} <b>$search</b> ${3}";

preg_match_all($reg_exp, $str, $matches);

$match_arr = $matches[0];

// Start from an empty string so the .= below does not hit an undefined variable.
$str_matches = "";

for($i=0; $i<count($match_arr); $i++){
    //$str = preg_replace($reg_exp, $replace_reg_exp, $str);
    $str_matches .= "...".str_replace($search, "<b>".$search."</b>", $match_arr[$i])."...";
}    

print($str_matches);
?>

This makes the search keyword bold and displays at most 50 characters before and after it.

Would it be possible to display 20 words before and after the search keyword and bold the search keyword?

Thank you all!!!

Dibyendra
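
One way to count words instead of characters, sketched under the assumption of PHP 7 or later: split the text on whitespace, find the tokens that contain the keyword, and slice a window of tokens around each hit. The name snippet_by_words and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: builds "...context <b>keyword</b> context..."
// snippets with up to $words words on each side of every hit.
function snippet_by_words(string $text, string $keyword, int $words = 20): string
{
    // Tokenise on runs of whitespace.
    $tokens = preg_split('/\s+/', trim($text));
    $pattern = '/' . preg_quote($keyword, '/') . '/i';
    $snippets = [];

    foreach ($tokens as $i => $token) {
        // Skip tokens that do not contain the keyword (case-insensitive).
        if (stripos($token, $keyword) === false) {
            continue;
        }
        // Window of tokens: $words before the hit, the hit, $words after.
        $start = max(0, $i - $words);
        $slice = array_slice($tokens, $start, $i - $start + $words + 1);

        // Bold every occurrence of the keyword inside the window.
        $snippets[] = '...' . preg_replace($pattern, '<b>$0</b>', implode(' ', $slice)) . '...';
    }

    return implode(' ', $snippets);
}

print(snippet_by_words('a b c wood d e f', 'wood', 2));
```

The sample call prints "...b c <b>wood</b> d e...". Each hit produces its own snippet, so hits that are close together yield overlapping snippets; merging them is left out of this sketch.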

Highlighting keywords modified

Post by dibyendrah »

Dear all,
I have modified the code for highlighting the keywords. If the user inputs "wood fire", it splits the input into keywords and highlights both of them. The only thing I was unable to do is extract the 10 words before and after each keyword. For example, with the keywords "wood fire", it should highlight "wood" and display the 10 words before and after it, and likewise highlight "fire" and display the 10 words before and after it. Please help me add this feature. The code I have so far is as follows:

Code: Select all

<?php

// Sample text (heredoc, so the inner quotes need no escaping).
$str = <<<EOT
PHP This reference covers PHP 4.3’s Perl-style regular expression support contained within the preg routines.

PHP4 also provides POSIX-style regular expressions, but these do not offer additional benefit in power or speed.

The preg routines use a Traditional NFA match engine. For an explanation of the rules behind an NFA engine, see "Introduction to Regexes and Pattern Matching."
EOT;

$search = "Regexes";

// Strip quotes and turn commas and dashes into spaces before splitting.
$exclude_array = array("\"", "'", ",", "-");
$replace_array = array("", "", " ", " ");
$search = str_replace($exclude_array, $replace_array, $search);

// Split into keywords, ignoring repeated spaces.
$keywords = preg_split('/\s+/', $search, -1, PREG_SPLIT_NO_EMPTY);

// Join the keywords as alternatives (wood|fire). Square brackets would
// build a character class that matches single characters instead.
$quoted = array();
foreach ($keywords as $value) {
    $quoted[] = preg_quote($value, "/");
}
$expr = implode("|", $quoted);

$color = array(
    "#999999", // dark grey
    "#DDDDDD", // light grey
    "#C0C0C0",
    "#969696",
    "#808080",
    "#646464",
    "#4B4B4B",
    "#242424",
    "#FF66CC"
);

// Up to 50 characters before and 10 characters after any keyword.
$reg_exp = "/(\b[\w\W]{0,50}(\s)*)($expr)((\s)*[\w\W]{0,10}\b)/i";

preg_match_all($reg_exp, $str, $matches);
//print_r($matches);

// Highlight every keyword inside each matched excerpt.
$str_matches = array();
foreach ($matches[0] as $tmp) {
    foreach ($keywords as $value) {
        $tmp = str_ireplace($value, "<b><font color=\"".$color[8]."\">".$value."</font></b>", $tmp);
    }
    $str_matches[] = $tmp;
}

// Separate the excerpts with a single ellipsis.
$display = "... " . implode(" ... ", $str_matches) . " ...";
print $display;

?>
Thank you,

Dibyendra

PHP Rules!!!
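
For the multi-keyword case, one option is to turn the search string into a regex alternation and highlight every alternative in a single pass. This is a sketch under assumptions: the names keywords_to_alternation and highlight_keywords are hypothetical, and PHP 7 or later is assumed. Note that square brackets in a pattern form a character class (any one of the listed characters), so whole-word alternatives need a group such as (?:wood|fire).

```php
<?php
// Hypothetical helper: "wood, fire" becomes "(?:wood|fire)".
function keywords_to_alternation(string $search): string
{
    // Drop quotes, then split on whitespace, commas, and dashes.
    $clean = str_replace(['"', "'"], '', $search);
    $keywords = preg_split('/[\s,-]+/', $clean, -1, PREG_SPLIT_NO_EMPTY);

    // Escape each keyword so it is matched literally.
    $quoted = array_map(function ($k) {
        return preg_quote($k, '/');
    }, $keywords);

    return '(?:' . implode('|', $quoted) . ')';
}

// Hypothetical helper: bolds every keyword found in $text.
function highlight_keywords(string $text, string $search): string
{
    $pattern = '/' . keywords_to_alternation($search) . '/i';

    // "$0" is the matched keyword, so its original capitalisation is kept.
    return preg_replace($pattern, '<b>$0</b>', $text);
}

print(highlight_keywords('Wood burns in a fire', 'wood fire'));
```

The sample call prints "<b>Wood</b> burns in a <b>fire</b>". An empty search string would produce an empty group that matches everywhere, so a real page would want to check for that before building the pattern.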

Any solution?

Post by dibyendrah »

Dear all,
Any solutions to my problem of extracting the words before and after the search keyword?

Hoping to get a solution soon.

see ya,
Dibyendra