Search result highlighting with excerpts

JayBird
Admin
Posts: 4524
Joined: Wed Aug 13, 2003 7:02 am
Location: York, UK

Search result highlighting with excerpts

Post by JayBird »

Say I have the following string...

Code: Select all

The Tarmac name is one of the most recognised brands in the UK, but few outside our industry realise the breadth of our activities.
 
From our beginnings in the last century, Tarmac has grown to an international operation, providing a wide range of building materials and construction solutions. We are now a market leader, employing 12,500 people in over 500 locations worldwide, with a turnover of £2.1 billion.
 
Of course, we are famous for laying tarmac! Which we do. In a way. But we actually build motorways - from scratch. We quarry on a grand scale. We manufacture and process a wide range of materials for use in all aspects of the construction industry.
A user can search for any term and have that term highlighted. That is easy.

What if the user searched for "Tarmac" and I wanted the results displayed like this, a bit like Google...

Search results
The Tarmac name is one of the most recognised....
...in the last century, Tarmac has grown to an international operation, providing...
...we are famous for laying tarmac! Which we do. In a way....
jayshields
DevNet Resident
Posts: 1912
Joined: Mon Aug 22, 2005 12:11 pm
Location: Leeds/Manchester, England

Re: Search result highlighting with excerpts

Post by jayshields »

What have you tried? Doesn't seem much more difficult than highlighting the terms to me, although it might be one of those problems that becomes harder once you jump into it and start programming. If you're really stuck, I'll have a shot at it.

Re: Search result highlighting with excerpts

Post by JayBird »

Yes, it is exactly one of those things that sounded okay, but got really complicated.

This is what I have currently. It only returns one match.

Code: Select all

 
function callback($buffer, $search) {

    // remove any HTML from the content
    $string = strip_tags($buffer);

    // get the index of the search string
    $search_index = strpos($string, $search);

    // define our start point (never before the start of the string) and excerpt length
    $start = max(0, $search_index - 20);
    $end = strlen($search) + 40;

    // highlight the search term and return a brief page summary
    // (preg_quote() escapes regex characters, including the | delimiter)
    return preg_replace('|('.preg_quote($search, '|').')|i', '<strong>\\1</strong>', substr($string, $start, $end));
}
 
echo callback($string, "Tarmac");
 
Try running the string in the first post through that function
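
The single result comes from `strpos()` being called only once. As a rough sketch of how every occurrence could be located first, the search can be repeated with the offset argument of `stripos()`. The name `find_all_positions` is made up for illustration, and the offsets are byte offsets, so multi-byte text would need the `mb_*` functions instead.

```php
<?php
// Hypothetical helper (not part of the original snippet): return the byte
// offset of every case-insensitive occurrence of $needle in $haystack.
function find_all_positions($haystack, $needle)
{
    $positions = array();

    // An empty needle has no meaningful position, so bail out early.
    if ($needle === '') {
        return $positions;
    }

    $offset = 0;
    while (($pos = stripos($haystack, $needle, $offset)) !== false) {
        $positions[] = $pos;
        // Resume the search right after the match that was just found.
        $offset = $pos + strlen($needle);
    }

    return $positions;
}
```

Each returned offset could then be fed into the same start/length calculation used in the function above.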
EverLearning
Forum Contributor
Posts: 282
Joined: Sat Feb 23, 2008 3:49 am
Location: Niš, Serbia

Re: Search result highlighting with excerpts

Post by EverLearning »

Try this revised code

Code: Select all

 
function callback($buffer, $search) {

    // remove any HTML from the content
    $string = strip_tags($buffer);

    // start with an empty list so a search without matches still returns an array
    $results = array();

    while(($search_index = stripos($string, $search)) !== false) {

        // define our start point and excerpt length
        $start = ($search_index - 20) >= 0 ? ($search_index - 20) : 0;
        $end = strlen($search) + 40;

        // highlight the search term and store a brief page summary
        $results[] = preg_replace('|('.preg_quote($search, '|').')|i', '<strong>\\1</strong>', substr($string, $start, $end));

        // continue the search further along the string
        $string = substr($string, $search_index + $end);
    }

    return $results;
}
 
$string = "The Tarmac name is one of the most recognised brands in the UK, but few outside our industry realise the breadth of our activities.
 
From our beginnings in the last century, Tarmac has grown to an international operation, providing a wide range of building materials and construction solutions. We are now a market leader, employing 12,500 people in over 500 locations worldwide, with a turnover of £2.1 billion.
 
Of course, we are famous for laying tarmac! Which we do. In a way. But we actually build motorways - from scratch. We quarry on a grand scale. We manufacture and process a wide range of materials for use in all aspects of the construction industry.";
 
var_dump(callback($string, "Tarmac"));
 
and the result is

Code: Select all

array
  0 => string 'The <strong>Tarmac</strong> name is one of the most recognised ' (length=63)
  1 => string 'n the last century, <strong>Tarmac</strong> has grown to an int' (length=63)
  2 => string 'e famous for laying <strong>tarmac</strong>! Which we do. In a ' (length=63)
 
It needs some work on recognizing word boundaries, but it's a start. This whole thing could probably be done using some fancy regex, and I would like to see that :wink:
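
For what it's worth, a regex version might look roughly like the following. This is only a sketch under some assumptions: the function name `excerpt_matches` and the `$radius` parameter are invented for illustration, the text is treated as single-byte (the `£` in the sample would want the `u` modifier and UTF-8 input), and a term buried inside a very long unbroken token will not match because no word start falls within the radius.

```php
<?php
// Hypothetical helper (illustrative names): return one highlighted excerpt
// per match, with up to $radius characters of context on each side,
// cut at whitespace so that no word is split.
function excerpt_matches($buffer, $search, $radius = 20)
{
    if ($search === '') {
        return array();
    }

    // Strip markup and collapse runs of whitespace (including newlines).
    $text = trim(preg_replace('/\s+/', ' ', strip_tags($buffer)));

    $term = preg_quote($search, '/');
    $radius = (int) $radius;

    // (?<!\S)  begin at the start of a word (or of the text)
    // .{0,N}   up to N characters of leading context
    // term     the search term, matched case-insensitively
    // .{0,N}   up to N characters of trailing context
    // (?!\S)   end at the end of a word (or of the text)
    $pattern = '/(?<!\S).{0,' . $radius . '}' . $term
             . '.{0,' . $radius . '}(?!\S)/i';

    preg_match_all($pattern, $text, $matches);

    $results = array();
    foreach ($matches[0] as $excerpt) {
        $results[] = preg_replace('/(' . $term . ')/i', '<strong>$1</strong>', $excerpt);
    }

    return $results;
}
```

The lookarounds do the word-boundary work here, so no separate trimming step on spaces is needed. The excerpts are not HTML-escaped, which would matter if the source text can contain `<` or `&`.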

Re: Search result highlighting with excerpts

Post by JayBird »

Holy crap EverLearning, that looks the shiznit!!

I will use that in my application for now unless someone else can come up with anything fancier!

EDIT: Is your snippet PHP5 only? I don't think PHP4 has stripos().

Re: Search result highlighting with excerpts

Post by EverLearning »

Yes, I made it using PHP5, but you can use this (taken from the PHP manual):

Code: Select all

if (!function_exists("stripos")) {
  function stripos($str,$needle,$offset=0)
  {
      return strpos(strtolower($str),strtolower($needle),$offset);
  }
}

Re: Search result highlighting with excerpts

Post by JayBird »

Thanks, I will give it a go on Monday when I'm back at work, or over the weekend if I'm feeling fruity 8)

Re: Search result highlighting with excerpts

Post by EverLearning »

This version additionally splits the found strings on the space character, so you don't get words cut in the middle:

Code: Select all

function callback($buffer, $search) {

    // remove any HTML from the content
    $string = strip_tags($buffer);

    // start with an empty list so a search without matches still returns an array
    $results = array();

    while(($search_index = stripos($string, $search)) !== false) {

        // define our start point and excerpt length
        $start = ($search_index - 20) >= 0 ? ($search_index - 20) : 0;
        $end = strlen($search) + 40;

        $found = substr($string, $start, $end);
        // cut at the first and last space so words are not split
        $found = substr($found, strpos($found, ' '), strrpos($found, ' '));

        // highlight the search term and store a brief page summary
        $results[] = preg_replace('|('.preg_quote($search, '|').')|i', '<strong>\\1</strong>', $found);

        // continue the search further along the string
        $string = substr($string, $search_index + $end);
    }

    return $results;
}
Result

Code: Select all

array
  0 => string ' <strong>Tarmac</strong> name is one of the most recognised ' (length=60)
  1 => string ' the last century, <strong>Tarmac</strong> has grown to an ' (length=59)
  2 => string ' famous for laying <strong>tarmac</strong>! Which we do. In a ' (length=62)
 

Re: Search result highlighting with excerpts

Post by JayBird »

Ooooh, I like that option! Works great.

One little issue.

Change the input string to this:

Code: Select all

The Tarmac name is one of the most recognised brands in the UK, but few outside our industry realise the breadth of our activities.
 
From our beginnings in the last century, Tarmac has grown to an international operation, providing a wide range of building materials and construction solutions. We are now a market leader, employing 12,500 people in over 500 locations worldwide, with a turnover of £2.1 billion.
 
Of course, we are famous for laying tarmac! Which we do. In a way. But we actually build motorways - from scratch. We quarry on a grand scale. We manufacture and process a wide range of materials for use in all aspects of the construction industry.
 
Tarmac is in search of managers and leaders of the future. If you are ambitious, resourceful and have a positive approach to what you do, find out more and apply at: http://www.tarmac.co.uk/gradlife
It seems to do something funky with the URL at the end.

Any ideas?

Re: Search result highlighting with excerpts

Post by EverLearning »

It was funky at the end because the last found string section had its first space at position 4 and its last space at position 8 (I used spaces to find word boundaries), so all you got was

Code: Select all

at: ht
or something like it. I fixed it so that if the last space is before the position of $search (which caused this bug), it will just use the whole string section:

Code: Select all

function callback($buffer, $search) {

    // remove any HTML from the content
    $string = strip_tags($buffer);

    // start with an empty list so a search without matches still returns an array
    $results = array();

    while(($search_index = stripos($string, $search)) !== false) {

        // define our start point and excerpt length
        $start = ($search_index - 20) >= 0 ? ($search_index - 20) : 0;
        $end = strlen($search) + 40;

        $found = substr($string, $start, $end);

        $begin = strpos($found, ' ');
        $finish = strrpos($found, ' ');
        // if the last space comes before the search term, keep the whole section
        $finish = (stripos($found, $search) > $finish) ? strlen($found) : $finish;

        $found = substr($found, $begin, $finish);

        // highlight the search term and store a brief page summary
        $results[] = preg_replace('|('.preg_quote($search, '|').')|i', '<strong>\\1</strong>', $found);

        // continue the search further along the string
        $string = substr($string, $search_index + $end);
    }

    return $results;
}
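
One detail worth keeping in mind with the trimming step: the third argument of `substr()` is a length, not an end position, so passing the result of `strrpos()` straight in only lines up when the first space sits near the start of the section. A small sketch of the same trimming with the end position converted to a length follows. The name `trim_to_words` is made up, and it assumes the section was cut out of a longer text at both ends.

```php
<?php
// Hypothetical helper (illustrative name): drop the partial word at each end
// of $fragment, but never cut into the search term itself.
function trim_to_words($fragment, $search)
{
    $hit = stripos($fragment, $search);
    if ($hit === false) {
        return $fragment;
    }
    $hitEnd = $hit + strlen($search);

    // First space before the match: start just after it.
    $first = strpos($fragment, ' ');
    $begin = ($first !== false && $first < $hit) ? $first + 1 : 0;

    // Last space after the match: stop just before it.
    $last = strrpos($fragment, ' ');
    $finish = ($last !== false && $last >= $hitEnd) ? $last : strlen($fragment);

    // substr() expects a length, so convert the end position.
    return substr($fragment, $begin, $finish - $begin);
}
```

Because it always drops the leading partial word, a section that starts at the very beginning of the text loses its first whole word, so that case would need separate handling.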

Re: Search result highlighting with excerpts

Post by JayBird »

Sweeeeeeet, working really well.

I'm guessing it would be really complicated to expand this to multi-word searches?

Re: Search result highlighting with excerpts

Post by EverLearning »

If by multi-word you mean a search like "sql query", this function will find lines where these two words are side by side. I tested it on Maugrim's tutorial :D here on the forum

Code: Select all

$content = file_get_contents('http://forums.devnetwork.net/viewtopic.php?f=28&t=48499');
 
var_dump(callback($content, "SQL query"));


But if you want multi-word search where words don't have to be adjacent to one another, you're better off with solutions like Zend_Search_Lucene and similar.
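
For a small page, a middle ground short of a real search engine might be to treat each word of the query as its own term and highlight any of them. The sketch below covers only the highlighting half, with no ranking, stemming or excerpt selection. `highlight_terms` is a made-up name, and the closure assumes PHP 5.3 or later.

```php
<?php
// Hypothetical helper (illustrative name): wrap every occurrence of any
// word from $query in <strong>, case-insensitively.
function highlight_terms($text, $query)
{
    // Split the query on whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        return $text;
    }

    // Longest words first, so a longer term wins over a shorter prefix of it.
    usort($words, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape each word and join them into one alternation.
    $alternatives = array();
    foreach ($words as $word) {
        $alternatives[] = preg_quote($word, '/');
    }
    $pattern = '/(' . implode('|', $alternatives) . ')/i';

    return preg_replace($pattern, '<strong>$1</strong>', $text);
}
```

Doing all terms in a single pass avoids a later term matching inside the `<strong>` tags inserted for an earlier one.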
RobertGonzalez
Site Administrator
Posts: 14293
Joined: Tue Sep 09, 2003 6:04 pm
Location: Fremont, CA, USA

Re: Search result highlighting with excerpts

Post by RobertGonzalez »

When I get to work on Monday I will search through some code I wrote that does this exact thing with multi word search highlighting.

Re: Search result highlighting with excerpts

Post by jayshields »

Nice work EverLearning, I will probably find use for that function some time in the future!

Re: Search result highlighting with excerpts

Post by EverLearning »

I just played a little with his original code ;)