
RegExp: 50 words before and 50 words after a match

Posted: Sun Feb 13, 2005 5:28 am
by visionmaster
Hello,

I'm looking for a pattern which finds a word in a text and returns the word and 50 words before and 50 words after the found word.

Finding 50 characters before and after is no problem:
"|(.{50})".$strWord."(.{50})|is"


Thanks!

Posted: Sun Feb 13, 2005 7:29 am
by Chris Corbyn
Not tested...

I removed this regexp cos it was useless ;-)
Add as many punctuation marks into the square brackets as you may find stuck to the end of a word...

EDIT: This is better

'/((?:\S+\s){50}'.$word.'(?:\s\S+){50})/i'
It matches a run of non-whitespace characters up to a space, tab, newline, etc., 50 times before the word and 50 times after (case-insensitive).

Posted: Sun Feb 13, 2005 9:08 am
by feyd
d11's pattern requires a minimum of 101 "words" in the string. It should be flexible about where in the text the word can be found, so switch to {0,50}.
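
A rough sketch of the {0,50} idea, in case it helps. The function name wordContext and the sample text are made up for illustration, and I'm assuming "word" means any run of non-whitespace. preg_quote() is there so a search word containing regex metacharacters doesn't break the pattern.

```php
<?php
// Hypothetical helper, not from the posts above: returns the matched word
// with up to $n words of context on each side, or null if it is not found.
function wordContext(string $text, string $word, int $n = 50): ?string
{
    // Escape regex metacharacters in the search word ('/' is the delimiter).
    $quoted = preg_quote($word, '/');

    // (?:\S+\s+){0,n}  up to n "words" before the match
    // (?<!\S) (?!\S)   the word must be bounded by whitespace or the string edge
    // (?:\s+\S+){0,n}  up to n "words" after the match
    $pattern = '/(?:\S+\s+){0,' . $n . '}(?<!\S)' . $quoted
             . '(?!\S)(?:\s+\S+){0,' . $n . '}/i';

    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

// Two words of context on each side:
echo wordContext('one two three four five six seven', 'four', 2), PHP_EOL;
// prints "two three four five six"
```

Because the search word has to be surrounded by whitespace here, something like "four," with a comma attached will not match.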

Posted: Sun Feb 13, 2005 9:19 am
by Chris Corbyn
Good point ;-)

Posted: Sun Feb 13, 2005 10:55 am
by timvw
If I'm not mistaken:
"\<" matches the null string at the start of a word.
"\>" matches the null string at the end of the word.

thus leads to something like:

#(\<.*?\>){0, 50}$word(\<.*?\>){0, 50}#

Posted: Sun Feb 13, 2005 10:59 am
by feyd
\b is also a word-boundary anchor. Not sure if that'd work though, tim. You can always test it! :)
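
For what it's worth, here is an untested-in-production sketch of the \b route. As far as I know, PCRE (the preg_* functions) treats \< and \> as literal angle brackets rather than word anchors, so \b is the one to use there. The name wordContextB and the sample strings are hypothetical.

```php
<?php
// Hypothetical helper: like a whitespace-delimited search, but \b marks the
// edges of the search word, so punctuation stuck to it does not block a match.
function wordContextB(string $text, string $word, int $n = 50): ?string
{
    $quoted = preg_quote($word, '/');

    // \S*\b ... \b\S* lets punctuation such as "(" or ")," cling to the word
    // while still refusing partial hits like "four" inside "fourteen".
    $pattern = '/(?:\S+\s+){0,' . $n . '}\S*\b' . $quoted
             . '\b\S*(?:\s+\S+){0,' . $n . '}/i';

    return preg_match($pattern, $text, $m) === 1 ? $m[0] : null;
}

echo wordContextB('one two (four), five six', 'four', 1), PHP_EOL;
// prints "two (four), five"
```

Context words are still counted as whitespace-separated chunks, so only the search word itself gets the \b treatment.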

Posted: Sun Feb 13, 2005 11:11 am
by timvw
Still drunk from the party last night... forget what I said :)