matching to closes words?
Posted: Wed Mar 03, 2010 7:05 am
by sergeda
Hi.
I have a string of words like:
A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.
I need to match the two closest words "hi" and "all".
I've tried using "(hi).*?(all)", but it matches everything from the first "hi" up to "all", and I need only the two closest words.
Can somebody help me with this?
Re: matching to closes words?
Posted: Wed Mar 03, 2010 7:12 am
by klevis miho
Try this:
'# [all|hi] [all|hi] #'
Re: matching to closes words?
Posted: Wed Mar 03, 2010 7:30 am
by sergeda
klevis miho wrote: Try this:
'# [all|hi] [all|hi] #'
Thank you Miho, but it matches "a" and "h". It looks like it needs to be told in some way to match not a single character but a whole word.
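For anyone reading along, the behavior described above comes from square brackets defining a character class, which matches exactly one character from a set. A minimal PHP sketch, assuming a shortened version of the sample string from the first post (the variable names are only illustrative):

```php
<?php
// [all|hi] is a character class: it matches ONE character out of
// a, l, |, h, i. It does not match the word "all" or the word "hi".
$text = 'A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all';

// Character class: the first hit is a single character.
preg_match('/[all|hi]/', $text, $m);
$classHit = $m[0];

// Grouped alternation with word boundaries: the first hit is a whole word.
preg_match('/\b(?:all|hi)\b/', $text, $m);
$wordHit = $m[0];

echo $classHit, "\n"; // h
echo $wordHit, "\n";  // hi
```

Parentheses group and `|` alternates only outside of square brackets; inside them both are literal characters.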
Re: matching to closes words?
Posted: Wed Mar 03, 2010 7:58 am
by klevis miho
Yes you're right, try this now:
'# [(all)|(hi)] [(all)|(hi)] #'
Re: matching to closes words?
Posted: Wed Mar 03, 2010 8:11 am
by sergeda
klevis miho wrote: Yes you're right, try this now:
'# [(all)|(hi)] [(all)|(hi)] #'
Well, it doesn't help. Nothing changed.
Re: matching to closes words?
Posted: Wed Mar 03, 2010 8:12 am
by klevis miho
There is a way to group "all" and "hi", but I don't remember it well.

Re: matching to closes words?
Posted: Wed Mar 03, 2010 8:47 am
by superdezign
To match "hi" or "all", you use this:
(hi|all)
However, I'm not sure if that is what the OP is trying to do. He says he wants to match the word on either side of "hi" and on either side of "all". To match the word on either side, you'll want to define what counts as a word. I'll assume that a "word" to you is a series of alphanumeric characters. You can modify the definition as you like.
A series of alphanumeric characters can be represented in regex as:
[A-Za-z0-9]+
Assuming that "hi" and "all" are always delimited by whitespace, you can use this:
Code:
([A-Za-z0-9]+)\s+hi\s+([A-Za-z0-9]+).*([A-Za-z0-9]+)\s+all\s+([A-Za-z0-9]+)
This will place the words surrounding "hi" in \1 and \2, and the words surrounding "all" in \3 and \4. This assumes that there is more than one word between "hi" and "all". Personally, I'd run these as two separate regexes, just in case "hi" and "all" share a word.
So, I'd do this:
Code:
([A-Za-z0-9]+)\s+hi\s+([A-Za-z0-9]+)
Get the words, then do this:
Code:
([A-Za-z0-9]+)\s+all\s+([A-Za-z0-9]+)
And get those words too.
Also, if you want to do as ~klevis was suggesting, to allow either to show up, simply replace "hi" or "all" with "(hi|all)".
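As a rough illustration of the two-step approach, here is a minimal PHP sketch. It assumes whitespace is written as `\s`, that a "word" is a run of ASCII letters and digits as defined above, and it uses a made-up sample string with distinct words so the captures are easy to tell apart:

```php
<?php
// Hypothetical sample text; the words are distinct on purpose.
$text = 'one two hi three four hi five six all seven';

// Step 1: the words on either side of the first "hi".
preg_match('/([A-Za-z0-9]+)\s+hi\s+([A-Za-z0-9]+)/', $text, $hi);

// Step 2: the words on either side of the first "all".
preg_match('/([A-Za-z0-9]+)\s+all\s+([A-Za-z0-9]+)/', $text, $all);

echo $hi[1], ' | ', $hi[2], "\n";   // two | three
echo $all[1], ' | ', $all[2], "\n"; // six | seven
```

Running the two patterns separately means a word that sits between "hi" and "all" can be captured by both.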
Re: matching to closes words?
Posted: Wed Mar 03, 2010 8:54 am
by sergeda
Thank you for trying to help.
Re: matching to closes words?
Posted: Wed Mar 03, 2010 9:06 am
by sergeda
I found a way to tell it to match whole words: \b(hi|all)\b, but it doesn't help. "#\b(hi|all)\b \b(hi|all)\b#"
matches only when the two words are next to each other with a single space between them, but I need to match them even when other words come between them.
Re: matching to closes words?
Posted: Wed Mar 03, 2010 10:15 am
by sergeda
Thank you superdezign, I missed your reply.
Let me clarify what I'm trying to get. I have a string of characters with the words "hi" and "all" in it.
I need to get the two closest pairs of these words, with the characters between them. But what I get now is the first "hi" and all the other "hi"s up to "all".
Like this:
A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.
Should be:
A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.
Re: matching to closes words?
Posted: Wed Mar 03, 2010 10:26 am
by klevis miho
'#hi[a-z](.+?)all#i'
With this you get everything that is between a "hi" and an "all".
Re: matching to closes words?
Posted: Wed Mar 03, 2010 12:21 pm
by ridgerunner
Try this: (you need to use negative lookahead)
Code:
// Here is the simple (inefficient) version...
if (preg_match('/\bhi\b((?:(?!\bhi\b).)*?)\ball\b/s', $contents)) {
    # Successful match
} else {
    # Match attempt failed
}
// Here is the more complex (but much more efficient) version...
// Note: to understand this regex, you need to read (and understand) chapter 6
// of Jeffrey Friedl's book: "Mastering Regular Expressions - 3rd edition"
if (preg_match('/\bhi\b([^ha]*+(?:(?!\b(?:hi|all)\b).[^ha]*+)*+)\ball\b/s', $contents)) {
    # Successful match
} else {
    # Match attempt failed
}
If you need it to be case-insensitive, add the 'i' modifier.
Hope this helps!
p.s. klevis: You are using square brackets all wrong! (and giving very bad advice!) Please go read the basic documentation at
http://www.regular-expressions.info and pay special attention to
character classes.
Edit 2010-03-04: Changed the first regex to use a lazy rather than greedy star. The greedy version would not match certain strings correctly (e.g. "hi sdafsdf all sdfsdf sdfsdf all"). (The second regex is OK as-is.)
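To see both patterns at work on the sample string from the first post, here is a small sketch. Using preg_match_all to collect every closest pair is an assumption about what the OP wants; the variable names are illustrative:

```php
<?php
$text = 'A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf '
      . 'asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.';

// The two patterns from this post, unchanged.
$simple    = '/\bhi\b((?:(?!\bhi\b).)*?)\ball\b/s';
$efficient = '/\bhi\b([^ha]*+(?:(?!\b(?:hi|all)\b).[^ha]*+)*+)\ball\b/s';

// Collect every "hi ... all" span that has no other "hi" inside it.
preg_match_all($simple, $text, $a);
preg_match_all($efficient, $text, $b);

print_r($a[0]);
// Array
// (
//     [0] => hi sdf sdfsdf sdf all
//     [1] => hi sdafsdf sdfsdf sdfsdf all
// )

var_dump($a[0] === $b[0]); // bool(true)
```

Each match starts at the last "hi" before an "all", because the lookahead refuses to step over another "hi".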
Re: matching to closes words?
Posted: Thu Mar 04, 2010 1:15 am
by sergeda
Thank you ridgerunner.
This is definitely what I was looking for.
And thank you for the pointers on what to study.
Re: matching to closes words?
Posted: Thu Mar 04, 2010 11:17 am
by superdezign
I think if you don't do a greedy match, then you shouldn't need a lookahead.
Re: matching to closes words?
Posted: Thu Mar 04, 2010 12:04 pm
by sergeda
Thank you superdezign but with this I've got:
A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.
Not the closest match.
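The difference can be reproduced with a short sketch, assuming a shortened version of the sample string. A regex engine reports the leftmost match, so a lazy quantifier only changes where the match ends, not where it starts:

```php
<?php
$text = 'A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf';

// Lazy quantifier only: the match still begins at the FIRST "hi".
preg_match('/\bhi\b.*?\ball\b/s', $text, $lazy);

// Lazy quantifier plus negative lookahead: no "hi" is allowed inside
// the match, so it begins at the "hi" closest to "all".
preg_match('/\bhi\b(?:(?!\bhi\b).)*?\ball\b/s', $text, $tempered);

echo $lazy[0], "\n";     // hi sdfsdf sdfsdf hi sdf sdfsdf sdf all
echo $tempered[0], "\n"; // hi sdf sdfsdf sdf all
```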