matching to closes words?

sergeda
Forum Newbie
Posts: 8
Joined: Wed Mar 03, 2010 6:56 am

matching to closes words?

Post by sergeda »

Hi.

I have a string of words like:
A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.
I need to match the two closest words "hi" and "all".
I've tried "(hi).*?(all)", but it matches from the first "hi" all the way to "all", and I need only the closest pair.
Can somebody help me with this?
klevis miho
Forum Contributor
Posts: 413
Joined: Wed Oct 29, 2008 2:59 pm
Location: Albania

Re: matching to closes words?

Post by klevis miho »

Try this:
'# [all|hi] [all|hi] #'
sergeda

Re: matching to closes words?

Post by sergeda »

klevis miho wrote:Try this:
'# [all|hi] [all|hi] #'
Thank you, Miho, but it matches "a" and "h". It looks like it needs to be told somehow to match whole words rather than single characters.
klevis miho

Re: matching to closes words?

Post by klevis miho »

Yes, you're right. Try this now:

'# [(all)|(hi)] [(all)|(hi)] #'
sergeda

Re: matching to closes words?

Post by sergeda »

klevis miho wrote:Yes you're right, try this now:

'# [(all)|(hi)] [(all)|(hi)] #'
Well, it doesn't help. Nothing changed.
klevis miho

Re: matching to closes words?

Post by klevis miho »

There is a way to group "all" and "hi", but I don't remember it well :(
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Re: matching to closes words?

Post by superdezign »

To match "hi" or "all", you use this:

Code: Select all

(hi|all)
However, I'm not sure if that is what the OP is trying to do. He says he wants to match the word on either side of "hi" and on either side of "all". To match the word on either side, you'll want to define what counts as a word. I'll assume that a "word" to you is a series of alphanumeric characters. You can modify the definition as you like.

A series of alphanumeric characters can be represented in regex as:

Code: Select all

[A-Za-z0-9]+
Assuming that "hi" and "all" are always delimited by whitespace, you can use this:

Code: Select all

([A-Za-z0-9]+)\s+hi\s+([A-Za-z0-9]+).*\b([A-Za-z0-9]+)\s+all\s+([A-Za-z0-9]+)
This will place the words surrounding "hi" in \1 and \2, and the words surrounding "all" in \3 and \4. The \b before the third group keeps the greedy .* from swallowing all but the last character of that word. This assumes that there are at least two words between "hi" and "all". Personally, I'd run these as two separate regexes, just in case "hi" and "all" share a word.

So, I'd do this:

Code: Select all

([A-Za-z0-9]+)\s+hi\s+([A-Za-z0-9]+)
Get the words, then do this:

Code: Select all

([A-Za-z0-9]+)\s+all\s+([A-Za-z0-9]+)
And get those words too.

Also, if you want to do as ~klevis was suggesting, to allow either to show up, simply replace "hi" or "all" with "(hi|all)".
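
For what it's worth, here is a rough PHP sketch of the two-step approach above, run against the string from the first post. The variable names ($subject, $word, $aroundHi, $aroundAll) are made up for illustration, and a "word" is assumed to be a run of ASCII letters and digits, as above.

```php
<?php
// Sample string from the original post.
$subject = 'A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.';

// Assumption: a "word" is one or more ASCII letters or digits.
$word = '[A-Za-z0-9]+';

// Step 1: the words on either side of the first whitespace-delimited "hi".
preg_match("/($word)\\s+hi\\s+($word)/", $subject, $aroundHi);

// Step 2: the words on either side of the first whitespace-delimited "all".
preg_match("/($word)\\s+all\\s+($word)/", $subject, $aroundAll);

// $aroundHi[1] and $aroundHi[2] hold the neighbours of "hi",
// $aroundAll[1] and $aroundAll[2] hold the neighbours of "all".
```

Note that the final "all." in that string is followed by a period rather than whitespace, so the second pattern only sees the earlier "all".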
sergeda

Re: matching to closes words?

Post by sergeda »

Thank you for trying to help.
sergeda

Re: matching to closes words?

Post by sergeda »

I found the way to tell it to match whole words: \b(hi|all)\b, but it doesn't help. "#\b(hi|all)\b \b(hi|all)\b#"
matches only if the two words are adjacent with a single space between them, but I need to match them even when there are other words in between.
sergeda

Re: matching to closes words?

Post by sergeda »

Thank you superdezign, I missed your reply.
Let me clarify what I'm trying to get. I have a string of characters with the words "hi" and "all" in it.
I need to get the closest pairs of these words, together with the characters between them. But what I get now starts at the first "hi" and runs through all the other "hi"s up to "all".
Like this (matched text in square brackets):
A sdfsdf [hi sdfsdf sdfsdf hi sdf sdfsdf sdf all] asdfsdf ljlksdjf asdfsdf [hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all].
Should be:
A sdfsdf hi sdfsdf sdfsdf [hi sdf sdfsdf sdf all] asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf [hi sdafsdf sdfsdf sdfsdf all].
klevis miho

Re: matching to closes words?

Post by klevis miho »

'#hi[a-z](.+?)all#i'

With this you get everything between a "hi" and an "all".
ridgerunner
Forum Contributor
Posts: 214
Joined: Sun Jul 05, 2009 10:39 pm
Location: SLC, UT

Re: matching to closes words?

Post by ridgerunner »

Try this: (you need to use negative lookahead)

Code: Select all

// Here is the simple (inefficient) version...
if (preg_match('/\bhi\b((?:(?!\bhi\b).)*?)\ball\b/s', $contents)) {
    # Successful match
} else {
    # Match attempt failed
}
 
// Here is the more complex (but much more efficient) version...
if (preg_match('/\bhi\b([^ha]*+(?:(?!\b(?:hi|all)\b).[^ha]*+)*+)\ball\b/s', $contents)) {
    # Successful match        // Note: to understand this regex, you need to read
} else {                      //   (and understand) chapter 6 of Jeffrey Friedl's
    # Match attempt failed    //    book: "Mastering Regular Expressions - 3rd edition"
}
 
If you need it to be case-insensitive, add the 'i' modifier.
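
To collect every closest pair rather than just test for one, something along these lines should work. It is only a sketch: it reuses the simple regex above with preg_match_all, and $subject, $pattern and $matches are illustrative names.

```php
<?php
// Sample string from the original post.
$subject = 'A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.';

// "hi", then characters one at a time as long as no new standalone "hi"
// starts at the current position, then the next standalone "all".
$pattern = '/\bhi\b((?:(?!\bhi\b).)*?)\ball\b/s';

// Collect all non-overlapping matches.
preg_match_all($pattern, $subject, $matches);

// $matches[0] holds the full "hi ... all" matches,
// $matches[1] holds only the text between "hi" and "all".
```

On that string, $matches[0] should contain "hi sdf sdfsdf sdf all" and "hi sdafsdf sdfsdf sdfsdf all".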

Hope this helps! :)

p.s. klevis: You are using square brackets all wrong! (and giving very bad advice!) Please go read the basic documentation at http://www.regular-expressions.info and pay special attention to character classes.
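
A quick illustration of the difference, as a sketch only (the test string 'hello all' and the variable names are made up): inside square brackets every character is a member of a set, so [all|hi] matches exactly one character out of a, l, |, h, i, while a group with | alternates between whole strings.

```php
<?php
// Character class: matches ONE character that is a, l, |, h or i.
$class = '/[all|hi]/';

// Alternation inside a non-capturing group: matches the whole word "all" or "hi".
$alternation = '/\b(?:all|hi)\b/';

preg_match($class, 'hello all', $m1);        // first hit is the single character "h"
preg_match($alternation, 'hello all', $m2);  // first hit is the word "all"

// The pipe is just another literal member of the class.
$pipeMatches = preg_match($class, '|');
```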

Edit 2010-03-04: Changed the first regex to use a lazy rather than a greedy star. The greedy version would not match certain strings correctly (e.g. "hi sdafsdf all sdfsdf sdfsdf all"). (The second regex is OK as-is.)
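
A small sketch of what that edit is about, using the example string from the note ($subject, $greedy and $lazy are illustrative names): with a greedy star the match runs on to the last "all", with a lazy star it stops at the first one.

```php
<?php
$subject = 'hi sdafsdf all sdfsdf sdfsdf all';

// Greedy star: consumes as much as possible, so the match ends at the LAST "all".
preg_match('/\bhi\b((?:(?!\bhi\b).)*)\ball\b/s', $subject, $greedy);

// Lazy star: consumes as little as possible, so the match ends at the FIRST "all".
preg_match('/\bhi\b((?:(?!\bhi\b).)*?)\ball\b/s', $subject, $lazy);
```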
Last edited by ridgerunner on Thu Mar 04, 2010 3:22 pm, edited 1 time in total.
sergeda

Re: matching to closes words?

Post by sergeda »

Thank you ridgerunner.
This is definitely what I was looking for.
And thank you for pointing me to what to study.
superdezign

Re: matching to closes words?

Post by superdezign »

I think if you don't do a greedy match, then you shouldn't need a lookahead.

Code: Select all

\bhi\b(.+?)\ball\b
sergeda

Re: matching to closes words?

Post by sergeda »

Thank you superdezign, but with this I got:
A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.
Not the closest match.
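
As far as I understand it, the reason is that a lazy quantifier only controls where the match ends, not where it begins: the engine still starts at the leftmost "hi". A rough side-by-side sketch (variable names are illustrative):

```php
<?php
// Sample string from the original post.
$subject = 'A sdfsdf hi sdfsdf sdfsdf hi sdf sdfsdf sdf all asdfsdf ljlksdjf asdfsdf hi sdfasdf sadfsdf hi sdafsdf sdfsdf sdfsdf all.';

// Lazy quantifier only: the match starts at the first "hi" and
// happily runs across the second "hi" to reach the first "all".
preg_match('/\bhi\b(.+?)\ball\b/', $subject, $lazyOnly);

// With the negative lookahead: a "hi" that is followed by another "hi"
// before "all" cannot match, so the engine moves on to the closer "hi".
preg_match('/\bhi\b((?:(?!\bhi\b).)*?)\ball\b/s', $subject, $withLookahead);
```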