Grabbing a string between 2 specific strings

Any questions involving matching text strings to patterns, where the pattern is called a "regular expression."

jayshields
DevNet Resident
Posts: 1912
Joined: Mon Aug 22, 2005 12:11 pm
Location: Leeds/Manchester, England

Grabbing a string between 2 specific strings

Post by jayshields »

I'm quite poor at regex so this is probably easy.

Given two specific strings, all I want to do is grab what's between them from one huge string.

If the two strings are

Code:

<div class=dditd id=dditd><div><b>
and

Code:

&#160;mi</b>
and the whole string is

Code:

rge5g55gdgrdfgdg<div class=dditd id=dditd><div><b>this is what i want&#160;mi</b>55g45txdggdh54y4hh45
then I want to return

Code:

this is what i want
According to http://www.cuneytyilmaz.com/prog/jrx/ this regex should work

Code:

(<div class=dditd id=dditd><div><b>)(.*?)(&#160;mi</b>)
but when I try to use that in PHP with preg_match() it returns an empty array. I'm trying it with preg_match() in this form

Code:

$^(<div class=dditd id=dditd><div><b>)(.*?)(&#160;mi</b>)$
as I've gathered it's picky about starting and ending characters.

Can someone point me in the right direction with the expression I need?
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Grabbing a string between 2 specific strings

Post by prometheuzz »

jayshields wrote:...

Can someone point me in the right direction with the expression I need?
You already have a working expression:

Code:

if(preg_match(
    '@<div class=dditd id=dditd><div><b>(.*?)&#160;mi</b>@', 
    'rge5g55gdgrdfgdg<div class=dditd id=dditd><div><b>this is what i want&#160;mi</b>55g45txdggdh54y4hh45',
    $match
    )) {
  echo $match[1];
}
produces the output: "this is what i want".
You are either not using it properly, or your actual input string differs from the one you posted here. Are there line breaks between "<div class=dditd id=dditd><div><b>" and "&#160;mi</b>" in your actual input?
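The reason line breaks matter: by default the dot in (.*?) does not match a newline, so the pattern fails if the wanted text spans more than one line. A minimal sketch of the difference, assuming a made-up input with a line break in it (the real input was not posted); the s modifier (PCRE_DOTALL) lets the dot match newlines too:

```php
<?php
// Hypothetical input: same markers as in the thread, but the wanted text contains a line break.
$input = "xx<div class=dditd id=dditd><div><b>this is\nwhat i want&#160;mi</b>yy";

$pattern = '@<div class=dditd id=dditd><div><b>(.*?)&#160;mi</b>@';

// Without the s modifier, "." stops at "\n", so nothing matches.
$withoutS = preg_match($pattern, $input, $m1);

// With the s modifier appended after the closing delimiter, "." also matches "\n".
$withS = preg_match($pattern . 's', $input, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
echo $m2[1];         // the two-line text between the markers
```

If the real input never contains line breaks between the two markers, the s modifier is harmless but unnecessary.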
jayshields
DevNet Resident
Posts: 1912
Joined: Mon Aug 22, 2005 12:11 pm
Location: Leeds/Manchester, England

Re: Grabbing a string between 2 specific strings

Post by jayshields »

It was the @ symbols that did it, thanks. What are they for? Just any characters to signify the start and end of the pattern? I tried /^ for start and / for end like in the manual and it didn't work (probably because of / in the pattern) so I escaped the / in the pattern and it still wouldn't work.

What was wrong with the $^ ... $ pattern markers?

Thanks again, and sorry I'm not that clued up on regex!
prometheuzz
Forum Regular
Posts: 779
Joined: Fri Apr 04, 2008 5:51 am

Re: Grabbing a string between 2 specific strings

Post by prometheuzz »

jayshields wrote:It was the @ symbols that did it, thanks. What are they for? Just any characters to signify the start and end of the pattern?
Correct.
jayshields wrote:I tried /^ for start and / for end like in the manual and it didn't work (probably because of / in the pattern) so I escaped the / in the pattern and it still wouldn't work.
Then you did something wrong because

Code:

'/<div class=dditd id=dditd><div><b>(.*?)&#160;mi<\/b>/'
will also work (try it with the example I posted).
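If escaping by hand is error-prone, preg_quote() can do it: given the delimiter as its second argument, it escapes both the regex metacharacters and that delimiter. A rough sketch of a reusable helper follows; the function name between() is made up for illustration and is not something from this thread or from PHP itself:

```php
<?php
// Hypothetical helper: return the text between two literal strings, or null if not found.
function between(string $start, string $end, string $haystack): ?string
{
    // preg_quote() escapes regex metacharacters and, because '/' is passed, the delimiter as well.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';

    return preg_match($pattern, $haystack, $match) === 1 ? $match[1] : null;
}

$html = 'rge5g55gdgrdfgdg<div class=dditd id=dditd><div><b>this is what i want&#160;mi</b>55g45txdggdh54y4hh45';

echo between('<div class=dditd id=dditd><div><b>', '&#160;mi</b>', $html);
// prints: this is what i want
```

The s modifier is included on the assumption that the text between the markers may contain line breaks; drop it if that should count as no match.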
jayshields wrote:What was wrong with the $^ ... $ pattern markers?
By default, the ^ means the start of the input string and the $ means the end of the input string.
For example, the pattern "^a+$" will match a string consisting of only a's while the pattern "a+" will match one or more a's anywhere in the input string.
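To tie this back to the $^ ... $ attempt: PHP accepts any non-alphanumeric, non-backslash, non-whitespace character as a delimiter, so the outer $ characters are read as delimiters rather than as anchors. What remains is a pattern starting with ^, which can only match at the very beginning of the input. A small sketch using a's instead of the HTML markers to keep it short:

```php
<?php
// Anchored: the whole string must consist of a's.
$onlyAs  = preg_match('/^a+$/', 'aaa');    // 1
$notOnly = preg_match('/^a+$/', 'baaab');  // 0

// Unanchored: one or more a's anywhere is enough.
$anywhere = preg_match('/a+/', 'baaab');   // 1

// Here the outer $ characters are consumed as delimiters,
// so the effective pattern is just ^(a+).
$dollarMiss = preg_match('$^(a+)$', 'baaab', $m1); // 0, $m1 is an empty array
$dollarHit  = preg_match('$^(a+)$', 'aaab', $m2);  // 1, $m2[1] is "aaa"

var_dump($onlyAs, $notOnly, $anywhere, $dollarMiss, $m1, $dollarHit, $m2[1]);
```

The last call matches even though the string ends in "b", which shows the trailing $ is not acting as an end-of-string anchor there.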
jayshields wrote:Thanks again, and sorry I'm not that clued up on regex!
No problem.