Super simple question


ivanfx
Forum Newbie
Posts: 14
Joined: Sun Jul 01, 2007 3:47 am


Post by ivanfx »

Hello,
I need a little regex to help me extract content from a div.. :oops:

<div id="divname">
</div>

Within this div is the data I need.

PS Could you include the php code too?
Like preg_match ... :roll:

Thanks!
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Post by Benjamin »

That would be writing code.

We only help.

What have you tried?
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »

What data is inside the div? Will it contain HTML tags or not? Do you have only one <div id="divname">, or several?
ivanfx
Forum Newbie
Posts: 14
Joined: Sun Jul 01, 2007 3:47 am

Post by ivanfx »

The page itself contains loads of divs, but only one has that id value.
The div itself consists of a lot of tags, AdSense ads and other divs.

I need the data inside that goes in this order:

<h4><a href="">link text</a></h4>
<p class="something">some text</br>
<a class="newclass">some more text</a></p>

and then it just repeats 10 times....

:roll:
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »

Then you can look into explode().
ivanfx
Forum Newbie
Posts: 14
Joined: Sun Jul 01, 2007 3:47 am

Post by ivanfx »

I tried, but I can't get it to work..
How do I use explode to keep these 3 lines intact (together)?
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »

You need to read this: http://php.net/manual/en/function.explode.php

Explode the content using your <div id="something"> tag as the separator, then explode element 1 of that result using </div> as the separator. Element 0 of the second result is the content you want.

This only works if you don't have any </div> inside the data you need.
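
A minimal sketch of the two-step explode() described above. The sample page in $html is made up for illustration; only the id "divname" comes from the original question.

```php
<?php
// Hypothetical sample page; the id "divname" comes from the original question.
$html = '<p>intro</p><div id="divname"><h4>Title</h4><p>Body</p></div><p>footer</p>';

// First split: everything after the opening tag ends up in element 1.
$parts = explode('<div id="divname">', $html, 2);

$content = null;
if (count($parts) === 2) {
    // Second split: everything before the first </div> is element 0.
    $inner = explode('</div>', $parts[1], 2);
    $content = $inner[0];
}

echo $content, "\n"; // <h4>Title</h4><p>Body</p>
```

The limit of 2 keeps each call from splitting more often than needed. As noted, the result is cut short as soon as the wanted data contains a </div> of its own.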
ivanfx
Forum Newbie
Posts: 14
Joined: Sun Jul 01, 2007 3:47 am

Post by ivanfx »

Thanks, I'll give it a try! :D
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

Eww. That's definitely not what explode should be used for. That would create two arrays, and do so very strangely. This should definitely be done with regex.

If you know the start and end of what you want, just put a ".*" between them in the regex and voilà. It's simple.
ivanfx
Forum Newbie
Posts: 14
Joined: Sun Jul 01, 2007 3:47 am

Post by ivanfx »

Thanks for the tip! I'm almost there!

I just need some advice on how preg_replace works.

I need to remove two elements. One is a <p style=".....">Some content here</p> and
the other one is the <div id="footer"></a>. The </a> element was left during explode..

Anyway, I'm trying
$result = preg_replace('/<p style=".....">.*<\/p>/', ' ', $content);

but it's still there :(
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »

superdezign, you are wrong: if you use /<div>(.*)<\/div>/ you will get the content between the first <div> and the last </div>.
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »

What do the dots in style="....." stand for? If you really want to match literal dots, you need to escape each one as \. because an unescaped dot matches any character. The pattern above will replace everything from the first <p style="....."> to the last </p>. If the content of the <p> does not contain HTML tags, you can use /<p style=".....">[^<]*<\/p>/ instead.
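
A small sketch of the negated-class pattern with preg_replace(). The input string is hypothetical, and the style value float:left is assumed here purely for illustration.

```php
<?php
// Hypothetical input; the style value "float:left" is an assumption.
$content = '<h4>Title</h4><p style="float:left">ad text</p><p class="something">keep me</p>';

// [^<]* matches any run of characters that are not "<", so the match
// cannot run past the first closing </p>. Using # as the delimiter
// avoids having to escape the slash in </p>.
$result = preg_replace('#<p style="float:left">[^<]*</p>#', '', $content);

echo $result, "\n"; // <h4>Title</h4><p class="something">keep me</p>
```

If the paragraph itself contains tags, [^<]* will not match and the string comes back unchanged. In that case a lazy .*? together with the s modifier (so that "." also matches newlines) may be worth trying.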
ivanfx
Forum Newbie
Posts: 14
Joined: Sun Jul 01, 2007 3:47 am

Post by ivanfx »

Sorry for the dots, it's just some css inside. So:

<p style="float:left">AdSense script </p>
superdezign
DevNet Master
Posts: 4135
Joined: Sat Jan 20, 2007 11:06 pm

Post by superdezign »

:?:
I said "If you know the start and end of what you want," not "use <div>(.*)</div>." And I stand by not using explode for this. Explode is for generating arrays, not finding data.


preg_match('#<div id="foo">(.*?)</div>#', $data, $match);
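
One thing that may matter when the div spans several lines: by default "." does not match a newline in PCRE. The sketch below runs the pattern above with and without the s modifier on a made-up multi-line string ($data and its contents are assumptions for illustration).

```php
<?php
// Hypothetical multi-line input; "foo" is the id used in the pattern above.
$data = "<div id=\"foo\">\n<h4>Title</h4>\n</div>\n<div>other</div>";

// Without the s modifier "." does not match a newline, so nothing is found.
$withoutS = preg_match('#<div id="foo">(.*?)</div>#', $data, $m1);

// With the s modifier the lazy .*? stops at the first </div> it reaches.
$withS = preg_match('#<div id="foo">(.*?)</div>#s', $data, $m2);

var_dump($withoutS);       // int(0)
echo trim($m2[1]), "\n";   // <h4>Title</h4>
```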
miro_igov
Forum Contributor
Posts: 485
Joined: Fri Mar 31, 2006 5:06 am
Location: Bulgaria

Post by miro_igov »


But dear,

what will happen if he has (and he does!) markup like this:

<div id="foo"> Hello World <p>Some info here</p><div id="bar">Ouch this is bad div</div> </div> And some other html here bla bla <div>another bad div, ouch ouch</div>

Your suggestion will stop at the first </div>, the one that closes the nested div, so it will match only this: Hello World <p>Some info here</p><div id="bar">Ouch this is bad div
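
To check what each pattern actually captures on nested markup, here is a short sketch comparing the lazy and the greedy form. The sample string is modelled on the example above but shortened, and it is hypothetical.

```php
<?php
// Sample markup modelled on the example above (hypothetical content).
$data = '<div id="foo">Hello <div id="bar">inner</div> tail</div> more <div>other</div>';

// Lazy: stops at the first </div>, which closes the nested div.
preg_match('#<div id="foo">(.*?)</div>#', $data, $lazy);

// Greedy: runs on to the last </div> in the input.
preg_match('#<div id="foo">(.*)</div>#', $data, $greedy);

echo $lazy[1], "\n";   // Hello <div id="bar">inner
echo $greedy[1], "\n"; // Hello <div id="bar">inner</div> tail</div> more <div>other
```

Neither form returns exactly the contents of the outer div: the lazy one is cut short and the greedy one overruns. That is the general limitation of matching nested tags with a single regex.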