Super simple question
Posted: Sun Jul 01, 2007 3:53 am
by ivanfx
Hello,
I need a little regex to help me extract content from a div.
<div id="divname">
</div>
Within this div is the data I need.
PS: Could you include the PHP code too?
Like preg_match ...
Thanks!
Posted: Sun Jul 01, 2007 3:59 am
by Benjamin
That would be writing code.
We only help.
What have you tried?
Posted: Sun Jul 01, 2007 3:59 am
by miro_igov
What data is inside the div? Will it contain HTML tags or not? Do you have only one <div id="divname">, or can there be any number of them?
Posted: Sun Jul 01, 2007 4:24 am
by ivanfx
The page itself contains loads of divs, but only one has that id value.
The div itself consists of a lot of tags, AdSense ads and other divs.
I need the data inside that goes in this order:
<h4><a href="">link text</a></h4>
<p class="something">some text<br />
<a class="newclass">some more text</a></p>
and then it just repeats 10 times....

Posted: Sun Jul 01, 2007 4:35 am
by miro_igov
Then you can check explode().
Posted: Sun Jul 01, 2007 4:42 am
by ivanfx
I tried, but I can't get it to work.
How do I use explode to keep these 3 lines intact (together)?
Posted: Sun Jul 01, 2007 4:46 am
by miro_igov
You need to read this:
http://php.net/manual/en/function.explode.php
Explode the content using your <div id="something"> as the separator, then explode element 1 of that result again on </div> and take element 0.
This only works if you don't have any </div> inside the data you need.
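A minimal sketch of that two-step explode, assuming the page is already loaded into a string. The sample markup and the names $html, $parts, $pieces and $inner are made up for illustration. As said above, it breaks as soon as the wanted block contains a nested </div>.

```php
<?php
// Hypothetical sample page; in practice $html would hold the fetched document.
$html = '<p>intro</p><div id="divname"><h4>Title</h4><p>Body</p></div><p>footer</p>';

// Step 1: split on the opening tag. Element 1 is everything after it.
$parts = explode('<div id="divname">', $html, 2);

$inner = null;
if (count($parts) === 2) {
    // Step 2: split the remainder on the first closing tag and keep element 0.
    $pieces = explode('</div>', $parts[1], 2);
    $inner = $pieces[0];
}

echo $inner, "\n"; // <h4>Title</h4><p>Body</p>
```

The limit of 2 keeps each explode from cutting the string into more pieces than needed.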
Posted: Sun Jul 01, 2007 4:54 am
by ivanfx
Thanks, I'll give it a try!

Posted: Sun Jul 01, 2007 6:06 am
by superdezign
Eww. That's definitely not what explode should be used for. That would create two arrays, and do so very strangely. This should definitely be done with regex.
If you know the start and end of what you want, just add a ".*" to the middle of those in the regex and voilà. It's simple.
Posted: Sun Jul 01, 2007 6:22 am
by ivanfx
Thanks for the tip! I'm almost there!
I just need some advice on how preg_replace works.
I need to remove two elements. One is a <p style=".....">Some content here</p> and
the other one is the <div id="footer"></a>. The </a> element was left over after the explode.
Anyway, I'm trying
$result = preg_replace('/<p style=".....">.*<\/p>/', ' ', $content);
but it's still there.

Posted: Sun Jul 01, 2007 7:59 am
by miro_igov
superdezign wrote:
Eww. That's definitely not what explode should be used for. That would create two arrays, and do so very strangely. This should definitely be done with regex.
If you know the start and end of what you want, just add a ".*" to the middle of those in the regex and voilà. It's simple.
You are wrong. If you use /<div>(.*)<\/div>/ you will get everything between the first <div> and the last </div>.
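A small sketch of that greedy behaviour, with the lazy form .*? shown next to it for contrast. The input string and the names $data, $greedy and $lazy are invented for the example.

```php
<?php
// Hypothetical input with two sibling divs on one line.
$data = '<div>first</div><p>between</p><div>second</div>';

// Greedy: .* grabs as much as possible, so the match runs to the last </div>.
preg_match('/<div>(.*)<\/div>/', $data, $greedy);

// Lazy: .*? grabs as little as possible, so the match stops at the first </div>.
preg_match('/<div>(.*?)<\/div>/', $data, $lazy);

echo $greedy[1], "\n"; // first</div><p>between</p><div>second
echo $lazy[1], "\n";   // first
```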
Posted: Sun Jul 01, 2007 8:04 am
by miro_igov
ivanfx wrote:
$result = preg_replace('/<p style=".....">.*<\/p>/', ' ', $content);
but it's still there

What is in the dots in style="....."? If you are really trying to match literal dots, you need to escape them as \. because an unescaped dot matches any character. The pattern above will replace everything from the first <p style="....."> to the last </p>, but if the content in the <p> does not contain HTML tags you may use /<p style=".....">[^<]*<\/p>/
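Here is a rough sketch of the [^<]* pattern. The style value float:left and the sample text are placeholders, not taken from the real page. The sketch also shows one possible reason, not confirmed in this thread, why a ".*" attempt can leave the paragraph in place: without the s modifier, "." does not match a newline, so a paragraph that spans several lines never matches.

```php
<?php
// Hypothetical content; the style value and the text are placeholders.
$content = "<h4>Keep me</h4>\n<p style=\"float:left\">ad text\nacross lines</p>\n<p>Keep me too</p>";

// [^<]* matches any run of characters that are not "<", newlines included,
// so it cannot run past the paragraph's own closing tag.
$result = preg_replace('/<p style="float:left">[^<]*<\/p>/', '', $content);

// For comparison: ".*" without the s modifier stops at the newline,
// finds no </p> there, and leaves the content untouched.
$unchanged = preg_replace('/<p style="float:left">.*<\/p>/', '', $content);

echo $result, "\n";
var_dump($unchanged === $content); // bool(true)
```

The [^<]* form only helps while the paragraph holds plain text. As soon as it contains another tag, such as a <script> block, the pattern no longer matches.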
Posted: Sun Jul 01, 2007 8:11 am
by ivanfx
Sorry for the dots, it's just some css inside. So:
<p style="float:left">AdSense script </p>
Posted: Sun Jul 01, 2007 8:17 am
by superdezign
miro_igov wrote:
You are wrong. If you use /<div>(.*)<\/div>/ you will get everything between the first <div> and the last </div>.

I said "If you know the start and end of what you want," not "use <div>(.*)</div>." And I stand by not using explode for this. Explode is for generating arrays, not finding data.
preg_match('#<div id="foo">(.*?)</div>#', $data, $match);
Posted: Sun Jul 01, 2007 8:31 am
by miro_igov
superdezign wrote:
preg_match('#<div id="foo">(.*?)</div>#', $data, $match);
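But dear,
what will happen if he has (and he does!) nested divs like this:
<div id="foo"> Hello World <p>Some info here</p><div id="bar">Ouch this is bad div</div> </div> And some other html here bla bla <div>another bad div, ouch ouch</div>
Your suggestion will match only this: Hello World <p>Some info here</p><div id="bar">Ouch this is bad div
The lazy (.*?) stops at the first </div>, which closes the inner div, so the rest of foo is cut off. A greedy (.*) has the opposite problem and runs on to the last </div>, giving: Hello World <p>Some info here</p><div id="bar">Ouch this is bad div</div> </div> And some other html here bla bla <div>another bad div, ouch ouch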
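Running both quantifiers over that sample makes the nesting problem easy to check. This is only a sketch; $data, $lazy and $greedy are names picked for the example.

```php
<?php
// The nested sample from the post above, on a single line.
$data = '<div id="foo"> Hello World <p>Some info here</p><div id="bar">Ouch this is bad div</div> </div> And some other html here bla bla <div>another bad div, ouch ouch</div>';

// Lazy: ends at the first </div>, which belongs to the inner "bar" div.
preg_match('#<div id="foo">(.*?)</div>#', $data, $lazy);

// Greedy: ends at the last </div> in the string, far past the end of "foo".
preg_match('#<div id="foo">(.*)</div>#', $data, $greedy);

echo $lazy[1], "\n";
echo $greedy[1], "\n";
```

Neither capture is the real content of foo, because a single .* or .*? cannot count how many divs have been opened and closed.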