My little project. Looking for a little push
Posted: Mon Aug 24, 2009 3:37 pm
by soul_fly
Hello, I'm Mahmud. This is my first post in this forum. I'm still fairly new to PHP and I'm stuck on a small project of mine, so I would be very grateful for some detailed guidance on how to solve it.

I will try my best to give the complete scenario:
The assignment:
I have been given a text file containing a list of URLs, all from the same website, and have been asked to parse the image out of each page.
For example:
domain.com/mahmud.html
domain.com/Jean.html
domain.com/Reese.html
Sample HTML from domain.com/Reese.html:
<div class="profile-top">
<div class="enlisted">
<h2 class="thumb clearfix">
<a href="/account/profile_image/Reese?hreflang=en"><img alt="" border="0" height="73" id="profile-image" src="http://domain.com/profile_images/Reese.jpg" valign="middle" width="73" /></a>
Reese
</h2>
</div>
*** target image
http://domain.com/profile_images/Reese.jpg
Output should look like:
domain.com/mahmud.html show image
domain.com/Jean.html show image
domain.com/Reese.html show image
Any good guidance will be highly appreciated. Thanks in advance.
Re: My little project. Looking for a little push
Posted: Mon Aug 24, 2009 4:05 pm
by requinix
Assignment? Like for school?
What have you been learning recently?
Re: My little project. Looking for a little push
Posted: Mon Aug 24, 2009 6:01 pm
by jackpf
I don't really understand...but you probably want to look into DOM.
Re: My little project. Looking for a little push
Posted: Tue Aug 25, 2009 2:05 am
by soul_fly
McInfo wrote:
You can use the Client URL Library (cURL) or the Filesystem functions to load the file into a string. Once the HTML is in a string, use the XML Manipulation or Regular Expressions functions to isolate the data you need. With XML objects, you can skip loading the file into a string and load it directly into the object.
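The loading step described here could be sketched as follows. This is only a minimal sketch: the helper names `read_url_list` and `fetch_html` and the `http://` prefixing are assumptions for illustration, not part of the original assignment. It uses `file()` and `file_get_contents()` from the Filesystem functions; fetching a remote URL this way assumes `allow_url_fopen` is enabled, and cURL would be the alternative if it is not.

```php
<?php
// Hypothetical helper: read one URL per line from a text file.
// Blank lines are skipped and a scheme is added when it is missing,
// because the list in the question has entries like "domain.com/Reese.html".
function read_url_list(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return array();
    }
    $urls = array();
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        if (!preg_match('#^https?://#i', $line)) {
            $line = 'http://' . $line;
        }
        $urls[] = $line;
    }
    return $urls;
}

// Hypothetical helper: load a page into a string.
// Returns false on failure; needs allow_url_fopen for remote URLs.
function fetch_html(string $url)
{
    return @file_get_contents($url);
}
```

Each URL returned by `read_url_list()` can then be passed to `fetch_html()`, and the resulting string handed to whichever extraction step is used.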
How to pull the src value out of img tags:
<?php
header('Content-Type: text/plain');
$html = '<div class="profile-top">
<div class="enlisted">
<h2 class="thumb clearfix">
<a href="/account/profile_image/Reese?hreflang=en"><img alt="" border="0" height="73" id="profile-image" src="http://domain.com/profile_images/Reese.jpg" valign="middle" width="73" /></a>
Reese<img src="gobi.png">
</h2>
</div>';
$matches = array();
/*
* # = start of pattern
* <img = beginning of an image tag
* \s = any whitespace character
* ( = beginning of a subpattern
* ?: = makes subpattern non-capturing
* . = any character except newline
* * = zero or more quantifier
* ? = ungreedy
* ) = end of a subpattern
* src=" = beginning of image tag source attribute
* (.*?) = capturing subpattern, any character except newline, zero or more, ungreedy
* " = end of image tag source attribute
* (?:.*?) = non-capturing subpattern, any character except newline, zero or more, ungreedy
* > = end of image tag
* # = end of pattern
* i = caseless option
*/
preg_match_all('#<img\s(?:.*?)src="(.*?)"(?:.*?)>#i', $html, $matches);
print_r($matches);
print_r($matches[1]);
echo $matches[1][0];
?>
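For comparison, the XML Manipulation route mentioned in the quote might look like this. It is a sketch using PHP's DOM extension (`DOMDocument::loadHTML` and `getElementsByTagName`); the function name `extract_img_srcs` is made up for illustration, and the code assumes the DOM extension is enabled.

```php
<?php
// Hypothetical helper: return the src attribute of every <img> in an HTML string.
function extract_img_srcs(string $html): array
{
    if (trim($html) === '') {
        return array(); // loadHTML() rejects an empty string
    }

    // Collect parser warnings internally instead of printing them;
    // real-world HTML fragments are rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $srcs = array();
    foreach ($doc->getElementsByTagName('img') as $img) {
        // trim() drops stray whitespace or line breaks inside the attribute
        $srcs[] = trim($img->getAttribute('src'));
    }
    return $srcs;
}
```

Unlike the regular expression, this does not depend on the order of attributes inside the tag or on the kind of quotes used around the value.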
Good morning. Actually, a great morning. Heartfelt thanks to McInfo for the wonderful effort. Let me try it; I will post again if I run into any problems.

Re: My little project. Looking for a little push
Posted: Wed Aug 26, 2009 7:05 am
by soul_fly
Thanks McInfo. Yes, it works like a charm:
preg_match_all('#<img\s(?:.*?)src="(.*?)"(?:.*?)>#i', $str, $matches);
But what if the HTML is as follows and my only target URL is
http://domain.com/profile_images/Reese.jpg
I just need help with the preg_match_all part:
<div class="profile-down">
<div class="blahblah">
<h2 class="thumb clearfix">
<a href="/account/profile_image/anyone?hreflang=en"><img alt="" border="0" height="73" id="profile-image" src="http://domain.com/profile_images/anyone.jpg" valign="middle" width="73" /></a>
anyone
</h2>
</div>
<div class="profile-top">
<div class="enlisted">
<h2 class="thumb clearfix">
<a href="/account/profile_image/Reese?hreflang=en"><img alt="" border="0" height="73" id="profile-image" src="http://domain.com/profile_images/Reese.jpg" valign="middle" width="73" /></a>
Reese
</h2>
</div>
I tried: preg_match_all('#id="profile-image"\ssrc="(.*?)"(?:.*?)>#i', $html, $matches);
but it outputs:
id="profile-image" src="http://domain.com/profile_images/Reese.jpg" valign="middle" width="73" />
I only want the .jpg URL.
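One possible way to approach this, offered as a sketch rather than a definitive answer: after a `preg_match_all` or `preg_match` call, index 0 of the result holds the full matched text, while index 1 holds only what the first capturing group matched, which here is the URL. To pick the image inside the `profile-top` block only, the pattern can start at that div's opening tag. The function name `profile_top_image` is hypothetical, and the pattern assumes the attribute is written exactly as `class="profile-top"`. The `s` modifier lets `.` match across line breaks.

```php
<?php
// Hypothetical helper: src of the first <img> that follows the opening
// <div class="profile-top"> tag, or null when nothing matches.
function profile_top_image(string $html)
{
    // <div\s+class="profile-top">  anchor on the wanted block
    // .*?                          skip lazily to the first <img after it
    // [^>]*?src="                  stay inside that tag until its src attribute
    // \s*([^"]*?)\s*"              capture the value without surrounding whitespace
    $pattern = '#<div\s+class="profile-top">.*?<img\s[^>]*?src="\s*([^"]*?)\s*"#is';

    if (preg_match($pattern, $html, $m)) {
        return $m[1]; // group 1 = URL only; $m[0] would be the whole matched text
    }
    return null;
}
```

Because the match is anchored on the surrounding div rather than on `id="profile-image"`, the earlier image in the `profile-down` block is skipped even though it carries the same id.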

Re: My little project. Looking for a little push
Posted: Thu Aug 27, 2009 12:26 am
by soul_fly
Thanks McInfo.
echo $matches[0][1]; worked to get the desired result. Once again thank you.