Fetching content
Posted: Fri Sep 17, 2004 8:38 pm
by jabbaonthedais
I need to fetch some content from another URL. I don't want their whole site, just small portions, and I want to link those portions back to their site. Or maybe grab the links from their site and link to the same places from mine. They have already consented to this; I just need the code to do it.
Posted: Fri Sep 17, 2004 9:16 pm
by feyd
[php_man]file_get_contents[/php_man] can retrieve the page.. [php_man]preg_match[/php_man] and [php_man]preg_match_all[/php_man] can be used to extract the information.
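For example, a minimal sketch of the two steps together. The inline sample HTML stands in for the fetched page (the URL in the comment is a placeholder), and the pattern assumes double-quoted hrefs with no nested tags inside the anchor text:

```php
<?php
// Normally you'd fetch the remote page (placeholder URL):
//   $html = file_get_contents('http://www.example.com/galleries.html');
// A small inline sample so the extraction step can be seen end to end:
$html = '<p><a href="http://example.com/g1">Gallery 01</a> '
      . '<a href="http://example.com/g2">Gallery 02</a></p>';

// Capture the href and the link text of every anchor.
preg_match_all('/<a\s[^>]*href="([^"]+)"[^>]*>(.*?)<\/a>/is', $html, $m);

$urls  = $m[1]; // array of hrefs, in page order
$descs = $m[2]; // array of link texts, same indexes
print $urls[0] . ' => ' . $descs[0]; // http://example.com/g1 => Gallery 01
?>
```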
Posted: Fri Sep 17, 2004 10:41 pm
by jabbaonthedais
I've been trying tons of functions, but can't get any to work for my particular need. I'm trying to grab links from a site. They are not separated by lines. I just need the URL and description for each link saved in an array, so I can call the variables like this:
Code:
<a href="<?php print $url1; ?>" style="text-decoration:none" onMouseOver="window.status='<?php print $desc1; ?>';return true" onMouseOut="window.status=' '">Gallery 01</a> - <a href="<?php print $url6; ?>" style="text-decoration:none" onMouseOver="window.status='<?php print $desc6; ?>';return true" onMouseOut="window.status=' '">Gallery 06</a><br>
<a href="<?php print $url2; ?>" style="text-decoration:none" onMouseOver="window.status='<?php print $desc2; ?>';return true" onMouseOut="window.status=' '">Gallery 02</a> - <a href="<?php print $url7; ?>" style="text-decoration:none" onMouseOver="window.status='<?php print $desc7; ?>';return true" onMouseOut="window.status=' '">Gallery 07</a><br>
And all the way down to 5 and 10.
So basically I just need the first 10 links in a section of the page.
Posted: Fri Sep 17, 2004 11:09 pm
by feyd
the manual pages I linked can do this..
Posted: Sat Sep 18, 2004 5:28 am
by m3mn0n
What I do is [php_man]file_get_contents[/php_man]() the HTML, and then [php_man]explode[/php_man]() the part I want out of the HTML and then parse it to remove tags, and such.
It's worked wonders for a ton of sites. Even if they change much of the layout and have dynamically generated content, it will still work. The same goes for regex matching, as feyd mentioned.
Posted: Sat Sep 18, 2004 8:29 am
by jabbaonthedais
Someone on php.net suggested using preg_split rather than explode to split a string containing multiple separators. The links I'm trying to get are in tables with <table>, <td>, <tr>, etc. that I don't need. But I don't think I need a separator at all, because of all the extra markup. I need the function to just extract the links from the page, including the text of each link, and number each one so I can call the first 10 on my page. Is this still the right direction?
Posted: Sat Sep 18, 2004 9:50 am
by feyd
the regular expression (preg_*) functions are still the direction to go.
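Putting it together, a sketch of what the OP described: grab every link, then expose the first ten as $url1/$desc1 through $url10/$desc10 so the template above can print them. The sample HTML is a stand-in for the fetched page, and the pattern makes the same assumptions as before (double-quoted hrefs):

```php
<?php
// Stand-in for the fetched page; in practice:
//   $html = file_get_contents('http://www.example.com/galleries.html');
$html = '<table><tr><td><a href="/g1">First</a></td>'
      . '<td><a href="/g2">Second</a></td></tr></table>';

preg_match_all('/<a\s[^>]*href="([^"]+)"[^>]*>(.*?)<\/a>/is', $html, $m);

// Turn the first ten matches into $url1/$desc1 ... $url10/$desc10
// using variable variables, so the template can print them directly.
$count = min(10, count($m[1]));
for ($i = 0; $i < $count; $i++) {
    ${'url' . ($i + 1)}  = $m[1][$i];
    ${'desc' . ($i + 1)} = strip_tags($m[2][$i]);
}
print $url1 . ' / ' . $desc2; // /g1 / Second
?>
```

An associative array ($links[1]['url'] etc.) would be cleaner than numbered variables, but the numbered form matches the template posted earlier in the thread.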