[SOLVED] Fetching content and replacing links

Nailhead
Forum Newbie
Posts: 3
Joined: Tue Feb 22, 2005 9:29 pm

Post by Nailhead »

I am fetching content from a website so I can parse it and display its contents. When I display the contents, all of the images and links are broken.

Here's a bit of code that reads the contents of a website into $buffer, runs a search and replace on each chunk with ParseContents(), and appends the result to $content:

Code: Select all

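function GrabSite($url) {
    // Open the remote page; fopen() returns false on failure.
    $fd = fopen($url, "r");
    if ($fd === false) {
        return;
    }
    $content = ''; // initialize before appending with .=
    while (!feof($fd)) {
        // Note: a tag can straddle two 4096-byte chunks.
        $buffer = fread($fd, 4096);
        $content .= ParseContents($buffer);
    }
    fclose($fd);
    echo $content;
}

Because the fetched page is stored in a variable and served from my domain, the relative src="..." and href="..." links now point to the wrong place.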

So I need a regular expression that finds these attributes and replaces their values with correct links. I need to keep in mind that the values will sometimes be enclosed in ' or " or nothing at all.

I've been experimenting for a week with no luck.
shiznatix
DevNet Master
Posts: 2745
Joined: Tue Dec 28, 2004 5:57 pm
Location: Tallinn, Estonia

Post by shiznatix »

I don't know exactly how to do it, but it sounds like you should use preg_replace() to replace http://yoursite.com with http://theirsite.com
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

untested

Code: Select all

preg_match_all('#<[a-z]+(\s+[a-z]+\s*=\s*(["\'])?(.*?)\\2)*\s+(src|href)\s*=\s*(["\'])?(.*?)\\5.*?>#is', $text, $matches);

var_export($matches);

I believe I've posted something similar before... somewhere.. :?
Nailhead
Forum Newbie
Posts: 3
Joined: Tue Feb 22, 2005 9:29 pm

Post by Nailhead »

I may be doing this wrong but to view the array I'm using this:

Code: Select all

preg_match_all('#<[a-z]+(\s+[a-z]+\s*=\s*(["\'])?(.*?)\\2)*\s+(src|href)\s*=\s*(["\'])?(.*?)\\5.*?>#is', $buffer, $matches);

foreach ($matches[0] as $link) {
    echo $link;
}

This results in displaying the original content with no changes. Do I need to do a preg_replace() next? Where would I write that?
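
For context, preg_match_all() only collects matches into $matches and never modifies the string it searches, which is why echoing $matches[0] prints the tags exactly as they were. A small sketch of that behaviour, using a simplified stand-in pattern (double-quoted values only) and sample HTML that are assumptions for illustration, not the pattern from the thread:

```php
<?php
// Sample input, made up for illustration.
$buffer = '<a href="page.html">Link</a> <img src="pic.png">';

// Simplified stand-in pattern: handles double-quoted values only.
$pattern = '#(src|href)="([^"]*)"#i';

$before = $buffer;
$count  = preg_match_all($pattern, $buffer, $matches);

// $matches[0] holds the full matches, $matches[1] the attribute
// names, and $matches[2] the attribute values.
var_export($matches[2]);

// The subject string is left untouched; only a replace call such as
// preg_replace() returns modified text.
var_dump($buffer === $before);
```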
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

The example was just to show a regex that finds the information. You need a replace call, yes. You probably don't need preg_match_all() at all; a properly written replacement pattern should take care of it.
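
A minimal sketch of what such a replace call could look like, using preg_replace_callback() rather than the pattern posted earlier in the thread. The function name absolutizeLinks() and its $baseUrl parameter are hypothetical, and the sketch assumes every relative link should resolve against the root of the remote site. It handles values wrapped in ", in ', or in nothing at all, and leaves absolute URLs, protocol-relative URLs, and #fragment links alone:

```php
<?php
// Hypothetical helper: prefix relative src/href values with $baseUrl.
// Assumption: $baseUrl is the root of the remote site.
function absolutizeLinks(string $html, string $baseUrl): string
{
    $base = rtrim($baseUrl, '/');

    // Three alternatives for the value: "double-quoted",
    // 'single-quoted', or unquoted (runs until whitespace or >).
    $pattern = '#\b(src|href)\s*=\s*(?:"([^"]*)"|\'([^\']*)\'|([^\s>"\']+))#i';

    return preg_replace_callback($pattern, function (array $m) use ($base) {
        // At most one of groups 2-4 is non-empty; unmatched trailing
        // groups are missing from $m, hence the ?? fallbacks.
        $url = ($m[2] ?? '') . ($m[3] ?? '') . ($m[4] ?? '');

        // Leave empty values, scheme URLs (http:, mailto:, ...),
        // protocol-relative URLs and fragment links unchanged.
        if ($url === '' || preg_match('~^(?:[a-z][a-z0-9+.-]*:|//|#)~i', $url)) {
            return $m[0];
        }

        // Rebuild the attribute with an absolute URL in double quotes.
        return $m[1] . '="' . $base . '/' . ltrim($url, '/') . '"';
    }, $html);
}

// Usage: run the replacement over the whole page, not per chunk.
echo absolutizeLinks('<img src=logo.gif> <a href=\'/about.html\'>About</a>', 'http://theirsite.com');
```

This sketch does not resolve ../ segments or paths relative to a subdirectory, it rewrites every value into double quotes, and it will also touch src= or href= text that appears outside a tag. Applying it to the complete page rather than to each 4096-byte chunk avoids missing an attribute that is split across two reads.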
Nailhead
Forum Newbie
Posts: 3
Joined: Tue Feb 22, 2005 9:29 pm

Post by Nailhead »

Thanks a lot for the help! This has me going in the right direction.