saving sites with question marks and other symbols
Posted: Mon Aug 29, 2005 11:53 pm
by jaymoore_299
I am trying to write some simple code to display a page within another page. However, I am having trouble with sites whose URLs contain special symbols like ? and =. Here is the code I use.
Code:
<?php
ob_start();
include('http://www.somesite.com/index.php?id=1');
$inc = ob_get_contents();
ob_end_clean();
print $inc;
?>
When I do it with this site, all the images come out with red x's on them and their source is incorrect.
Posted: Tue Aug 30, 2005 12:20 am
by jaymoore_299
I have a different problem in addition to the above. With the first problem, if the URL had any characters like = or ?, its output couldn't be saved. But even when it can be saved, some of the pages have images with the wrong source, so they are not displayed.
The problem is that in the page source, the images have relative URLs like this:
<IMG SRC="/images/
The page with relative URLs has to be processed before it is displayed. Is there any way to get a processed version of the page with only absolute URLs in it? Or is there a PHP script that checks for relative URLs and changes them to absolute ones?
I have no control over the pages I want to include, so I can't go in and change the URLs to absolute ones myself.
Posted: Tue Aug 30, 2005 12:25 am
by feyd
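Images are transmitted in separate requests. The browser must resolve their URLs in some fashion, and it resolves relative ones against the page it is actually viewing, which is your script rather than the original site. You'll need to inject HTML into the page, or alter the HTML the page outputs, to get the images to load.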
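One way to do the injection, as a rough sketch only: add a <base href> tag to the fetched HTML so the browser resolves relative URLs against the original site. The function name inject_base_href() is made up for this example, and it assumes you already have the page's HTML in a string.

```php
<?php
// Hypothetical helper (not a built-in): insert a <base> tag right after
// the first opening <head> tag so that relative URLs such as "/images/..."
// are resolved against $baseUrl by the browser.
function inject_base_href($html, $baseUrl)
{
    $tag = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES) . '">';

    // Match the first <head ...> tag, case-insensitively (<header> is not matched).
    $result = preg_replace_callback(
        '/<head\b[^>]*>/i',
        function ($m) use ($tag) {
            return $m[0] . $tag;
        },
        $html,
        1,
        $count
    );

    // No <head> tag found: fall back to prepending the <base> tag.
    return $count > 0 ? $result : $tag . $html;
}
```

Keep in mind that a <base> tag changes how every relative link on the resulting page is resolved, so this probably only suits the case where the fetched page is the whole output rather than a fragment inside your own page.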
Using include() is extremely dangerous, as any PHP in the output will be executed by your server.
file_get_contents() is often preferred; however, cURL can be used to send full user-agent headers, among other things, and even to make POST submissions.
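A minimal sketch of the file_get_contents() route, in case it helps. This uses a stream context to set the user agent rather than cURL; the function name fetch_page() and the user-agent string are invented for the example, and remote URLs need allow_url_fopen enabled.

```php
<?php
// Hypothetical helper (not a built-in): fetch a page as a plain string.
// Unlike include(), nothing in the response is executed as PHP.
function fetch_page($url, $userAgent = 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)')
{
    // HTTP options used when $url is an http:// address.
    $context = stream_context_create(array(
        'http' => array(
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds
        ),
    ));

    $html = file_get_contents($url, false, $context);

    // file_get_contents() returns false on failure; report that as null.
    return $html === false ? null : $html;
}

// Example call (commented out so the sketch makes no network request):
// print fetch_page('http://www.somesite.com/index.php?id=1');
```

Because the URL is passed as an ordinary string, the ? and = characters in the query string need no special treatment here.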