Finding the URL of a redirected page

Posted: Sun Apr 16, 2006 10:38 am
by Longlands
I'm writing a program that will take a series of URLs and, one at a time, examine the source code of the target pages.

So far so good. It all works as expected, except for when one of the URLs is a redirect page. In those cases my script sees the source code of the page that is doing the redirecting, but not the source code of the final destination.

Can anyone suggest a way that I can get my PHP to 'see' the actual destination page, or at least report what its URL would be?

As the 'real' page will render client side, I suspect that JavaScript may be the answer, but can't get my brain round a method to do what I need.

Thanks - I hope! :?

Martin

Posted: Sun Apr 16, 2006 10:42 am
by feyd
That would depend on how the redirection is done and how you are requesting the pages.

Posted: Sun Apr 16, 2006 10:51 am
by Longlands
My goodness, that was fast! Thanks!

The URLs that are causing me problems are Clickbank hoplinks. They are like http://affid.vendid.hop.clickbank.net/

With normal URLs I am just using file_get_contents(url) - which of course can't see the redirected destination.

I had thought about loading the redirected URLs into an iframe, but don't know how to then view the source HTML of an iframe!

Martin

Posted: Sun Apr 16, 2006 11:15 am
by feyd
file_get_contents() follows header redirections, so I would have to assume the page does a page-based redirection. Depending on how they wrote it, it may not be so simple to pull out the URL they are redirecting to. If the redirection is in a <meta> tag, it's quite simple. JavaScript will be harder to parse.
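
For the <meta> case, something along these lines might do as a starting point. It is only a rough sketch: the function name find_meta_refresh_url() is made up for illustration, it assumes the redirect is written as an ordinary <meta http-equiv="refresh" content="N; url=..."> tag, and it uses regular expressions rather than a real HTML parser, so unusual markup may slip past it. Whether Clickbank hoplinks actually redirect this way is not something confirmed here.

```php
<?php
/**
 * Hypothetical helper: return the target URL of a
 * <meta http-equiv="refresh"> tag, or null if none is found.
 * Needs PHP 7.1+ for the nullable return type.
 */
function find_meta_refresh_url(string $html): ?string
{
    // Collect every <meta ...> tag in the page source.
    if (!preg_match_all('/<meta\b[^>]*>/i', $html, $tags)) {
        return null;
    }

    foreach ($tags[0] as $tag) {
        // Skip meta tags that are not http-equiv="refresh".
        if (!preg_match('/http-equiv\s*=\s*["\']?refresh\b/i', $tag)) {
            continue;
        }

        // Pull out the quoted content attribute, e.g. "0; url=http://...".
        if (!preg_match('/content\s*=\s*(["\'])(.*?)\1/is', $tag, $m)) {
            continue;
        }

        // Everything after "url=" is the redirect target.
        if (preg_match('/url\s*=\s*(.+)$/is', $m[2], $u)) {
            // Strip stray quotes/whitespace and decode entities like &amp;.
            return html_entity_decode(trim($u[1], " \t\n\r\"'"));
        }
    }

    return null;
}
```

You would feed it the string you already get back from file_get_contents(), and if it returns a URL, fetch that one instead. A relative target would still need resolving against the original URL, and a JavaScript redirect will simply come back as null.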