Problem with automatic scraping when links contain /.../


Mehnaz
Forum Newbie
Posts: 20
Joined: Mon Jun 02, 2008 7:49 pm

Problem with automatic scraping when links contain /.../

Post by Mehnaz »

Hi,

I am scraping the top ten links from a search engine's results to grab their contents.

I get this error:

" Warning: file_get_contents(http://www.nlm.nih.gov/.../druginfo/nat ... odine.html) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 404 File not found in C:\wamp\www\websearch\searchdata.php on line 41"


It happens whenever a URL containing /.../ comes up (for example, in this case, http://www.nlm.nih.gov/.../druginfo/nat ... odine.html).

I am using file_get_contents() with preg_match_all() to get the titles, on WAMP 2.0 with PHP 5.2.6.

Is there any solution that would help with these types of URLs?

Thanks in advance

Mehnaz
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: Problem with automatic scraping when links contain /.../

Post by requinix »

Take a look at your post. See how another ... showed up in that link? That's because the forum took your long URL and shortened it for display by cutting out a part.

I bet the search tool you're scraping did the same thing. What does this mean?

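You're screwed. The /.../ is only an abbreviation for display, not part of the real path, so the original URL can't be rebuilt from it and the server answers 404.

Find another way to do what you want. Perhaps the full link shows up somewhere else in the result? (Spoiler: it does.)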
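
For what it's worth, here is a minimal sketch of one way to do that. It assumes the full URL sits in the href attribute of each result's anchor tag, which is common but not confirmed for the search engine in question. The function name extract_result_links() and the sample markup are made up for illustration; only DOMDocument, preg_match() and strpos() are real PHP.

```php
<?php
// Hypothetical helper: collect the real link targets (href attributes)
// from a results page instead of the shortened URL text shown to the user.
function extract_result_links($html)
{
    $doc = new DOMDocument();
    // Suppress the warnings that sloppy real-world markup tends to trigger.
    @$doc->loadHTML($html);

    $links = array();
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        // Keep absolute http(s) links only, and skip any that are
        // themselves abbreviated with /.../
        if (preg_match('#^https?://#i', $href) && strpos($href, '/.../') === false) {
            $links[] = $href;
        }
    }
    return $links;
}

// Made-up sample markup: the visible text is shortened, the href is not.
$sample = '<p><a href="http://example.com/a/b/c/page.html">'
        . 'http://example.com/.../page.html</a></p>';
$links = extract_result_links($sample);
print_r($links);
```

Each entry in $links can then go to file_get_contents() as before. If the href turns out to be a redirect or tracking link rather than the target itself, the real address will have to be dug out of that instead, and that depends entirely on the site being scraped.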