Reading RSS and Gzip compression

Coeus
Forum Newbie
Posts: 2
Joined: Sat May 22, 2010 12:25 am


Post by Coeus »

So what I'm after is very simple. I'm trying to retrieve this RSS page:
http://rss.binsearch.net/rss.php?max=50 ... multimedia

My problem is, when I use PHP to directly pull it, I get this error:
http://www.binsearch.info/ Due to excessive data traffic usage, using RSS clients that do not support (gzip) compression is no longer allowed! If you integrated any feeds into your own site and need assistence with implementing gzip support, contact us!<p>Regards,<br>Binsearch team Sat, 01 Oct 2005 07:02:27 GMT 86400

I've done plenty of digging on gzip and now understand that web browsers have support for it built in: if a server compresses a page, the browser retrieves it, decompresses it and displays it as normal.


Any ideas on how I pull the data correctly with PHP?

Right now, all I'm doing is this:

<?php
$results = file_get_contents("http://rss.binsearch.net/rss.php?max=50 ... multimedia");
echo $results;
?>

Thank you!
mecha_godzilla
Forum Contributor
Posts: 375
Joined: Wed Apr 14, 2010 4:45 pm
Location: UK

Re: Reading RSS and Gzip compression

Post by mecha_godzilla »

The question you have to ask yourself is how does the site know that your RSS 'client' doesn't have gzip support.

A couple of suggestions:

1. You could take the Binsearch team at their word and ask for their help.

2. Have a look at something like this

http://magpierss.sourceforge.net/

which has gzip support built in.

Two phrases come to mind: "no need to reinvent the wheel" and "every time you write a line of code an angel cries" :)

HTH,

Mecha Godzilla
mecha_godzilla
Forum Contributor
Posts: 375
Joined: Wed Apr 14, 2010 4:45 pm
Location: UK

Re: Reading RSS and Gzip compression

Post by mecha_godzilla »

Just a quick update from my last message:

The results you're getting are exactly what happens if you telnet to the server and request that file without the gzip header. If you've got access to telnet on your system, try this instead (you type the telnet command and the four request lines from GET down to Accept; everything else is output):

$ telnet rss.binsearch.net 80
Trying 83.149.75.183...
Connected to rss.binsearch.net.
Escape character is '^]'.
GET /rss.php?max=50000&g=alt.binaries.multimedia HTTP/1.1
Host: binsearch.net
Accept-encoding: gzip
Accept: text/rss,application/rss+xml


(Press return after each line, then two at the end)

This should get you a nice lot of binary data. If you want to do this in your script, I think you'll need to use cURL to create the custom headers (though I have no experience with it so there may be other options available to you.)
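
If cURL isn't available, the same request headers can probably be sent with a stream context instead. This is only a sketch under a few assumptions: allow_url_fopen is enabled, the zlib extension is loaded (for gzdecode), and the function names are made up for illustration.

```php
<?php
// Hypothetical helper: stream-context options that send the same
// headers as the telnet session above.
function gzip_request_context_options()
{
    return [
        'http' => [
            'method' => 'GET',
            'header' => "Accept-Encoding: gzip\r\n" .
                        "Accept: text/rss,application/rss+xml\r\n",
        ],
    ];
}

// Hypothetical helper: decompress the body only if it looks like gzip data.
// gzip streams start with the two magic bytes 0x1f 0x8b.
function maybe_gunzip($body)
{
    if (substr($body, 0, 2) === "\x1f\x8b") {
        return gzdecode($body); // returns false if the data is corrupt
    }
    return $body;
}

// Hypothetical helper: fetch a URL with the headers above and return
// the decompressed body, or false if the request failed.
function fetch_gzip_feed($url)
{
    $context = stream_context_create(gzip_request_context_options());
    $body = file_get_contents($url, false, $context);
    return $body === false ? false : maybe_gunzip($body);
}
```

You would call fetch_gzip_feed() with the full feed URL and echo the result. Unlike cURL, the stream wrapper won't decompress anything for you, which is why the gzdecode step is there.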

M_G
Coeus
Forum Newbie
Posts: 2
Joined: Sat May 22, 2010 12:25 am

Re: Reading RSS and Gzip compression

Post by Coeus »

I thought I had replied to this already...

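Thanks for your comments, mecha_godzilla!

I looked into cURL, and it did exactly what I was hoping. This code worked perfectly:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://rss.binsearch.net/rss.php?max=50 ... multimedia");
// Send "Accept-Encoding: gzip" and let cURL decompress the response
curl_setopt($ch, CURLOPT_ENCODING, "gzip");
// Return the body from curl_exec() instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$results = curl_exec($ch);
curl_close($ch);
echo $results;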
John Cartwright
Site Admin
Posts: 11470
Joined: Tue Dec 23, 2003 2:10 am
Location: Toronto

Re: Reading RSS and Gzip compression

Post by John Cartwright »

Yes, cURL would be the preferred solution. However, you do not need to specify the encoding, as it will be decoded automatically.
Jonah Bron
DevNet Master
Posts: 2764
Joined: Thu Mar 15, 2007 6:28 pm
Location: Redding, California

Re: Reading RSS and Gzip compression

Post by Jonah Bron »

I think he's setting it that way so that the remote file doesn't start complaining.
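
For reference, as far as I know CURLOPT_ENCODING does two jobs: it sets the Accept-Encoding request header, and it tells cURL to decompress the response. A rough sketch of how that could be wrapped up follows; the function names are hypothetical and the curl extension is assumed to be loaded.

```php
<?php
// Hypothetical helper: the cURL options for a gzip-aware feed request.
function feed_curl_options($url)
{
    return [
        CURLOPT_URL            => $url,
        // Sends "Accept-Encoding: gzip" and has cURL decompress the reply.
        CURLOPT_ENCODING       => 'gzip',
        // Return the body from curl_exec() instead of printing it.
        CURLOPT_RETURNTRANSFER => true,
    ];
}

// Hypothetical helper: returns the decoded feed as a string,
// or false if the request failed.
function fetch_feed($url)
{
    $ch = curl_init();
    curl_setopt_array($ch, feed_curl_options($url));
    // The handle is released when $ch goes out of scope.
    return curl_exec($ch);
}
```

Without that option no Accept-Encoding header goes out at all, which would explain why the server answers with its complaint feed.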
mecha_godzilla
Forum Contributor
Posts: 375
Joined: Wed Apr 14, 2010 4:45 pm
Location: UK

Re: Reading RSS and Gzip compression

Post by mecha_godzilla »

Jonah Bron wrote: I think he's setting it that way so that the remote file doesn't start complaining.
Yes, that's right - they've set their server up so that if you request their RSS files and haven't included the "Accept-Encoding: gzip" line in the request headers, you get this nice error message instead:

HTTP/1.1 200 OK
X-Powered-By: PHP/5.2.9
Content-Type: text/xml; charset="utf-8"
Transfer-Encoding: chunked
Date: Wed, 26 May 2010 20:22:29 GMT
Server: lighttpd/1.4.26

41f
<?xml version="1.0" encoding="utf-8"?><rss version="2.0">
<channel>
<title>Your RSS reader does not support (gzip) compression</title>
<link>http://www.binsearch.info/</link>
<description>
Due to excessive data traffic usage, using RSS clients that do not support (gzip) compression is no longer allowed!
If you integrated any feeds into your own site and need assistence with implementing gzip support, contact us!<p>Regards,<br>Binsearch team
</description>
<lastBuildDate>Sat, 01 Oct 2005 07:02:27 GMT</lastBuildDate>
<ttl>86400</ttl>
<item>
<title>
Your RSS reader does not support (gzip) compression
</title>
<description>
Due to excessive data traffic usage, using RSS clients that do not support (gzip) compression is no longer allowed!
If you integrated any feeds into your own site and need assistence with implementing gzip support, contact us!<p>Regards,<br>Binsearch team
</description><pubDate>Sat, 01 Oct 2005 07:02:27 GMT</pubDate><link>http://www.binsearch.info/</link></item>
</channel>
</rss>

0


Actually, it's not an error message - it's an RSS feed - just a really unhelpful one that you don't want ;)
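
Since the server answers with 200 OK either way, a script can't spot the problem from the status code. One rough way to catch it is to look for the complaint text itself. This is just a sketch: the function name is made up, and the phrase is copied from the response above, so it will break if the site ever changes its wording.

```php
<?php
// Hypothetical helper: true if the body looks like the "no gzip support"
// placeholder feed quoted above rather than the real feed.
function is_gzip_refusal_feed($body)
{
    // Phrase taken from the <title> of the placeholder feed.
    $needle = 'does not support (gzip) compression';
    return stripos($body, $needle) !== false;
}
```

A plain string match is used rather than an XML parser because the placeholder's description contains unclosed <p> and <br> tags, so a strict parser may well reject it.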

M_G