RSS caching

Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Should RSS be cached? My feed-generation software has been generating .rss files, and I recently realized that Apache was serving them as text/plain (naughty Apache!). After changing their content-type and giving them a default character set, I can't help but wonder whether Apache's aggressive caching mechanism is inappropriate for RSS feeds. Is it?
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

I believe they should, but the issues are different than for web pages. I recently read a couple of blogs with interesting and detailed information about caching RSS. You might want to Google for them.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

I think they should as well. Not only can it save you some processing cycles, but it can also speed up the response time for people reading it. The content may just be delayed a tiny bit, depending on how you do the caching.

I recently thought about using a cron job to generate a static file once per minute (or maybe every 30 seconds), using a data cache stored in a database, for example.
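
A minimal sketch of the cron-generated static file idea, assuming a Python script run on a schedule. The names `build_feed` and `write_feed_atomically`, the channel fields, and the list of items are all hypothetical placeholders; the items would come from the database cache mentioned above. The point it illustrates is writing to a temporary file and renaming it into place, so the web server does not serve a half-written feed.

```python
import os
import tempfile
from xml.sax.saxutils import escape


def build_feed(items):
    """Render a bare-bones RSS 2.0 document from (title, link) tuples."""
    entries = "".join(
        "<item><title>{}</title><link>{}</link></item>".format(
            escape(title), escape(link)
        )
        for title, link in items
    )
    return (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<rss version="2.0"><channel>'
        "<title>Example feed</title>"
        "<link>http://example.com/</link>"
        "<description>Placeholder channel</description>"
        + entries
        + "</channel></rss>"
    )


def write_feed_atomically(xml_text, target_path):
    """Write the feed to a temp file, then rename it over the live file."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Same directory as the target, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(xml_text)
        # mkstemp creates the file readable by the owner only; open it up
        # so the web server user can read it (assumption about the setup).
        os.chmod(tmp_path, 0o644)
        # Replace the live file in one step.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A cron entry would then call something like `write_feed_atomically(build_feed(items), "news.rss")` once per minute. Because the file's modification time changes on every rewrite, the Last-Modified header the server derives from it changes as well.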

Post by Ambush Commander »

Sorry about resurrecting the thread. I thought that I had resolved the issue, but there seems to be a very definite problem.

When Firefox requests the XML feed from the server, the response comes back like this (with default settings):

Status=OK - 200
Date=Mon, 02 Apr 2007 04:11:28 GMT
Server=Apache/2.0.54 (Unix) PHP/4.4.4 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.3.2
Last-Modified=Mon, 02 Apr 2007 03:52:59 GMT
Etag="14914a1-1479-28377cc0"
Accept-Ranges=bytes
Content-Length=5241
Keep-Alive=timeout=15, max=100
Connection=Keep-Alive
Content-Type=application/rss+xml; charset=utf-8

Something about these headers, however, causes Firefox to serve every successive request for the feed from its cache. When I update the feed and then try to load it with my RSS reader, Sage, the changes don't show up, even when I manually click the feed. I end up having to open view-source:http://hp.jpsband.org/news.rss and press Ctrl+F5 to get the updated content to show up.

This is, to say the least, completely unacceptable. Does anyone know why this is happening?
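
One plausible explanation, which nobody in the thread confirms: the response above carries Last-Modified but no Expires or Cache-Control header, and HTTP/1.1 (RFC 2616, section 13.2.4) allows a cache to estimate a freshness lifetime in that case, suggesting a fraction such as 10% of the time elapsed since Last-Modified. Whether Firefox or Sage applies exactly this rule is an assumption here. The sketch below only works out what that heuristic would give for the two dates shown; the function name is made up for illustration.

```python
from email.utils import parsedate_to_datetime


def heuristic_freshness_seconds(date_header, last_modified_header, fraction=0.1):
    """Estimate a freshness lifetime as a fraction of (Date - Last-Modified)."""
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    elapsed = (date - last_modified).total_seconds()
    # A Last-Modified in the future would give a negative value; clamp to zero.
    return max(0.0, elapsed * fraction)


lifetime = heuristic_freshness_seconds(
    "Mon, 02 Apr 2007 04:11:28 GMT",   # Date header from the response above
    "Mon, 02 Apr 2007 03:52:59 GMT",   # Last-Modified header from the response above
)
print(round(lifetime))  # 1109 seconds elapsed * 0.1, rounded -> 111
```

Under that assumption, a cache could reuse this copy for a couple of minutes without contacting the server, and for much longer when the file has not changed in days.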

Post by feyd »

Vary the ETag, provide an Expires header, or pass cache-control headers?

Post by Ambush Commander »

Apache should be recalculating the ETag whenever the file changes, so that's really not in my control. Do I just bite the bullet and pass a no-cache header? Maybe must-revalidate would work... I really don't want to use Expires, because that means stale feeds.

Post by feyd »

Ambush Commander wrote: really don't want to use Expires, because that means stale feeds.

But that's exactly what you want to do. After some period of time the feed becomes stale, and the agent should therefore request a new copy.

Post by Ambush Commander »

I wasn't clear about what I wanted, sorry.

Since RSS feeds are volatile data, I'd like the user-agent to always check with the server whether or not a new version is available. If the server replies 304, fine, don't download it, but I want it to check every time.
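
The exchange described here is HTTP's conditional GET. As a rough sketch of the server side, assuming the client sends back the ETag it last saw in an If-None-Match header: the function name and the tuple it returns are invented for illustration, and weak validators (`W/"..."`) are deliberately ignored. In a response, `Cache-Control: no-cache` means the copy may be stored but must be revalidated with the server before each reuse, which matches the "check every time" behaviour asked for.

```python
def respond_to_conditional_get(request_headers, current_etag, body):
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    headers = {
        "ETag": current_etag,
        # Allow storing the copy, but require revalidation before reuse.
        "Cache-Control": "no-cache",
    }
    # If-None-Match may list several comma-separated entity tags.
    client_etags = [
        tag.strip()
        for tag in request_headers.get("If-None-Match", "").split(",")
    ]
    if current_etag in client_etags or "*" in client_etags:
        # The client's copy is still current: send headers only, no body.
        return 304, headers, b""
    # No validator sent, or the feed has changed: send the full feed.
    return 200, headers, body
```

With the ETag from the response quoted earlier, a request carrying `If-None-Match: "14914a1-1479-28377cc0"` would get a 304 until the file changes and the ETag is recalculated.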

Post by feyd »

must-revalidate sounds like a decent stopgap then. No-cache may be required for some browsers, however.

Post by Ambush Commander »

Uh oh. Does that mean Apache has to detect when those other browsers come around? Remember, these headers are being set by the webserver, not a PHP script.

Post by feyd »

For most browsers, it doesn't hurt to send the no-cache header at the same time.
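
To illustrate sending both directives to every client without any browser detection, here is a small sketch using Python's built-in `http.server` as a stand-in for the real web server. `FeedHandler`, `extra_headers_for`, and the `.rss` suffix check are assumptions for the example; the `Pragma` header is included only as a fallback for old HTTP/1.0 clients.

```python
from http.server import SimpleHTTPRequestHandler

# Sent together to every client, so no user-agent sniffing is needed.
REVALIDATION_HEADERS = [
    ("Cache-Control", "no-cache, must-revalidate"),
    ("Pragma", "no-cache"),
]


def extra_headers_for(path):
    """Return the extra response headers for a request path."""
    bare_path = path.split("?", 1)[0]  # ignore any query string
    if bare_path.endswith(".rss"):
        return list(REVALIDATION_HEADERS)
    return []


class FeedHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds the headers above to .rss responses."""

    def end_headers(self):
        # self.path can be missing if the request line failed to parse.
        for name, value in extra_headers_for(getattr(self, "path", "")):
            self.send_header(name, value)
        super().end_headers()


# To try it locally (not run here):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8000), FeedHandler).serve_forever()
```

In Apache itself, the equivalent would normally be configured through mod_headers' `Header` directive scoped to the feed files rather than in a script.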