RSS caching
Posted: Wed Mar 28, 2007 8:47 pm
by Ambush Commander
Should RSS be cached? My feed-generation software has been generating .rss files, and I recently realized that Apache was serving them as text/plain (naughty Apache!). After changing their content type and giving them a default character set, I can't help but wonder whether Apache's aggressive caching mechanism is inappropriate for RSS feeds. Is it?
Posted: Wed Mar 28, 2007 10:12 pm
by Christopher
I believe they should, but the issues are different than for web pages. I recently read a couple of blogs with interesting and detailed information about caching RSS. You might want to Google for them.
Posted: Thu Mar 29, 2007 8:55 am
by feyd
I think they should as well. Not only can it save you some processing cycles, but it can also quicken the response time for people reading it. It just happens to be delayed a tiny bit, maybe, depending on how you do the caching.
I recently thought about using a cron job to generate a static file once per minute (or maybe every 30 seconds), built from a data cache stored in a database, for example.
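For what it's worth, here is a minimal sketch of the "write a static file from cron" step, assuming the feed XML has already been built as a string. The function name `publish_feed` and the 0644 permissions are my own assumptions, not anything from the setup described above. The point is to write to a temporary file and then rename it, so the web server never serves a half-written feed.

```python
import os
import tempfile


def publish_feed(xml_text: str, target_path: str) -> None:
    """Write the feed to a temp file, then swap it into place in one step.

    publish_feed is a hypothetical helper name used for illustration.
    """
    directory = os.path.dirname(os.path.abspath(target_path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(xml_text)
        # mkstemp creates the file as owner-only; loosen it so a web
        # server running as another user can read it (assumed requirement).
        os.chmod(tmp_path, 0o644)
        # os.replace overwrites the old feed in a single rename.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A cron entry would then just run a script that builds the XML and calls `publish_feed(xml, "/path/to/news.rss")`, where the path is whatever your server actually serves.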
Posted: Sun Apr 01, 2007 11:18 pm
by Ambush Commander
Sorry about resurrecting the thread. I thought that I had resolved the issue, but there seems to be a very definite problem.
When Firefox requests the XML feed from the server, it gets returned as such (with default settings):
Status=OK - 200
Date=Mon, 02 Apr 2007 04:11:28 GMT
Server=Apache/2.0.54 (Unix) PHP/4.4.4 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.3.2
Last-Modified=Mon, 02 Apr 2007 03:52:59 GMT
Etag="14914a1-1479-28377cc0"
Accept-Ranges=bytes
Content-Length=5241
Keep-Alive=timeout=15, max=100
Connection=Keep-Alive
Content-Type=application/rss+xml; charset=utf-8
Something about these headers, however, causes Firefox to serve every subsequent request for the feed from its cache. When I update the feed and then try to load it with my RSS reader, Sage, the changes don't show up, even when I manually click the feed. I end up having to open view-source:http://hp.jpsband.org/news.rss and Ctrl+F5 it to get the updated content to show up.
This is, to say the least, completely unacceptable. Does anyone know why this is happening?
Posted: Mon Apr 02, 2007 12:45 am
by feyd
Vary the ETag, provide an Expires header, or pass cache-control headers?
Posted: Mon Apr 02, 2007 7:41 am
by Ambush Commander
Apache should be recalculating the ETag whenever the file changes, so that's really not in my control. Do I just bite the bullet and pass a no-cache header? Maybe must-revalidate would work... I really don't want to use Expires, because that means stale feeds.
Posted: Mon Apr 02, 2007 8:34 am
by feyd
Ambush Commander wrote: really don't want to use Expires, because that means stale feeds.
But that's exactly what you're wanting to do. After some period of time the feed becomes stale and the agent should therefore request a new copy.
Posted: Mon Apr 02, 2007 8:37 am
by Ambush Commander
I wasn't clear about what I wanted, sorry.
Since RSS feeds are volatile data, I'd like the user agent to always check with the server whether a new version is available. If the server replies 304, fine, don't download it again, but I want it to check every time.
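To make the "check every time, reply 304 if nothing changed" behaviour concrete, here is a rough sketch of the server-side decision for a conditional GET. Apache already does this for static files; the function name `revalidate` is hypothetical, and only the If-None-Match path is shown (If-Modified-Since is left out for brevity).

```python
def _opaque(tag: str) -> str:
    """Drop a weak-validator prefix so W/"x" and "x" compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def revalidate(request_headers: dict, current_etag: str) -> int:
    """Return 304 if the client's cached copy is still current, else 200.

    revalidate is an illustrative helper, not an Apache API.
    """
    raw = request_headers.get("If-None-Match", "")
    # The header may carry several comma-separated entity tags.
    tags = [t.strip() for t in raw.split(",") if t.strip()]
    if "*" in tags:
        return 304
    current = _opaque(current_etag)
    if any(_opaque(t) == current for t in tags):
        # Client copy matches: send headers only, no body.
        return 304
    # No validator sent, or the feed changed: send the full feed.
    return 200
```

With the ETag from the headers above, a request carrying `If-None-Match: "14914a1-1479-28377cc0"` gets a 304 while the file is unchanged, and a 200 with the full body once Apache computes a new ETag.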
Posted: Mon Apr 02, 2007 8:40 am
by feyd
must-revalidate sounds like a decent stopgap then. No-cache may be required for some browsers, however.
Posted: Mon Apr 02, 2007 8:41 am
by Ambush Commander
Uh oh. Does that mean Apache has to detect when those other browsers come around? Remember, these headers are being set by the webserver, not a PHP script.
Posted: Mon Apr 02, 2007 8:44 am
by feyd
It doesn't hurt to send the no-cache header at the same time for most browsers.
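As a rough illustration of sending both directives together, here is a sketch of the response headers the feed could carry. Note that the standard Cache-Control directive is spelled `must-revalidate`. The helper name `feed_cache_headers` is made up for this example, and in the setup described in this thread the headers would come from the web server configuration rather than from code like this.

```python
def feed_cache_headers(etag: str, last_modified: str) -> dict:
    """Build response headers that force a check on every request.

    feed_cache_headers is a hypothetical helper for illustration only.
    """
    return {
        # no-cache: a stored copy may be kept, but must be revalidated
        # with the server before each reuse.
        # must-revalidate: a stale copy must never be served unchecked.
        "Cache-Control": "no-cache, must-revalidate",
        # Keep the validators so the server can still answer 304.
        "ETag": etag,
        "Last-Modified": last_modified,
    }
```

No Expires header is set here, so the client is never told it may reuse the feed for some period without asking; it can still keep its copy and get a cheap 304 when nothing has changed.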