Hi All,
I'm trying to implement a blog reader here in Spain (like bloglines.com). My problem is the process I use to update news from the different blog feeds (currently 1,400). The process runs as a cron job on the server.
The process takes more than twenty minutes to finish, and that time grows as the number of blogs grows.
The script does something like this:
- Go through the BLOGS table and, for each entry, fetch the feed XML from the original site.
- Parse each XML document to extract the news items inside.
- For each item, run a "SELECT" against MySQL to check whether the post is new or already stored.
- If it is a new post, run the "INSERT" into MySQL.
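The steps above can be sketched roughly as follows. This is only an illustration, not the actual script: it uses Python with an in-memory sqlite3 database standing in for MySQL so the sketch is runnable, and the table/column names (`news`, `guid`, `title`) are made up for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Stand-in for the MySQL NEWS table; sqlite3 keeps the sketch runnable.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE news (guid TEXT PRIMARY KEY, title TEXT)")

def update_from_feed(db, feed_xml):
    """One pass of the cron job for a single blog: parse the feed,
    then SELECT per item and INSERT only the new posts."""
    inserted = 0
    for item in ET.fromstring(feed_xml).iter("item"):
        guid = item.findtext("guid")
        title = item.findtext("title")
        # Step 3: one SELECT per item to see if the post is already stored.
        row = db.execute("SELECT 1 FROM news WHERE guid = ?", (guid,)).fetchone()
        if row is None:
            # Step 4: INSERT only when the post is new.
            db.execute("INSERT INTO news (guid, title) VALUES (?, ?)",
                       (guid, title))
            inserted += 1
    db.commit()
    return inserted

feed = """<rss><channel>
  <item><guid>p1</guid><title>First post</title></item>
  <item><guid>p2</guid><title>Second post</title></item>
</channel></rss>"""
print(update_from_feed(db, feed))   # 2 new posts on the first run
print(update_from_feed(db, feed))   # 0 on the second run: nothing changed
```

Note that this does two round trips to the database per post, multiplied across every item of 1,400 feeds, which is part of why the job is slow.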
Can somebody explain how to do this faster or more efficiently?
- Should I write the INSERTs into a text file and run them all at the end of the script?
- Should I use threads? How?
- Would Perl or another technology be a better fit for this process?
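On the first question, one common alternative to "SELECT then INSERT" is to put a UNIQUE key on the post identifier and let the database reject duplicates in a single batched statement, which in MySQL would be `INSERT IGNORE`. A minimal sketch of that idea, again with sqlite3 as a runnable stand-in (sqlite spells it `INSERT OR IGNORE`) and with hypothetical table/column names:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# A PRIMARY KEY (or UNIQUE index) on guid lets the database itself
# reject duplicate posts, removing the per-item SELECT entirely.
db.execute("CREATE TABLE news (guid TEXT PRIMARY KEY, title TEXT)")

def store_batch(db, items):
    """Insert all parsed items from one feed in a single batched
    statement; already-stored posts are silently skipped."""
    db.executemany(
        "INSERT OR IGNORE INTO news (guid, title) VALUES (?, ?)", items)
    db.commit()

store_batch(db, [("p1", "First"), ("p2", "Second")])
store_batch(db, [("p1", "First"), ("p3", "Third")])  # p1 is a duplicate
print(db.execute("SELECT COUNT(*) FROM news").fetchone()[0])  # 3
```

This halves the query count per post and lets the database do the duplicate check with an index lookup instead of a round trip from the script.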
Thank you very much.
CRON JOBS: Synchronize a thousand feeds
I would first rewrite this in C or Perl to speed up the processing.
The process is highly dependent on internet speed and on the response times of the blog servers.
You could split the blog list across several CGI processes, since I suspect server load is not what causes the delay.
Memory management might also help: running MySQL statements over and over in the same script eventually uses up all the memory. You could try freeing the result memory after each blog.
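Since the job is dominated by waiting on remote blog servers, the same "split the work" idea can also be done with a pool of threads inside one script rather than several CGI processes. A minimal sketch, with a fake `fetch_feed` standing in for the real HTTP request (which would be a GET against each feed URL with a short timeout):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the real HTTP fetch; in practice this would download
# the feed XML from each blog, with a timeout so slow servers
# cannot stall the whole run.
def fetch_feed(url):
    return "<rss>feed for %s</rss>" % url

def fetch_all(urls, workers=20):
    """Fetch many feeds concurrently.  Because the time is mostly
    spent waiting on remote servers, overlapping the downloads cuts
    wall-clock time roughly by the number of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(urls, pool.map(fetch_feed, urls)))

urls = ["blog%d.example" % i for i in range(5)]
results = fetch_all(urls)
print(len(results))  # 5 -- one response per blog
```

Parsing and the database writes can then stay sequential; it is the downloads, not the CPU work, that dominate the twenty minutes.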