CRON JOBS: Synchronizing a thousand feeds
Posted: Wed Nov 30, 2005 5:18 am
Hi All,
I'm trying to implement a blog reader here in Spain (like bloglines.com). My problem is the process I use to update the news from the different blog feeds (currently 1400). This process runs as a cron job on the server.
The trouble is that the process takes more than twenty minutes to finish, and the run time grows as the number of blogs grows.
The script does something like this:
- Go through the BLOGS table and, for each row, fetch the feed XML from the original site.
- Parse each XML document to extract the news items inside.
- For each news item, run a "SELECT" against MySQL to check whether it is a new post or one already stored.
- If it is a new post, run the "INSERT" into MySQL.
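The steps above could be sketched roughly as follows. This is only an illustration, not the actual script: it uses Python with the standard-library sqlite3 module standing in for MySQL, the posts table and its columns are hypothetical, and the feed XML is passed in as a string instead of being downloaded.

```python
import sqlite3
import xml.etree.ElementTree as ET


def parse_items(feed_xml):
    """Return (link, title) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("link"), item.findtext("title"))
            for item in root.iter("item")]


def sync_feed(conn, blog_id, feed_xml):
    """One SELECT per item, plus one INSERT per unseen item. Returns the number of new posts."""
    new_count = 0
    for link, title in parse_items(feed_xml):
        seen = conn.execute(
            "SELECT 1 FROM posts WHERE blog_id = ? AND link = ?",
            (blog_id, link),
        ).fetchone()
        if seen is None:
            conn.execute(
                "INSERT INTO posts (blog_id, link, title) VALUES (?, ?, ?)",
                (blog_id, link, title),
            )
            new_count += 1
    conn.commit()
    return new_count


# Hypothetical schema and sample feed, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (blog_id INTEGER, link TEXT, title TEXT)")

SAMPLE = """<rss version="2.0"><channel><title>Demo</title>
<item><title>First</title><link>http://example.com/1</link></item>
<item><title>Second</title><link>http://example.com/2</link></item>
</channel></rss>"""

first_run = sync_feed(conn, 1, SAMPLE)   # both items are new
second_run = sync_feed(conn, 1, SAMPLE)  # nothing new the second time
```

With 1400 feeds, this shape means every feed is downloaded one after another and every item costs at least one round trip to the database.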
My question is whether somebody can explain to me how to do this faster or more efficiently.
Should I write the INSERTs into a text file and run them all at the end of the script?
Should I use threads? How?
Would Perl or another technology be better for implementing this process?
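To make two of these ideas concrete, here is a minimal sketch, assuming Python, of fetching feeds with a thread pool and writing the posts in one batch. Everything named here is hypothetical: the posts table, the fetch_all and store_batch helpers, and the worker count. sqlite3 again stands in for MySQL, where the equivalent of INSERT OR IGNORE would be INSERT IGNORE against a UNIQUE index.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def download(url, timeout=10):
    """Fetch one feed over HTTP; errors propagate to the caller."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_all(urls, fetch=download, workers=20):
    """Fetch feeds concurrently; return {url: body} for those that succeeded."""
    def safe_fetch(url):
        try:
            return url, fetch(url)
        except Exception:
            return url, None  # one dead feed must not stop the whole run

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(safe_fetch, urls)
        return {url: body for url, body in results if body is not None}


def store_batch(conn, rows):
    """Insert many (blog_id, link, title) rows in one call.

    The UNIQUE key replaces the per-item SELECT: duplicates are skipped.
    Returns how many rows were actually added.
    """
    count_sql = "SELECT COUNT(*) FROM posts"
    before = conn.execute(count_sql).fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO posts (blog_id, link, title) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn.execute(count_sql).fetchone()[0] - before


# Demo with a fake fetcher so that nothing touches the network.
def fake_fetch(url):
    if "broken" in url:
        raise IOError("unreachable")
    return b"<rss/>"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (blog_id INTEGER, link TEXT, title TEXT, "
    "UNIQUE (blog_id, link))"
)

bodies = fetch_all(
    ["http://a.example/feed", "http://broken.example/feed"], fetch=fake_fetch
)
rows = [(1, "http://a.example/1", "First"), (1, "http://a.example/2", "Second")]
added_first = store_batch(conn, rows)
added_again = store_batch(conn, rows)
```

Threads suit this job because most of the twenty minutes is probably spent waiting on remote servers, not on the CPU, though that is an assumption worth measuring before changing anything.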
Thank you very much.