Getting Serious With PHP and Databases
Posted: Sun Mar 29, 2009 11:41 pm
Imagine an online car/truck/vehicle classified listing site with lots and lots of entries in it. It starts off small and is quite manageable, but eventually it gets big.
Eventually you need to get serious with PHP and databases and quit "goofing around".
The first thing I realized was that I needed to use PostgreSQL instead of MySQL. But that's another argument for another thread because that war isn't over. So, let's not go there in this thread.
The second thing I realized was that I need to stop using the pg_* API in PHP5 and switch to the PDO API. It's finally time to do this, and I think the pg_* API will see fewer and fewer bug fixes as time goes on. Do you agree? I only hope PDO is going to be just as fast.
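To make the PDO switch concrete, here's a minimal sketch. One nice property of PDO is that the API is the same across drivers, so this runs against the SQLite driver as-is, and moving it to PostgreSQL means changing only the DSN (e.g. to something like 'pgsql:host=localhost;dbname=classifieds'). The "vehicles" table and values are placeholders for illustration:

```php
<?php
// Minimal PDO sketch. Same API for every driver; only the DSN changes
// when moving from SQLite to PostgreSQL. Table/values are illustrative.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE vehicles (id INTEGER PRIMARY KEY, make TEXT, model TEXT)');
$db->exec("INSERT INTO vehicles (make, model) VALUES ('Toyota', 'Tacoma')");

// Prepared statements replace pg_query_params() and guard against
// SQL injection at the same time.
$stmt = $db->prepare('SELECT make, model FROM vehicles WHERE make = ?');
$stmt->execute(array('Toyota'));
$row = $stmt->fetch(PDO::FETCH_ASSOC);
echo $row['make'] . ' ' . $row['model'] . "\n"; // prints "Toyota Tacoma"
```

The prepared-statement style is worth adopting even before worrying about speed, since it removes a whole class of escaping bugs.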
The third thing I realized was that I need to start caching connections. Right now, my pages each open a fresh database connection on every request. Luckily, Linux has superb memory management, so frequently accessed data in memory gets pretty decent performance, and PHP5 helps with that too. But eventually I think I'm going to have to use something like APC or memcached to pool connections. Do you agree? I only hope I won't have a lot of trouble with this.
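One form of connection reuse that's already built into PHP 5, short of a full pooling layer, is PDO persistent connections: with PDO::ATTR_PERSISTENT set, the underlying connection is kept open past the end of the request and handed to the next request that uses the same DSN and credentials (pg_pconnect() does the same for the pg_* API). A sketch, shown with the SQLite driver so the snippet runs anywhere; for PostgreSQL the DSN would be something like 'pgsql:host=localhost;dbname=classifieds' with real credentials:

```php
<?php
// Sketch: a persistent PDO connection. With ATTR_PERSISTENT, the
// connection survives the end of the request and is reused by the
// next request with the same DSN/credentials, so pages stop paying
// the connection-setup cost. DSN here is illustrative.
$db = new PDO('sqlite::memory:', null, null,
    array(PDO::ATTR_PERSISTENT => true));
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$ok = $db->query('SELECT 1')->fetchColumn();
echo "connected: $ok\n"; // prints "connected: 1"
```

This doesn't give you a true shared pool across processes the way an external pooler would, but it's a one-line change and a reasonable first step.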
The fourth thing I realized was that PostgreSQL, out of the box, is tuned way down from its performance potential. It has a conf file (postgresql.conf) that can be tweaked in all kinds of ways to provide a lot more punch. Unfortunately, it's complex, and compared to MySQL's performance tweaks, not a lot of people seem to understand it.
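As a sketch of the kind of postgresql.conf tweaks involved, here are a few of the most commonly adjusted settings. The values below are purely illustrative, roughly aimed at a dedicated box with a few GB of RAM; the right numbers depend entirely on your hardware and workload:

```
# postgresql.conf -- illustrative values only; tune to your hardware.
shared_buffers = 512MB          # default is tiny; often set to ~25% of RAM
effective_cache_size = 2GB      # planner hint: OS cache + shared_buffers
work_mem = 16MB                 # per-sort/per-hash memory, per operation
maintenance_work_mem = 128MB    # used by VACUUM and CREATE INDEX
checkpoint_segments = 16        # spread out checkpoint I/O on write-heavy loads
```

Note that work_mem is allocated per sort or hash operation, not per connection, so large values multiply quickly under concurrency.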
The fifth thing I realized was that I need a serious transaction log and backup scheme. In my years with MS SQL Server, for instance, we used to do a full backup of the database every day and run an incremental transaction log backup every hour. In the event of a corrupt database, we would kick the users out of the app, pinpoint the point of failure, restore a backup from that timeframe, replay transaction log restores up to just before the failure, test, and then let our users back in.

With PostgreSQL, I'm still figuring out what to do. For now, I've decided to do a live full backup every 6 hours and use a SQLite3 database to store all UPDATE, INSERT, and DELETE statements as a kind of pseudo transaction log. As our database grows, we'll stretch from 6-hour backups to 8, then 12, and then move to finding a robust solution with an hourly incremental backup and a full backup daily. Do you think that's the best approach, or do you have a better idea?
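The pseudo transaction log described above can be sketched like this: every write statement is recorded in a SQLite database (via PDO) before it is sent to PostgreSQL, so statements since the last full backup can be replayed by hand. The table layout, function name, and the sample UPDATE are all illustrative, not a fixed design:

```php
<?php
// Sketch of the SQLite pseudo transaction log: record each write
// statement (with its parameters and a timestamp) before running it
// against PostgreSQL. Uses an in-memory DB here; production would use
// a file path. Names below are illustrative.
$log = new PDO('sqlite::memory:');
$log->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$log->exec('CREATE TABLE IF NOT EXISTS tx_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    logged_at TEXT NOT NULL,
    statement TEXT NOT NULL,
    params TEXT NOT NULL
)');

function logStatement(PDO $log, $sql, array $params = array()) {
    $stmt = $log->prepare(
        'INSERT INTO tx_log (logged_at, statement, params) VALUES (?, ?, ?)'
    );
    $stmt->execute(array(date('c'), $sql, json_encode($params)));
}

// Record a write before executing it against PostgreSQL.
logStatement($log, 'UPDATE vehicles SET price = ? WHERE id = ?', array(12500, 42));

$count = $log->query('SELECT COUNT(*) FROM tx_log')->fetchColumn();
echo "logged statements: $count\n"; // prints "logged statements: 1"
```

After a restore of the most recent full backup, the rows in tx_log from that backup's timestamp forward would be replayed in id order up to just before the point of failure.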