Just curious: what's the largest load (site traffic and/or database size) you deal with?
When you're used to smaller sites, optimizing isn't such a big concern. Over the past two years, I've had the fun of two projects that forced me to learn to handle things differently.
1. A database with a few thousand products, merged with many other databases, which produced 10-20 second page loads. This taught me the use of temp tables: select the main list of what matches the "WHERE" clause into a temp table, set up indexes on it, then join to the others. (These databases were synced over nightly from the client's POS software. In hindsight, since I got put into this project in the middle, I would have built better tables on our end instead of just rebuilding the data the way it came in.)
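The temp-table approach can be sketched roughly like this (a minimal sketch using SQLite in Python; the table and column names are made up for illustration, not the actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema: a products table plus a separate prices table
# synced over from another database.
cur.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE prices   (product_id INTEGER, price REAL);
    INSERT INTO products VALUES (1, 'Widget', 'tools'), (2, 'Gadget', 'toys');
    INSERT INTO prices   VALUES (1, 9.99), (2, 19.99);

    -- Step 1: pull only the rows matching the WHERE clause into a temp table.
    CREATE TEMP TABLE matched AS
        SELECT id, name FROM products WHERE category = 'tools';

    -- Step 2: index the temp table so the joins that follow are cheap.
    CREATE INDEX temp.idx_matched_id ON matched (id);
""")

# Step 3: join the small, indexed temp table to the other tables,
# instead of joining the full tables against each other.
rows = cur.execute("""
    SELECT m.name, p.price
    FROM matched m
    JOIN prices p ON p.product_id = m.id
""").fetchall()
print(rows)  # [('Widget', 9.99)]
```

The win is that the expensive filtering happens once, and every subsequent join only touches the small indexed result set.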
2. We have a site that consistently gets 30,000+ unique visitors and 80,000+ total page loads each month. It required that just about everything in the site you can link to be in a full drop-down navigation, plus the site consisted of news, events, products (with many attributes tied to their databases, again), applications, and comments, and several places where that data needs to be randomized when displayed. The core site system we use is built on Smarty, but at this scale, combined with the fact that the client is in there every day changing and adding things, it made things real fun. I ended up having to write my own caching routine. Getting it to cache wasn't so bad; the hard part was avoiding what we had with Smarty (any change on the admin side blew away the entire cache). Switching it so there were triggers on cached items, so that if you edit a product, only the caches matching the parts you edited get deleted, was real fun.
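The trigger idea amounts to tag-based cache invalidation. A minimal sketch in Python (not the actual Smarty-based code; the key and tag naming scheme here is an assumption for illustration):

```python
# Tag-based cache invalidation sketch: each cached item is stored under a
# key and carries tags. Editing a product invalidates only the caches
# tagged with that product, instead of blowing away everything.
class TaggedCache:
    def __init__(self):
        self._data = {}   # key -> cached value
        self._tags = {}   # tag -> set of keys carrying that tag

    def set(self, key, value, tags):
        self._data[key] = value
        for tag in tags:
            self._tags.setdefault(tag, set()).add(key)

    def get(self, key):
        return self._data.get(key)

    def invalidate(self, tag):
        """Delete only the cached items carrying this tag."""
        for key in self._tags.pop(tag, set()):
            self._data.pop(key, None)

cache = TaggedCache()
cache.set("page:home", "<html>home</html>", tags=["product:1", "product:2"])
cache.set("page:about", "<html>about</html>", tags=["nav"])

cache.invalidate("product:1")   # e.g. product 1 was edited in the admin
print(cache.get("page:home"))   # None (cleared, it depended on product 1)
print(cache.get("page:about"))  # <html>about</html> (untouched)
```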
So I know in my world those are two of my bigger challenges, but I know for others you might be like, "Ha! A few thousand? Try a few million." For those who have dealt with larger projects, what recommendations/considerations can you share compared to working on "smaller" sites?
-Greg
Your biggest load and/or database
Re: Your biggest load and/or database
30-80 M hits/month,
200-400 K unique visitors/month,
~100GB database size (replicated to several db servers).
Re: Your biggest load and/or database
Ad affiliate network, showing ~100M impressions per month (almost all unique)
Key problems included -
* Showing random ads from the pool
* Performing large scale geo targeting
* Tracking multiple verticals (analytics) per impression / click, generating on demand reports and graphs
Project now runs on 30 servers. My advice: once you exceed the capacity of a single dedicated server, get a good server professional.
Re: Your biggest load and/or database
My advice is to store de-normalized snapshots (essentially 'views' that you refresh when you choose), let read queries hit that table, and put hooks in the code to update this cache. So basically, the user logs in and updates their ad; the new data is flattened into a plain associative array and replaced into this snapshot table. Let your read queries hit that.
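A rough sketch of that pattern (using SQLite in Python; the ad/stats schema and the `refresh_snapshot` hook name are made up for illustration):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized source tables (hypothetical schema).
    CREATE TABLE ads      (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE ad_stats (ad_id INTEGER, clicks INTEGER);
    -- De-normalized snapshot table that read queries hit.
    CREATE TABLE ad_snapshot (ad_id INTEGER PRIMARY KEY, doc TEXT);
""")

def refresh_snapshot(ad_id):
    """Hook called from the write path: flatten the ad into one row."""
    row = conn.execute("""
        SELECT a.title, s.clicks FROM ads a
        JOIN ad_stats s ON s.ad_id = a.id WHERE a.id = ?
    """, (ad_id,)).fetchone()
    doc = json.dumps({"title": row[0], "clicks": row[1]})
    conn.execute("INSERT OR REPLACE INTO ad_snapshot VALUES (?, ?)",
                 (ad_id, doc))

# Write path: the user updates their ad, then the hook refreshes the snapshot.
conn.execute("INSERT INTO ads VALUES (1, 'Summer sale')")
conn.execute("INSERT INTO ad_stats VALUES (1, 42)")
refresh_snapshot(1)

# Read path: queries hit only the flat snapshot table, no joins needed.
doc = conn.execute(
    "SELECT doc FROM ad_snapshot WHERE ad_id = 1").fetchone()[0]
print(json.loads(doc))  # {'title': 'Summer sale', 'clicks': 42}
```

The joins are paid once at write time, so the (far more frequent) reads stay cheap.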
Apparently CQRS & event sourcing go a step further than what I have done: 'events' become the main source of data, and only a de-normalized snapshot, which duplicates that information, is stored in the database. Think of your bank, for example: there are transactions [events] and the de-normalized snapshot [current balance / available balance].
So basically, the short answer is: 'duplicate your data'.
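The bank analogy can be sketched like so (a toy example, not a production event store):

```python
# Toy event-sourcing sketch: the transactions (events) are the source of
# truth, and the balance is just a derived, de-normalized snapshot.
events = [
    {"type": "deposit",    "amount": 100},
    {"type": "withdrawal", "amount": 30},
    {"type": "deposit",    "amount": 50},
]

def project_balance(events):
    """Fold the event stream into the current-balance snapshot."""
    balance = 0
    for e in events:
        balance += e["amount"] if e["type"] == "deposit" else -e["amount"]
    return balance

# The snapshot duplicates information already present in the events,
# so it can always be rebuilt by replaying them.
snapshot = {"balance": project_balance(events)}
print(snapshot)  # {'balance': 120}
```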
Re: Your biggest load and/or database
While others have posted projects that blow mine away, I did just realize I posted from the wrong column earlier: they have 250,000+ page views a month (an average of 35 GB of transfer).
Re: Your biggest load and/or database
twinedev wrote:While others have posted projects that blow mine away, I did just realize I posted from the wrong column earlier, they have 250,000+ page views a month, (average of 35gig transfer)

250,000 page views a month is less than 10,000 per day!
One of the websites I'm supporting has 40,000 unique visits per day and 200,000 page views. On the same setup (1 web server and 1 db server) it's running a forum with 185,000 registered users and 1,300,000 posts currently, and the website (not the forum) is powered by a pretty old web system. So if you can't handle 10,000 views per day, something is very wrong with the database.