Your biggest load and/or database

Ye' old general discussion board. Basically, for everything that isn't covered elsewhere. Come here to shoot the breeze, shoot your mouth off, or whatever suits your fancy.
This forum is not for asking programming related questions.

Moderator: General Moderators

Post Reply
User avatar
twinedev
Forum Regular
Posts: 984
Joined: Tue Sep 28, 2010 11:41 am
Location: Columbus, Ohio

Your biggest load and/or database

Post by twinedev »

Just curious: what is the largest load (site traffic and/or database size) you deal with?

When you are used to smaller sites, optimizing isn't such a big concern. Over the past two years, I have had the fun of two projects that forced me to learn to handle things differently.

1. The first was a database with a few thousand products, but merged with many other databases, which produced a 10-20 second page load whenever you went to a page. This taught me the temp-table approach: pull the main list of rows matching the "WHERE" clause into a temp table, set up indexes on it, and then join to the others. (These were databases synced over nightly from the client's POS software. I got put onto this project in the middle, and in hindsight I would have designed better tables on our end instead of just rebuilding the data the way it came in.)
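The temp-table pattern described above can be sketched roughly like this. This is a minimal illustration using Python's sqlite3 as a stand-in for the MySQL setup in the post; the table and column names are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Invented schema: a products table plus a nightly-synced attributes table.
cur.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category TEXT)")
cur.execute("CREATE TABLE attributes (product_id INTEGER, attr TEXT, value TEXT)")
cur.executemany("INSERT INTO products VALUES (?, ?, ?)",
                [(i, f"item{i}", "widgets" if i % 2 else "gadgets") for i in range(1000)])
cur.executemany("INSERT INTO attributes VALUES (?, ?, ?)",
                [(i, "color", "red") for i in range(1000)])

# 1. Materialize only the rows matching the WHERE clause into a temp table.
cur.execute("""CREATE TEMP TABLE matched AS
               SELECT id, name FROM products WHERE category = 'widgets'""")
# 2. Index the (small) temp table so the join below is cheap.
cur.execute("CREATE INDEX temp.idx_matched_id ON matched (id)")
# 3. Join the indexed temp table to the other synced tables.
rows = cur.execute("""
    SELECT m.name, a.attr, a.value
    FROM matched m JOIN attributes a ON a.product_id = m.id
""").fetchall()
print(len(rows))  # 500 matching products, one attribute each
```

The win comes from steps 1-2: the expensive filter runs once, and every subsequent join touches only the filtered, indexed subset instead of re-scanning the full merged data.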

2. We have a site that consistently gets 30,000+ unique visitors and 80,000+ total page loads each month. The client required that just about everything on the site you can link to appear in a full drop-down navigation, and the site consisted of news, events, products (again with many attributes tied to their databases), applications, and comments, plus several places where that data needs to be randomized when displayed. The core site system we use is built on Smarty, but at this scale, combined with the fact that they are in there every day changing and adding things, it made things real fun. I ended up having to write my own caching routine. Getting it to cache wasn't so bad; the hard part was avoiding what we had with Smarty, where any change on the admin side blew away the entire cache. I switched it so cached items carry triggers: if you edit a product, only the caches matching the parts of the product you edited get deleted.
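The trigger idea in that last sentence is essentially tag-based cache invalidation. Here is a minimal sketch of the concept (the class and key names are invented for illustration, not the poster's actual code): each cached fragment is tagged with the records it depends on, so editing one product clears only the fragments tagged with that product instead of the whole cache.

```python
from collections import defaultdict

class TaggedCache:
    def __init__(self):
        self._data = {}                  # key -> cached fragment
        self._by_tag = defaultdict(set)  # tag -> keys that depend on it

    def set(self, key, value, tags):
        self._data[key] = value
        for tag in tags:
            self._by_tag[tag].add(key)

    def get(self, key):
        return self._data.get(key)

    def invalidate(self, tag):
        # Drop only the fragments that depend on this tag.
        for key in self._by_tag.pop(tag, set()):
            self._data.pop(key, None)

cache = TaggedCache()
cache.set("page:home", "<html>home</html>", tags=["product:42", "news"])
cache.set("page:about", "<html>about</html>", tags=["news"])

cache.invalidate("product:42")   # admin edits product 42
print(cache.get("page:home"))    # None: this page is rebuilt on next request
print(cache.get("page:about"))   # unrelated fragment survives
```

Contrast with the Smarty behavior described above, which corresponds to invalidating every key on any admin change.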

So I know those are two of my bigger challenges in my world, but for others it might be "ha! A few thousand? Try a few million." For those who have dealt with larger projects, what recommendations or considerations can you share compared to working on "smaller" sites?

-Greg
User avatar
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Re: Your biggest load and/or database

Post by Weirdan »

30-80 M hits/month,
200-400 K unique visitors/month,
~100GB database size (replicated to several db servers).
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Your biggest load and/or database

Post by Eran »

Ad affiliate network, showing ~100M impressions per month (almost all unique)
Key problems included:
* Showing random ads from the pool
* Performing large scale geo targeting
* Tracking multiple verticals (analytics) per impression / click, generating on demand reports and graphs

The project now runs on 30 servers. My advice: once you exceed the capacity of a single dedicated server, get a good server professional.
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Your biggest load and/or database

Post by josh »

My advice is to store denormalized snapshots (essentially "views" that you refresh when you choose), let read queries hit that table, and put hooks in the code to update this cache. So basically the user logs in and updates their ad; the new data is flattened into a plain associative array and replaced into this snapshot table. Let your read queries hit that.

Apparently CQRS and event sourcing go a step further than what I have done: "events" become the main source of data, and the denormalized snapshot stored in the database merely duplicates that information. Think, for example, of your bank: there are transactions [events] and the denormalized snapshot [current balance/available balance].
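The bank analogy can be made concrete in a few lines. A hedged sketch, not a real event-sourcing framework: the append-only event log is the source of truth, and the balance is a snapshot kept in sync that could always be rebuilt by replaying the events.

```python
events = []              # append-only event log: the real record
balance = {"acct1": 0}   # denormalized snapshot duplicating the events

def apply(event):
    events.append(event)                        # store the event itself
    balance[event["acct"]] += event["amount"]   # keep the snapshot in sync

apply({"acct": "acct1", "amount": 100})   # deposit
apply({"acct": "acct1", "amount": -30})   # withdrawal
print(balance["acct1"])  # 70

# The snapshot is pure duplication: replaying the log gives the same answer.
replayed = sum(e["amount"] for e in events if e["acct"] == "acct1")
print(replayed)          # 70
```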

So basically, the short answer is: "duplicate your data".
User avatar
twinedev
Forum Regular
Posts: 984
Joined: Tue Sep 28, 2010 11:41 am
Location: Columbus, Ohio

Re: Your biggest load and/or database

Post by twinedev »

While others have posted projects that blow mine away, I just realized I posted from the wrong column earlier: they get 250,000+ page views a month (averaging 35 GB of transfer).
User avatar
Darhazer
DevNet Resident
Posts: 1011
Joined: Thu May 14, 2009 3:00 pm
Location: HellCity, Bulgaria

Re: Your biggest load and/or database

Post by Darhazer »

twinedev wrote:While others have posted projects that blow mine away, I just realized I posted from the wrong column earlier: they get 250,000+ page views a month (averaging 35 GB of transfer).
250 000 page views a month is less than 10 000 per day!
One of the web sites I'm supporting has 40 000 unique visits per day and 200 000 page views. On the same hardware (1 web server and 1 db server) it also runs a forum with 185 000 registered users and currently 1 300 000 posts, and the web site itself (not the forum) is powered by a pretty old web system. So if you can't handle 10 000 views per day, something is very wrong with the database :) Check your slow query log and run EXPLAIN on your queries.
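EXPLAIN is how you see whether a slow query is scanning the whole table instead of using an index. A small demonstration using SQLite's `EXPLAIN QUERY PLAN` via Python (MySQL's `EXPLAIN` gives analogous information; the `posts` schema here is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, topic_id INTEGER, body TEXT)")

# Without an index, looking up posts by topic scans the whole table.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM posts WHERE topic_id = 5").fetchall()
print(plan_before[0][3])  # a full SCAN of posts

conn.execute("CREATE INDEX idx_topic ON posts (topic_id)")

# With the index, the planner switches to an index search.
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM posts WHERE topic_id = 5").fetchall()
print(plan_after[0][3])  # a SEARCH using idx_topic
```

The slow query log tells you *which* queries to feed into EXPLAIN; queries whose plans show full scans on large tables are the usual suspects.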
Post Reply