Page 1 of 1

Confusion about sizes for optimization of dbs

Posted: Sun Jul 03, 2005 11:06 am
by Todd_Z
I never know what to enter in as the int size in a mySQL table. For example, if I am using iptolong() for storing ip addresses, what should I use as the size. Has anyone seen a good tutorial or explanation on how to optimize sizes for all types of variables in a table? Thanks.

Posted: Sun Jul 03, 2005 11:33 am
by timvw
http://dev.mysql.com/doc/mysql/en/choosing-types.html

An IPv4 address is 32 bits long -> INT UNSIGNED is exactly what you need ;)

Posted: Sun Jul 03, 2005 11:52 am
by Todd_Z
For example I am keeping track of timestamps in a db, and I'm removing the length since jan 1 2005 to make the value smaller. The value today is around 15853500, and I want this to work for the next 10 years at least, so how do I know the size that I should use. I know its between unsigned medint(16777215) and bigint(4294967295).

Timestamp for dec 31st 2015 = 1451548800
Timestamp for jan 1 2005 = 1104566400
Difference = 346982400

Thats the largest value that will be stored.

Posted: Sun Jul 03, 2005 3:27 pm
by timvw
0346982400 (largest value)

0016777215 (unsigned medint)
4294967295 (bigint)


Well, there are values that simply wouldn't fit in a unsigned medint.

Algorithm to choose the right one. Choose all the types that are capable to contain all the possible values. Now choose the smallest one of the found types...

Btw, it's plain silly to try and save a few bytes by choosing 2005 as offset for your timestamps. The costs in changing your code are far larger than buying an extra harddisk for your server...

Posted: Mon Jul 04, 2005 9:09 am
by Todd_Z
I haven't started the project yet. It's going to be storing traffic information for many hundreds of sites, so I need to make it really small and fast and possible.

Posted: Mon Jul 04, 2005 10:54 am
by timvw
Well, i've been tracking a few of our banners.. Ended up with more than 900.000 rows in a month (+/- 20mb) and i doubt if someone even noticed it...

I think you'll save more CPU cycles if you use the built-in unixtimestamp instead of calculating your own timestamp (offset 2005-01-01)


I only need the totals per partner/country for the previous periods, so every month i calculate those totals for the last month, and insert them in a archive table. Then i delete the rows the rows from the previous month(s).

This way the current table stays relatively small...