Page 1 of 1
Email address database field size trade-off
Posted: Fri Dec 31, 2010 4:12 pm
by SidewinderX
RFC5321 says the local part of an email address can be a maximum of 64 characters and the domain name can be a maximum of 255 characters, plus the @ symbol means an email address can be a maximum of 320 characters.
Is it worth making the database field a length 320 characters (and probably waste ~200+ bytes per record) or would it be better to limit the field length to something more reasonable and exclude large email addresses. If the latter, would you consider "reasonable?"
Lets assume this is a MySQL database is for twitter.com where large email addresses could potentially happen, but efficiency is also a huge concern.
Re: Email address database field size trade-off
Posted: Sat Jan 01, 2011 5:25 am
by social_experiment
SidewinderX wrote:...could potentially...
If efficiency is a huge concern then having a potential inconvenience is a risk i'd be willing to take. The collective is more important than the individual in this case, the individual being somebody who might have an email address that matches those specific criteria set out by RFC5321. You can also take into account human behaviour, in the current day people don't write things down so everything is digitally stored. How this is relevant is that when people swap contact info the are looking for information that will be easily remembered i.e short, to the point information.
An email address like
ohmygoodness@thisisthelongestemailaddyever.com might look good on paper (and get a single ha-ha from some uninformed non-IT person) but it's useless for anything else IMO. Go with a conservative value, maybe 255 VC (or less). You can always notify the user about length restrictions (it's done all the time) and the one that 'could potentially' find it a problem will be in the minority of users.
Re: Email address database field size trade-off
Posted: Sat Jan 01, 2011 11:50 am
by Darhazer
If you are using InnoDB table, use VARCHAR(320). It won't waste the space that is not used.
If you are using MyISAM however, better waste some space, as MyISAM is extremly fast in working with fixed-length rows
By the way it will have 100 million records for 100 million users, email being the primary key, the lookup will be still extremly fast, and you can easily partition the table based on the first letter of the address
