Storing 'data'

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
User avatar
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Storing 'data'

Post by onion2k »

What would be a sensible approach to storing 'data'? The data can be pretty much anything - a string, an int, a date, a float, a hex color value, an HTML snippet. The only criteria is that it'll always be less than 256 characters long when it's converted to a string.

I'm thinking that a varchar field is really the only logical approach, but it's complicated slightly by the fact I'll need to filter this data by various things including a date range ... it's possible to do "select * from `table` where date(`data`) between '2008-01-01' and '2008-03-31'" but isn't that going to be hellishly slow for large numbers of rows?

The only other approach I can think of is having one column per data type and putting the data into the one that best matches, but then all the SQL gets nasty because it'd need to search in multiple columns and fetch the right one. Yuck.
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Storing 'data'

Post by Christopher »

VARCHAR is probably what you will end up with. You an also limit the index to a certain number of characters in the field to speed things up. The dates are only 10 bytes which is really not much bigger than an int, so the performance should be OK. I use date fields that are strings like 'yyyymmdd' sometimes and have not had a performance problem with 100k+ records.
(#10850)
User avatar
Bill H
DevNet Resident
Posts: 1136
Joined: Sat Jun 01, 2002 10:16 am
Location: San Diego CA
Contact:

Re: Storing 'data'

Post by Bill H »

If everything is going to include a date, wouldn't it make since to split the date out and put it into a one field, with the rest of the data in another? (Which I agree would make sense to use varchar.) That would simplify sorting and selecting, and you could index the date field.
User avatar
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Re: Storing 'data'

Post by onion2k »

Bill H wrote:If everything is going to include a date, wouldn't it make since to split the date out and put it into a one field, with the rest of the data in another? (Which I agree would make sense to use varchar.) That would simplify sorting and selecting, and you could index the date field.
The 'data' is a single field. The table will end up looking something like:

Code: Select all

ID      Ref       Type      Data
1       alpha     name      Chris
2       alpha     age       31
3       alpha     interest  PHP
4       alpha     birthday  1977-03-28
An example of a query might be I'll need to fetch all the references that have a birthday in March.

The challenge is that those 'type' fields will be defined by the user so they can't be columns.
User avatar
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Re: Storing 'data'

Post by Benjamin »

Hey Onion2k, sounds like you're working on a fun project 8)

You're definitely on the right track.

Here are a few tips.

1. For the type column, using an enum would give you the ability to have up to 65,535 different types of data using only 2 bytes. This allows you to represent the data type using identifiable strings rather than integers and saves space over using a varchar. The downside is that you'll need to add every single data type to the create table query. ('age', 'birthday', 'etc',...)

2. For the data column, using a varchar in this case is the best solution. My only advice is to keep it as short as possible. You may only need 50 characters or so which is much less than 256 multiplied by n rows.

3. You're going to be performing a lot of selects based on the type and data fields. Because of this they should be indexed together. The syntax is something like:

Code: Select all

 
ALTER TABLE tableName ADD INDEX tdata (type, DATA)
 
Doing this will increase performance significantly and you should have no problem with slow queries even with upwards of 10M rows. (my.cnf may need to be tweaked a little bit)

4. Writing queries to get data between two dates should work fine on a varchar, but I don't know for sure if it will. Even if it doesn't, which I doubt, I'm sure using the date formatting functions would allow you to accomplish the same: http://dev.mysql.com/doc/refman/5.0/en/ ... tions.html

Other than that you should be all set. Have fun! :)
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Storing 'data'

Post by Christopher »

For the 'type' column I would usually use a small integer type and have a separate table for the names. For the data field, you will probably want the size to be 255 for flexibility, but you can make the index shorter for performance. I agree that you may want to try indexing type and data together. Check the manual on that to see if if is the right option in this case. The dates should work, but you may want to store them as 'yyyymmdd' with no dashes. Especially if you need to always format differently than yyyy-mm-dd anyway.
(#10850)
Post Reply