Page 1 of 1

Storing images in a database...

Posted: Tue Dec 10, 2002 8:34 am
by puckeye
Hi all,

I've been hired to do a site with a lot of photos which need to be hiden from the normal web path.

I've toyed with the idea of storing images (ranging from 50 to 250Kb each) in a MySQL database. I estimate the number of images to be around 6,000 to 10,000 when the site is completed and running full speed.

Now at work we've had a few problems with tables over 15,000 records and I was wondering if it's solely the number of records or also the weight of each?

If other words is a table with 6,000 records weighting say 300,000 Kb (50Kb times 6,000) be a lot less effective then a table with 6,000 records only weighting 30,000Kb (averaging 5Kb per record to hold the filenames and stuff)?

If there is a difference in effectiveness how much of a difference are we talking about? Are we talking a few miliseconds, a few tenths, a few seconds???

Considere that at most there would be 10,000 records potentially weighting around 250Kb each.

Otherwise I was thinking of uploading the images outside the web path and using a PHP to retrieve them, the filenames would be stored in the db...

Thanks

Puckeye

store images in mysql

Posted: Tue Dec 10, 2002 10:24 am
by Igro
In case you want to have images stored in database table you must convert it into binary data and i guess that if you gonna have 10 000 images in you DB than your DB will be really slow while reading binary image data.

Store your images in some directory and put only a shorcut in your DB. This solution is the best.

Re: store images in mysql

Posted: Tue Dec 10, 2002 12:32 pm
by puckeye
Igro wrote:In case you want to have images stored in database table you must convert it into binary data and i guess that if you gonna have 10 000 images in you DB than your DB will be really slow while reading binary image data.

Store your images in some directory and put only a shorcut in your DB. This solution is the best.
Yeah I think you're right.

Thanks for the input.

Puckeye

Posted: Tue Dec 10, 2002 12:40 pm
by hedge
I disagree. If your DB and app is designed correctly, the number of rows in the DB will have very little to do with performance. Besides you need to store them somewhere, if they aren't in the db then you need to do file system deletes to clean up.

Posted: Tue Dec 10, 2002 1:09 pm
by puckeye
hedge wrote:I disagree. If your DB and app is designed correctly, the number of rows in the DB will have very little to do with performance. Besides you need to store them somewhere, if they aren't in the db then you need to do file system deletes to clean up.
Oh number of rows _does_ make a difference. I administer online newspapers and some of them had up to 12,000 articles stored in a single table. Lately we started noticing a perceptible slow down in performance. The newspapers only had about 150 to 200 articles online but had to sift through thousands to find them.

We separated online and offline articles into 2 tables and now the newspapers are noticably faster...

So I know that the db would be slower then normal but my earlier question was how much slower would it be to retrieve one record out of thousands if the smallest record weigh 50-60Kb, compared to less then 6Kb?

What if I create one table for each set? Each table would only hold 60 or so images, would that be a good way to store the images?

My fear is when we delete or modify sets of images, there will be some pictures left on the server, I know this kind of cleaning is way easier using databases, especially if I use one table per set... If I delete a set then there's no more pictures on the site...

Theres also the concern of disk fragmentation over a longer period of time, wouldn't a table of the MyISAM type help in that regard too?

Thanks all for your input.

Puckeye

Posted: Tue Dec 10, 2002 4:01 pm
by Johnm
Oh number of rows _does_ make a difference. I administer online newspapers and some of them had up to 12,000 articles stored in a single table. Lately we started noticing a perceptible slow down in performance. The newspapers only had about 150 to 200 articles online but had to sift through thousands to find them.
To an extent, yes, I agree to this but... how the table AND queries are set up makes a huge difference in the performance of a database. 12,000 entries in a table on a relational database should not be that big of a deal. (Consider how big hospital dbs are) I just did a count on a field of a table in our Informix db and found (to my surprise) that there are 1058539 records in that table and the query took about 5-10 seconds. Granted, more demanding queries will take longer but if the tables are well thought out, the queries optimized, tools like views (not necessarily views but we will use it for an example) are used there should be no problem in the size of the table.


John M

Posted: Tue Dec 10, 2002 6:09 pm
by puckeye
Johnm wrote: To an extent, yes, I agree to this but... how the table AND queries are set up makes a huge difference in the performance of a database. 12,000 entries in a table on a relational database should not be that big of a deal. (Consider how big hospital dbs are) I just did a count on a field of a table in our Informix db and found (to my surprise) that there are 1058539 records in that table and the query took about 5-10 seconds. Granted, more demanding queries will take longer but if the tables are well thought out, the queries optimized, tools like views (not necessarily views but we will use it for an example) are used there should be no problem in the size of the table.

John M
5 - 10 seconds... For our newspapers that's an eternity... Those clients unfortunately do not understand that they are 35 to share the same database with their own table mind you. They want instant access to all their pages.

On the other hand asking for a straight list or a count isn't that long compared to sifting through online dates and online switches. The ordering everything by a priority field that each client specified...

Anyway I disgress here.

I thank you John for you input.

I think that I'll create multiple tables one for each set, this will restrict the number of fields to around 60 per set with around 100 tables in the database.

I ask this:
Is one table with 6000 records better then 100 tables with 60 records each? All in the same database?

Thanks all

Michel

Posted: Tue Dec 10, 2002 6:14 pm
by puckeye
Another question came to mind...

What field type would be better to store an image a BLOB or TEXT? Or is it the same?

Any other type?

Michel

Posted: Tue Dec 10, 2002 9:23 pm
by Johnm
For the situation you describe I would probably avoid the LOB data type (BLOBs, CLOBs) as they cannot be included in primary keys, GROUP BY, or ORDER BY which for the type of sorting that you are implying would probably be necessary or at least much easier anyway. The LOB data type also has some consistency issues.

Creating a view for each user may be an option as they will have their records directly available to them and the performance would probably be improved.

We store parametric (3D) designs of equipment and between our five plants through a VPN (with approximately fifty simultaneous users) and after we optimized our queries and reconsidered how our tables were set up performance is no longer an issue.

I hope some of this helps.

John M

Posted: Tue Dec 10, 2002 9:56 pm
by puckeye
Johnm wrote:For the situation you describe I would probably avoid the LOB data type (BLOBs, CLOBs) as they cannot be included in primary keys, GROUP BY, or ORDER BY which for the type of sorting that you are implying would probably be necessary or at least much easier anyway. The LOB data type also has some consistency issues.
The only ORDER BY or GROUP BY that we'll do would be on the photo_id field. I don't see how or why I would ORDER BY (data from an image file).

I read a few things on the MySQL documentation and found out that BLOB and TEXT are essentially the same except the, as I understand it, BLOB is by default BINARY. Is that right?

I'm using MEDIUMBLOB in my tests and it's working good so far. BLOB by itself limits to 65Kb (16^4) so that was no good, but MEDIUMBLOB limits to around 16Mb(16^6) if I remember correctly.

Puckeye

Posted: Wed Dec 11, 2002 8:20 am
by Johnm
My apologies,
I wrote the last post with the idea in my head that we were still discussing newspaper articles rather than images. As I look back on the previous posts I see my error.
If you are storing images LOBs would be a logical choice. yes, BLOBs (Binary Large Object) are binary and CLOBs (Character Large Objects) are characters.

This may help:
BLOB 16^4 TEXT 16^4
TINYBLOB 16^2 TINYTEXT 16^2
MEDIUMBLOB 16^6 MEDIUMTEXT 16^6
LARGEBLOB 16^8 LARGETEXT 16^8


Later,
John M

Posted: Wed Dec 11, 2002 11:45 am
by BDKR
Theres also the concern of disk fragmentation over a longer period of time, wouldn't a table of the MyISAM type help in that regard too?
Use optimize table for this.

Also, whatever you do decide, use explain to take a close look at what your queries are doing and create indices on your tables. Use the slow query log too.

Anyways, we have a db server with multiple dbs that all have the same schema, but are for different customers. Some of those have row counts well over 700K lines. We don't or haven't seen a slow down on those. However, if there were no indices setup on some of those tables, we would be in big trouble as the mysql engine would be forced to do a worst case scenario full table scan.

Tons of memory and indices are the way to go!

Is your db server dedicated to just being a db server? Or is it doing other things as well? If it's the later, I would suggest putting the db on it's own box with over 1 gig of memory, not to mention another on for replication / backup.

Cheers,
BDKR[/quote]

Posted: Wed Dec 11, 2002 1:45 pm
by puckeye
Thanks BDKR and Johnm,

As far as the where and how the database is hosted I have no idea (I'll look into it though). All I know for now is that it's a shared host and not a small one so I would assume they are using a servers dedicated to MySQL and others for the web server.

I have indices on the field where I'll do the most search and sort. Fields like Set_ID and wheter the photo is a sample or not are indexed. The other fields are not indexed, the Photo BLOB, the height and width doesn't need to be because they'll only be accessed when a specific photo is asked for.

Thanks for everyone's opinions, most confirmed what I thinking and a few expended my knowledge. :)