query optimization

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

query optimization

Post by GeXus »

I have a couple queries where im doing joins on a varchar field and have a where clause on a varchar field... the table has about 200k records and it's going quite slow.. I have an index on the varchar field also.

How would you recommend optimizing this? It's going really slow.
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Post by GeXus »

Just an update.. I've delete some empty data and rebuilt indexes... query went from 11 seconds to 2.6 seconds. Not sure if there is anything else I can do aside from the query itself.
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

You would probably need to post the SQL to get more specific suggestions.
(#10850)
User avatar
mikeq
Forum Regular
Posts: 512
Joined: Fri May 03, 2002 3:33 am
Location: Edinburgh, Scotland

Post by mikeq »

run an explain plan on your query(s)

Code: Select all

EXPLAIN
SELECT field1, field2 FROM table1 WHERE field3 = 'whatever';
That will show you what indexes are being used, how the tables are accessed etc
User avatar
infolock
DevNet Resident
Posts: 1708
Joined: Wed Sep 25, 2002 7:47 pm

Post by infolock »

Also, it sounds like to me that you have a layout issue more than anything else. Sounds like your indexing system is not set up correctly (ie, you have varchar's indexed when you should have tinyints, or you have integer fields when you should be using smallint, etc).
User avatar
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Post by onion2k »

I can't think of a situation in the past where I've ever joined something on varchar data before. That's very unusual.
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Post by GeXus »

Yeah, I know... I think i'll restructure the data and tables...

As far as indexing varchars, I don't see anything wrong with that.. I'm using mod_rewrite so all url's are formatted /product_name/, i don't want to add an ID in there, so I have to get the data based on the product_name (url formatted product name)
User avatar
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Post by onion2k »

GeXus wrote:As far as indexing varchars, I don't see anything wrong with that.. I'm using mod_rewrite so all url's are formatted /product_name/, i don't want to add an ID in there, so I have to get the data based on the product_name (url formatted product name)
That would only explain the use of the name in the where clause, that shouldn't have any bearing on the joins.
User avatar
infolock
DevNet Resident
Posts: 1708
Joined: Wed Sep 25, 2002 7:47 pm

Post by infolock »

The big issue behind using varchars as indexes, and then joining on them, is that normally when you do that, you are in essence going to cause a full stable scan, which means that every single solitary row in your table(s) are going to be queried.

Not to mention, you normally join on indexed fields that are 9 times out of 10 a primary key in another table. Joining on a varchar, as I said, will cause a longer join because it needs to validate text to be equal to text (thus, a much longer process on the processor than comparing tiny binary data such as integers).

again, i'd redesign the tables (= sounds like the relation-based layout you are going for it almost there, it's jst you are trying to link it all together with the wrong fields.

a very good article for you to read is : http://datamodel.org/NormalizationRules.html

it covers the basic concepts of normalization rules which is a very good and strict method of designing nice table layouts...
Last edited by infolock on Mon Mar 26, 2007 10:24 am, edited 2 times in total.
GeXus
Forum Regular
Posts: 631
Joined: Sat Mar 11, 2006 8:59 am

Post by GeXus »

You guys were right on the money... I added the id's to the table im joining on.. new query took 0.0087s vs. 4.8000s

Thank you!
User avatar
mikeq
Forum Regular
Posts: 512
Joined: Fri May 03, 2002 3:33 am
Location: Edinburgh, Scotland

Post by mikeq »

infolock wrote:The big issue behind using varchars as indexes, and then joining on them, is that normally when you do that, you are in essence going to cause a full stable scan, which means that every single solitary row in your table(s) are going to be queried.

.
Not true, that depends on many factors. If the varchar field (or combination of varchar fields) is a primary key it is going to be unique, the DBMS will use an index on this as it would on an Integer field. The only difference is an index lookup on an integer field is a bit quicker.

It is poor database design to stick in some arbitrary integer to represent a primary key when a natural primary key exists within your data.
User avatar
infolock
DevNet Resident
Posts: 1708
Joined: Wed Sep 25, 2002 7:47 pm

Post by infolock »

Not true, that depends on many factors. If the varchar field (or combination of varchar fields) is a primary key it is going to be unique, the DBMS will use an index on this as it would on an Integer field. The only difference is an index lookup on an integer field is a bit quicker.

I think you just said what I was saying (= I didn't say varchar's are never used, I only said they are rarely used (9 times out of 10 does not equal 100% ;) ). There are absolutely times to use varchars and times not to, while the times not to significantly outweigh the times to use them. And this is largely due to the SIZE of the indexes that are created by using a varchar field that exceeds a 30 character limit (say someone choosing to use a 50 varchar field size or worse yet a 255 size or even worse still a TEXT field for indexing).

While it is possible to index these, it's not always good to. If I have a table that has a list of the state of the United States of Amercia, I would definately index the varchar state field as there are not that many rows to index and the index size will not be too much. But when you start to abuse the ability to index a varchar, you are in essence going to destroy your query speed and see a major query result time when you have thousands of records or more.


It is poor database design to stick in some arbitrary integer to represent a primary key when a natural primary key exists within your data.
Now, i have no idea where this came from. If I create a primary key, that primary key is most normally either a UNIQUE key, or it is an autoincremented key. Either way, no one is saying to drop the primary key and replace it with some arbitrary integer. That would definately be absurd and break every rule known to man about proper table design. However, if you have a field in your table called "last name", and table called "user_info", and you are using the last_name as the indexed lookup key for both tables, it would absolutely be stupid to do so when you could have a primary integer-based autoincrementing field in a table to replace last_name with user_id and index/join off that.
User avatar
mikeq
Forum Regular
Posts: 512
Joined: Fri May 03, 2002 3:33 am
Location: Edinburgh, Scotland

Post by mikeq »

infolock wrote:However, if you have a field in your table called "last name", and table called "user_info", and you are using the last_name as the indexed lookup key for both tables, it would absolutely be stupid to do so when you could have a primary integer-based autoincrementing field in a table to replace last_name with user_id and index/join off that.
Unlikely that "last name" would be unique, so yes it would be absolutley stupid to retrieve the "user_info" with this method.

But lets say we have a user table where their user name is a 10 character unique varchar value

Table user
user_name (unique, varchar, primary key)
first_name
last_name

Table related_user_info
user_name (foreign key)
some_other
info_about

People reading your other post might think "ah using varchars in an index and for joining is bad, must create integer field to join on".

Table user
user_id (unique, integer, primary key)
user_name (unique, varchar)
first_name
last_name

Table related_user_info
user_id
some_other
info_about

So they end up creating integer fields all over the place when a perfectly good "natural" primary key exists.

I dont want to give the wrong impression to novices when it comes to designing databases. The first thing anyone should do when writing queries is run an explain plan against them to see exactly what the query is doing. Rather than having a guess at what it is doing.
User avatar
onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Post by onion2k »

mikeq wrote:But lets say we have a user table where their user name is a 10 character unique varchar value
If it's always going to be 10 characters then you'd use a char rather than a varchar. But what you say holds true if it's variable length.
User avatar
mikeq
Forum Regular
Posts: 512
Joined: Fri May 03, 2002 3:33 am
Location: Edinburgh, Scotland

Post by mikeq »

sorry, should have said a max of 10 characters.
Post Reply