Page 1 of 1
When to use multi-column indexes
Posted: Sat Apr 12, 2008 1:00 pm
by GeXus
Here's the scenario I have, and I'm wondering if multi-column indexes would be preferred, or if not, in what cases would you want to use them..
Table columns:
one_id int(11),
two_id int(11),
ip_address int,
created_date datetime
The query:
Code: Select all
EXPLAIN SELECT count( * )
FROM table
WHERE one_id = '1'
AND created_date < now( )
GROUP BY date_format(created_date, '%m-%d-%y')
So, would it make sense to have the one_id and the created_date use a multi-column index? Or just index each one separately.
Re: When to use multi-column indexes
Posted: Sat Apr 12, 2008 1:10 pm
by bdlang
I actually read a very good
blog entry the other day that I think addresses your situation perfectly.
Re: When to use multi-column indexes
Posted: Sat Apr 12, 2008 1:59 pm
by GeXus
Thanks! Well, I think I understood it
From the table example above, I created the following index...
Code: Select all
CREATE INDEX one_id_date ON data ( one_id, created_date)
The query returns this when I do an explain...
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE data range one_id_date one_id_date 14 NULL 369576 Using where; Using index; Using temporary; Using filesort
Now when I use the same query, with separate indexes on the one_id and created_date... This is what I get
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE date ref one_id,created_dtate one_id 5 const 329852 Using where; Using temporary; Using filesort
So the difference is the first shows Using index and is type range as opposed to ref, which I'm not sure what that means.. but even with both, it's using temporary and filesort... Isn't there a more efficient way?
Re: When to use multi-column indexes
Posted: Sun Apr 13, 2008 11:30 am
by dml
* Type range as opposed to ref: when the index is on (one_id, date), the query looks for a range within that index (like looking for all the Smiths in the phonebook with first names from A-J), as opposed to an equality comparison when it's looking in an index on (one_id) for values ==1.
* Using index: when the index is on (one_id, date), the query can be entirely answered by consulting the index, it doesn't touch the underlying data tables at all.
* Using where: I'm not sure about that.
* Temporary/filesort: I suspect this is caused by the group by. Can you run the explain for the query without the group by to confirm.
* GROUP BY date_format(created_date, '%m-%d-%y'): this is putting the dates out of order, it's like searching for all the Smiths with first names from A-J and then grouping by (middle name, first name, last name). Can you try variations of this that preserve the order y-m-d, for example group by left(date, 10)... not sure if that specifically will work, the principle is to find something that the query planner knows can be satisfied in one pass just by keeping count, rather than by creating a temporary file and sorting it.