Page 1 of 1

Index on table influencing select order?

Posted: Sat Sep 06, 2008 9:53 am
by matthijs
I discovered something strange, or at least something I wasn't aware of.

Setting a unique key on a table influences the order by of a SELECT * FROM thetable

Say I have

Code: Select all

 
CREATE TABLE `users` (
    `user_id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    `user_name` varchar(255) NOT NULL,
    `user_firstname` varchar(255) NOT NULL,
    `user_lastname` varchar(255) NOT NULL,
    PRIMARY KEY  (`user_id`)
)  ENGINE=InnoDB;
 

Now a "SELECT * FROM users" will order the results by user_id

Code: Select all

 
CREATE TABLE `users` (
    `user_id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    `user_name` varchar(255) NOT NULL,
    `user_firstname` varchar(255) NOT NULL,
    `user_lastname` varchar(255) NOT NULL,
    PRIMARY KEY  (`user_id`),
UNIQUE KEY (`user_name`, `user_firstname`,`user_lastname`)
)  ENGINE=InnoDB;
 

Now a "SELECT * FROM users" will order the results by user_name.

Why is that?

Or is there no why and is this just the way mysql handles things? I didn't know about this behavior. led to some unexpected errors in my Unit tests..

Re: Index on table influencing select order?

Posted: Sat Sep 06, 2008 10:38 am
by dml
It looks like you've got a covering index for the select *, that is that the query can be fulfilled entirely from the data in the index on (name, lastname, firstname) without touching the data tables at all. You can check this by putting EXPLAIN in front of the query: it should return type=index to indicate an index scan, and "using index" to indicate that it's got a covering index.

This should only happen in InnoDB, whose indexes refer to rows using the primary key as opposed to pointing directly to data rows as in MyISAM.

If you don't mind me asking, why are you writing a test that relies on deterministic order from a select without an order clause? It might just be deterministic in InnoDB, but it seems a bit fragile to rely on that.

Re: Index on table influencing select order?

Posted: Sat Sep 06, 2008 11:12 am
by matthijs
Thanks for your reply.

The EXPLAIN SELECT * FROM users returns

Code: Select all

 
id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  users   index   NULL    user_name   771     NULL    2   Using index
 
What exactly do you mean with "a covering index"?

About the tests: there was not a specific reason I wrote the tests relying on the order. It was just a way to test exactly what has been put in the database what I expected to have been put in there. Something like:

Code: Select all

 
public function testInsertTwoUsers(){
 // the inserts of 2 rows here
     
// then the tests, using a private helper method to pull out the 2 rows I just inserted
        $users = $this->selectAllUsers();
        $this->assertIdentical($users[0]['user_name'], 'james');
        $this->assertIdentical($users[1]['user_name'], 'henri');
}
 
private function selectAllUsers(){
$sql = "SELECT * FROM users";
// etc
}
 
I haven't done a lot of Unit testing yet, so it's still exploring for me. I realize now that this method is a bit fragile.

Re: Index on table influencing select order?

Posted: Sat Sep 06, 2008 11:54 am
by dml
Here's some reading that explains covering indexes. The first article also explains why select * initially came back in id order: Baron Schwarz, Peter Zaitsev, and Jay Pipes.

The line of thought I'd follow is that you decide on a public interface for your component, a set of guarantees to programmers using it, and you code against that interface. If selectAllUsers() guarantees that the users will come back in id order, then your implementation should include "order by id", and your tests should be designed to fail if the results don't come back in id order. On the other hand if selectAllUsers() doesn't guarantee any order, then any code (be it test code or production code) that relies on the result being in a given order needs to do its own sorting. It may happen that in the current implementation it comes back in a certain order, but as you've seen, it's fragile to rely on this always being the case.

Re: Index on table influencing select order?

Posted: Sat Sep 06, 2008 12:16 pm
by matthijs
Thanks for the links and answer.

You make a good point about the tests. I knew about the testing to an interface issue (that's what I try to do), but haven't gone far enough apparently. Thanks for pointing that out.