MAX() and GROUP BY

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Galahad
Forum Contributor
Posts: 111
Joined: Fri Jun 14, 2002 5:50 pm

MAX() and GROUP BY

Post by Galahad »

I have a table of people. The table includes an Id, a version, last name, and first name. The Id and version together are the unique identifier. I am trying to get just the last version of the person, and none before it. Here's an example:

Code: Select all

SELECT Id, Version, FirstName, LastName FROM people WHERE Id='0';
Id 0 is just an example of a record with multiple versions. This query returns:

Code: Select all

+----------+---------+-----------+--------------+
|       Id | Version | FirstName | LastName     |
+----------+---------+-----------+--------------+
|        0 |       0 | Old       | Name         |
|        0 |       1 | New       | Name         |
+----------+---------+-----------+--------------+
I need to get only the newest version. I tried doing this:

Code: Select all

SELECT Id, MAX(Version), FirstName, LastName FROM people WHERE Id='0' GROUP BY Id;
That query returns:

Code: Select all

+----------+--------------+-----------+----------+
|       Id | MAX(Version) | FirstName | LastName |
+----------+--------------+-----------+----------+
|        0 |            1 | Old       | Name     |
+----------+--------------+-----------+----------+
As you can see, this is is a mix of the version 0 and 1 data (old names). "Is GROUP BY" the way to go? If not, how should I approach it? If so, what am I donig wrong? I'd appreciate a pointer towards the right direction. Thanks for the help.
User avatar
grooou
Forum Newbie
Posts: 8
Joined: Tue Jun 03, 2003 1:21 pm
Location: Portugal

Post by grooou »

please try with subselect:

Code: Select all

SELECT Id, Version, FirstName, LastName
FROM people
WHERE Id='0'
and Version = (select max(Version) from people where Id='0')
hedge
Forum Contributor
Posts: 234
Joined: Fri Aug 30, 2002 10:19 am
Location: Calgary, AB, Canada

Post by hedge »

That will work for the person with id=0, to get all people:

Code: Select all

SELECT Id, Version, FirstName, LastName 
  FROM people as p1
 WHERE Version = (select max(Version) from people as p2 where Id=p1.Id)
BTW, it's called a correlated subquery
Galahad
Forum Contributor
Posts: 111
Joined: Fri Jun 14, 2002 5:50 pm

Post by Galahad »

I tried:

Code: Select all

SELECT Id, Version, LastName, FirstName FROM people WHERE Id='0' AND Version=(SELECT MAX(Version) FROM people WHERE Id='0');
I got the following error:

Code: Select all

ERROR 1064: You have an error in your SQL syntax.  Check the manual that corresponds to your MySQL server version for the right syntax to use near 'SELECT MAX(Version) FROM people WHERE Id='0')' at line 1
The "SELECT MAX(Version)..." query works fine on it's own.

I also tried hedge's example, it gave me the same thing. I even tried:

Code: Select all

SELECT Id, Version, FirstName, LastName FROM people as p1 WHERE Version = (select 1);
but still got a syntax error. Are you sure that you can embed a select in another select statement? Have you actually done a "correlated subquery" before? Do you have to enable that in the configuration or something? I'm running MySQL 4.0.13, what versions are you guys running?
User avatar
grooou
Forum Newbie
Posts: 8
Joined: Tue Jun 03, 2003 1:21 pm
Location: Portugal

Post by grooou »

sorry for the wrong tip.

i'm not running mysql.

i believe that version 4.1 is going to allow subselects (or correlated subqueries). This is a quote from mysql.com:
Expanded support for subqueries
Subqueries allow you to use the result of one query as a component of a larger query. The MySQL server already supports some forms of this technique, such as INSERT INTO ... SELECT ..., and this support will be expanded in version 4.1 to include nested SELECT queries, which is one of the most-requested features from our users.
Galahad
Forum Contributor
Posts: 111
Joined: Fri Jun 14, 2002 5:50 pm

Post by Galahad »

Ah, well thanks for the help grooou. At least I know I'm not doing something stupid. Does anyone know if it is possible to do what I want to in MySQL?
User avatar
Stoker
Forum Regular
Posts: 782
Joined: Thu Jan 23, 2003 9:45 pm
Location: SWNY
Contact:

Post by Stoker »

hmm that is odd, I would have thought that your SELECT statement with that grouping should have worked, would be interresting to see an explanation of why it won't.. What mysql version do you use?

if you have multiple ID's in the database, is it all random which record is mixed or is it always the first found with the max of the last/highest ?
ayron
Forum Newbie
Posts: 14
Joined: Tue Jun 03, 2003 11:18 pm
Location: Perth, Australia

Post by ayron »

one workaround would be to get all the records of that person and then use php to determine the most recent version
User avatar
grooou
Forum Newbie
Posts: 8
Joined: Tue Jun 03, 2003 1:21 pm
Location: Portugal

Post by grooou »

a simple solution is to get all version's form the specific id with inverted ORDER BY, and fetch only the first row.

Code: Select all

SELECT id, version, firstname, lastname
FROM people
WHERE id =  '0'
ORDER BY version DESC
User avatar
Stoker
Forum Regular
Posts: 782
Joined: Thu Jan 23, 2003 9:45 pm
Location: SWNY
Contact:

Post by Stoker »

... it really shouldnt need any workaround , in my opinion in should work like it is, and I would be very interrested in knowing why it doesn't...
User avatar
phpScott
DevNet Resident
Posts: 1206
Joined: Wed Oct 09, 2002 6:51 pm
Location: Keele, U.K.

seems to work

Post by phpScott »

I did a quick bit of experminting on one of my db's and it allowed me to have multiple group by statements

so try this
SELECT Id, MAX(Version), FirstName, LastName FROM people WHERE Id='0' GROUP BY Id, Version;
phpScott
Galahad
Forum Contributor
Posts: 111
Joined: Fri Jun 14, 2002 5:50 pm

Post by Galahad »

I have worked around it using multiple queries and some work in php. Kind of disappointing since it seems like I ought to be able to do it all in MySQL, so I would still like to figure this out.

phpScott, that is a pretty good idea. However, when I try it, it still outputs both rows. I tried to do:

Code: Select all

SELECT Id, MAX(Version), FirstName, LastName FROM people WHERE Id='0' GROUP BY Id ORDER BY Version DESC;
I thought that might get it to return the new name instead of the old name, but it still returns the old name with the new version number (just like in a previous post of mine).

Stoker, as I said in a previous post, I'm running MySQL 4.0.13.
User avatar
Stoker
Forum Regular
Posts: 782
Joined: Thu Jan 23, 2003 9:45 pm
Location: SWNY
Contact:

Post by Stoker »

I did some tests on the subject and some reading, and it looks like if you can not include columns that are not agregated or grouped.. It is written surprisingly little about this in howtos and documentation.. Trying the same in Postgresql gives an error indicating that you can not include columns not grouped nor aggregated.. It makes sense in the cases of where you have max(col) and min(col) in the same query, or when avg() is used ..

It would be good to get this confirmed/commented by experienced SQL'ers/dba's....

so in your case the right way of doing it is either subquery or with <4.1 mysql do two queries, first get all the keys with (assuming yoru PK is composed of id and version)

SELECT id,MAX(version) FROM person GROUP BY (id)

and then look up the rest of the data with that result with another query..
Rob the R
Forum Contributor
Posts: 128
Joined: Wed Nov 06, 2002 2:25 pm
Location: Houston

Post by Rob the R »

Stoker is exactly right. Since your version of MySQL does not support subqueries (mine doesn't either - it's very annoying), the only way to do what you want is to do what Stoker has suggested. Yes, it's a pain to have to do run a new query for each row returned from the main query, but you just can't do it all in one query.
User avatar
Stoker
Forum Regular
Posts: 782
Joined: Thu Jan 23, 2003 9:45 pm
Location: SWNY
Contact:

Post by Stoker »

no need for a new query for each row returned in the first one, just create a WHERE with many OR's (A lot faster than multiple queries)
Post Reply