Page 1 of 2

[Solved]Query to find within Group

Posted: Thu Mar 08, 2007 2:08 am
by waqas_punjabian
Hi all,

I need a query which will select that group which have not even a single PrimaryFlag='y'

Code: Select all

tbl_images
------------------------------------------------------------
imgID	ProductID	ImageFile	PrimaryFlag
------------------------------------------------------------
1		    1		     a			   n
2		    1		     b			   n
3		    1		     c			   y

4		    2		     d			   n
5		    2		     e			   n
6		    2		     f			   n

7		    3		     g			   y
8		    3		     h			   n
9		    3		     i			   n

---------------------------------------------------

The result should be like below.

Code: Select all

------------------------------------------------------------
imgID	ProductID	ImageFile	PrimaryFlag
------------------------------------------------------------
4		    2		     d			   n
5		    2		     e			   n
6		    2		     f			   n
------------------------------------------------------------
Thanks in advance,
regards
Waqas

Posted: Thu Mar 08, 2007 2:56 am
by Christopher

Code: Select all

SELECT * FROM tbl_images WHERE ProductID=2 AND PrimaryFlag='n';
Or

Code: Select all

$id = 2;
$sql = "SELECT * FROM tbl_images WHERE ProductID=$id AND PrimaryFlag='n'";

Posted: Thu Mar 08, 2007 2:59 am
by mikeq
arborint wrote:

Code: Select all

SELECT * FROM tbl_images WHERE ProductID=2 AND PrimaryFlag='n';
I hope you are joking

Posted: Thu Mar 08, 2007 3:04 am
by Christopher
mikeq wrote:I hope you are joking
No ... I just misread the question. :(

Posted: Thu Mar 08, 2007 3:43 am
by volka
not really sure about it but try

Code: Select all

SELECT
  t1.*
FROM  
  `tbl_images` as t1
LEFT JOIN  
  `tbl_images` as t2
ON
  t1.ProductID=t2.ProductID
  AND t2.PrimaryFlag='y'
WHERE
  t1.PrimaryFlag='n'
  AND Isnull(t2.PrimaryFlag)
You need an index on ProductID and one on PrimaryFlag to make this query not looking too bad.

Posted: Thu Mar 08, 2007 4:27 am
by waqas_punjabian
Thanks volka,

This query worked for me in my scenario. But it's a little slower. May be due to, not having Indexing.

And i am sorry arborint, I didn't mention to do it specifically for ProductID=2

It was a general query.

Thanks anyways,

regards

Waqas

Posted: Thu Mar 08, 2007 4:32 am
by volka
Without indexing you have lots of full table scans, very bad for performance.

Posted: Thu Mar 08, 2007 5:49 am
by mikeq
volka wrote:Without indexing you have lots of full table scans, very bad for performance.
really dependent on the distribution of the unique values within the field. In this case if most of the flags are 'n', say something like 90% and you are searching for all 'n' flags then an index is of no use here and the optimiser ends up doing a full table scan anyway.

I cant remember of the top of my head, but its something like if the optimiser sees that more than 33% of the records will contain a certain value it will just switch to a full table scan.

However if you wanted to get all 'y' flags and they only account for something like 10% of total records then an index would benefit in this case on the flag field.

Placing indexes on these types of fields can hinder performance.

Whatever query you write you should always run an explain plan against it.

Posted: Thu Mar 08, 2007 6:51 am
by waqas_punjabian
Thanks Volka,

Now i have applied Indexing on this table it really shown a better performance.

i got execution time for this query

Before Indexing : 0.8628
After Indexing : 0.0183

It means, it's almost 45 times faster.
Amazingly Gr888.

Can you tell me is there any disadvantage of Indexing as well ?

I got a clue from mikeg explanation that whenever a scenario exist for 10% : 90%, where we need to search those records whose existance is 10%, there we should use Indexes. Is it right?

mikeq Wrote:
Whatever query you write you should always run an explain plan against it.
mikeg, Please tell me, what do you mean by running an explain plan ? I need some explanation.

Posted: Thu Mar 08, 2007 8:31 am
by mikeq
you just put explain at the beginning of the query, this would need to be run at the command line or in phpmyadmin

Code: Select all

EXPLAIN SELECT * FROM table1 WHERE whatever = 1
This will output all the details of how the query is run, order of tables, indexes used, where clauses used, temporary tables etc

Can really help optimise your queries for better performance. As a database grows you should rerun the explain plans for queries used in your applications.

Have a read through the mysql manual.

Posted: Thu Mar 08, 2007 9:55 am
by volka
mikeq wrote:I cant remember of the top of my head, but its something like if the optimiser sees that more than 33% of the records will contain a certain value it will just switch to a full table scan.
I'd really like to see the source of that ;) Anyway, in this case it was obviously the right thing to do.

Posted: Thu Mar 08, 2007 11:44 am
by mikeq
volka wrote:
mikeq wrote:I cant remember of the top of my head, but its something like if the optimiser sees that more than 33% of the records will contain a certain value it will just switch to a full table scan.
I'd really like to see the source of that ;) Anyway, in this case it was obviously the right thing to do.
If you go onto the mysql site and download their "webinars", there are a couple on optimising, it was in one of them. Straight from the mysql(horses) mouth.

Although initially it may appear that adding an index is the correct thing to do, as I mentioned explain plans should be run regularly, as this situation may change as the tables grow.

If he currently has a table that has small amounts of data then indexing may help, but a year down the when he has 20,000 records with an 'n' flag that represents the majortiy of the data then the index will be of little benefit when searching for an 'n' flag.

But then the explain plan will tell you when it is using the index and when it is using a full table scan.

From here http://dev.mysql.com/doc/refman/4.1/en/ ... dexes.html

"Sometimes MySQL does not use an index, even if one is available. One circumstance under which this occurs is when the optimizer estimates that using the index would require MySQL to access a very large percentage of the rows in the table. (In this case, a table scan is likely to be much faster because it requires fewer seeks.) However, if such a query uses LIMIT to retrieve only some of the rows, MySQL uses an index anyway, because it can much more quickly find the few rows to return in the result."

It doesnt mention the percentage, but if you download the webinar it does.

Posted: Thu Mar 08, 2007 12:36 pm
by mikeq
here is another decent article http://www.quepublishing.com/articles/a ... 77652&rl=1

a quote from there
Consider column cardinality. The cardinality of a column is the number of distinct values that it contains. For example, a column that contains the values 1, 3, 7, 4, 7, and 3 has a cardinality of four. Indexes work best for columns that have a high cardinality relative to the number of rows in the table (that is, columns that have many unique values and few duplicates). If a column contains many different age values, an index will differentiate rows readily. An index will not help for a column that is used to record sex and contains only the two values 'M' and 'F'. If the values occur about equally, you'll get about half of the rows whichever value you search for. Under these circumstances, the index might never be used at all, because the query optimizer generally skips an index in favor of a full table scan if it determines that a value occurs in a large percentage of a table's rows. The conventional wisdom for this percentage used to be "30%." Nowadays the optimizer is more complex and takes other factors into account, so the percentage is not the sole determinant of when MySQL prefers a scan over using an index.

Posted: Thu Mar 08, 2007 1:22 pm
by mikeq
and a variation on volkas query

Code: Select all

SELECT
  t1.*
FROM 
  `tbl_images` AS t1
WHERE
  t1.productid NOT IN
  (SELECT t2.productid
   FROM tbl_images AS t2
   WHERE t2.primaryflag = 'y')

Posted: Thu Mar 08, 2007 2:39 pm
by volka
Ok, thanks to the optimizer my suggested index on PrimaryFlag doesn't hurt but is useless and superfluous. Still keep the index on ProductID.