Page 1 of 1

How to optimize this setup?

Posted: Mon Dec 21, 2009 3:05 pm
by bncplix
Okay so as my mysql database increases, my speed drastically decreases.


So I have a table called follows. This table consists of id, userid and system id. All are int values.

Currently, this table has 2.5 million records, gaining anywhere between 30,000 and 200,000 a day.

I also have a table named System. This contains usernames, passwords as well as other fields such as "isvalid" and "followed"

So in my "processor" file, it would take in the current UserID of the person being processed. It would then select 5000 random usernames, passwords from SYSTEM where isvalid=1 and followed=0

It would then take the ID field from the selected data, and then perform a SELECT id FROM follows WHERE userid=USERID AND systemid=ID

So, if there is no record in the follows table, it allows it to go through. Otherwise, it will check the next one of the 5000.

You may be asking, why do it like this instead of a single joined query? The single query seemed to take about 5 minutes just to find the appropriate data, before even processing it. This way WAS faster, but as my database grows I am havign a problem.


I was suggested to add indexes to systemid, userid from follows, as well as username, password, isvalid and followed in the system table.

I was told writing data goes slightly slower, but its still faster than not using an index.

The follows table would get read from and writed to multiple times a second.
So using indexes write slower as they need to update the indexes each write.

My script is load balanced and can be run up to 4 times at once. What happens when they write at the same time and try to update the indexes at once?

It did seem to be a bit faster with just one running, but I need all 4 to be able to go, and run freely.

So is it not a good idea to use indexes with 4 processors having the potential to write at once?

Also, How would you suggest I optimize this?

Thanks for your time!

Re: How to optimize this setup?

Posted: Mon Dec 21, 2009 3:08 pm
by VladSun
Post your table CREATE scripts.

Re: How to optimize this setup?

Posted: Mon Dec 21, 2009 3:27 pm
by bncplix
VladSun wrote:Post your table CREATE scripts.
Is this what your talking about?:

Code: Select all

CREATE TABLE `follows` (
 `id` int(11) NOT NULL auto_increment,
 `userid` int(8) NOT NULL default '0',
 `systemid` int(8) NOT NULL default '0',
 PRIMARY KEY  (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=2951408 DEFAULT CHARSET=latin1
And:

Code: Select all

CREATE TABLE `system` (
 `id` int(10) NOT NULL auto_increment,
 `username` varchar(60) NOT NULL default '',
 `password` varchar(60) NOT NULL default '',
 `isvalid` int(2) NOT NULL default '1',
 `tweetsent` int(2) NOT NULL default '0',
 `followed` int(2) NOT NULL default '0',
 `checked` int(2) NOT NULL default '0',
 `haslist` int(2) NOT NULL default '0',
 `listid` int(11) NOT NULL default '0',
 `listfollows` int(10) NOT NULL default '0',
 PRIMARY KEY  (`id`),
 KEY `username` (`username`,`password`,`followed`,`isvalid`)
) ENGINE=MyISAM AUTO_INCREMENT=108998 DEFAULT CHARSET=latin1

Re: How to optimize this setup?

Posted: Mon Dec 21, 2009 3:53 pm
by VladSun
You don't have any indexes created - that's why your join query is sooo slow ;)
Create indexes for userid and systemid in your follows table.

Then run your join query with EXPLAIN clause.

Re: How to optimize this setup?

Posted: Mon Dec 21, 2009 4:12 pm
by bncplix
VladSun wrote:You don't have any indexes created - that's why your join query is sooo slow ;)
Create indexes for userid and systemid in your follows table.

Then run your join query with EXPLAIN clause.
I removed them as I wasnt sure if writing at the same time woulc cause a problem.


How can i join it into 1 query? Im no good at this.

I need one query that would pretty much do this

Select a random NUMBER of fields FROM system WHERE the selected system ID is not already in the FOLLOWS table with the current USERID

I was told to use this before:

Code: Select all

SELECT 
       system.id
     , system.username
     , system.password
     , system.followed
     , system.isvalid
     , follows.userid
     , follows.systemid
  FROM system
  LEFT JOIN follows
    ON system.id = follows.systemid
   AND follows.userid = 2 WHERE system.followed=0 AND system.isvalid=1
 ORDER BY RAND()
 LIMIT 10
Scratch that. The above part is not viable for what I need to do!
But this dosnt work properly or it is just EXTREMLY SLOW

Any help on this side?


So would I have a problem if I have multiple things writing data to a table that uses indexes? As each time it writes, it has to refresh the index

Re: How to optimize this setup?

Posted: Mon Dec 21, 2009 4:23 pm
by Eran
ORDER BY RAND() is very slow. aside from that, you need to be hitting indexes as Vlad pointed out. Please post the EXPLAIN results on your query as he requested.

Re: How to optimize this setup?

Posted: Mon Dec 21, 2009 4:36 pm
by bncplix
pytrin wrote:ORDER BY RAND() is very slow. aside from that, you need to be hitting indexes as Vlad pointed out. Please post the EXPLAIN results on your query as he requested.
Heres the PHP code to basically show how I have it setup:

Code: Select all

 
$result = mysql_query("SELECT id as systemid, username as username, password as password FROM system WHERE isvalid=1 AND followed=0 ORDER BY RAND() LIMIT 5000");
  $thecount = 0;
  $count=mysql_num_rows($result);
if ($count > 0){
//there are some to process
while($accounts = mysql_fetch_assoc($result))
{
    $systemuser = $accounts['username'];
    $systempassword = $accounts['password'];
    $systemid = $accounts['systemid'];
    $resultfollows = mysql_query("SELECT id FROM follows WHERE userid='$userid' AND systemid='$systemid'");
$count=mysql_num_rows($resultfollows);
if ($count > 0){
}else{
//so if there isnt one there already, we can use it
//validate the username and password, if its not right, then we dont increment the number used, and one of then ext ones will be chosen
 
}
}
Thats pretty much how it works, but i took out some of the validation code i have

I mainly need this answered: If I have my table with over 2.5 millon records indexed, with multiple records being selected and inserted at the same time each SECOND, will I run into problems with the indexes being rebuilt?

Re: How to optimize this setup?

Posted: Thu Dec 24, 2009 4:29 am
by VladSun
Your solution is bad.
Try to "fix" your DB as I said above.

Re: How to optimize this setup?

Posted: Thu Dec 24, 2009 9:15 am
by daedalus__
i thought key was an alias for index?

Re: How to optimize this setup?

Posted: Thu Dec 24, 2009 3:47 pm
by VladSun
daedalus__ wrote:i thought key was an alias for index?
Yes, it is. But ...

[sql]... KEY `username` (`username`,`password`,`followed`,`isvalid`)[/sql]
there is only one INDEX defined and it's a multiple-column index - not needed in this case.

Re: How to optimize this setup?

Posted: Thu Dec 24, 2009 7:46 pm
by daedalus__
i saw that but i thought a snicker would be innappropriate. :)

Re: How to optimize this setup?

Posted: Thu Dec 24, 2009 8:56 pm
by bncplix
I have reworked how things work and it seems MUCH faster now.


Now I added a temp table, and all the new data is inserted from here. The original one has the indexes, and it is only read from. Every day at 3AM, all the data from the temp table is dumped into the main one and the temp is wiped. Everything seems to be top notch speed right now and it is back to dumping over 100,000 a day.


Thanks for the help.