How to optimize this setup?

Questions about the MySQL, PostgreSQL, and most other databases, as well as using it with PHP can be asked here.

Moderator: General Moderators

Post Reply
bncplix
Forum Newbie
Posts: 7
Joined: Sat Feb 07, 2009 10:14 am

How to optimize this setup?

Post by bncplix »

Okay so as my mysql database increases, my speed drastically decreases.


So I have a table called follows. This table consists of id, userid and system id. All are int values.

Currently, this table has 2.5 million records, gaining anywhere between 30,000 and 200,000 a day.

I also have a table named System. This contains usernames, passwords as well as other fields such as "isvalid" and "followed"

So in my "processor" file, it would take in the current UserID of the person being processed. It would then select 5000 random usernames, passwords from SYSTEM where isvalid=1 and followed=0

It would then take the ID field from the selected data, and then perform a SELECT id FROM follows WHERE userid=USERID AND systemid=ID

So, if there is no record in the follows table, it allows it to go through. Otherwise, it will check the next one of the 5000.

You may be asking, why do it like this instead of a single joined query? The single query seemed to take about 5 minutes just to find the appropriate data, before even processing it. This way WAS faster, but as my database grows I am havign a problem.


I was suggested to add indexes to systemid, userid from follows, as well as username, password, isvalid and followed in the system table.

I was told writing data goes slightly slower, but its still faster than not using an index.

The follows table would get read from and writed to multiple times a second.
So using indexes write slower as they need to update the indexes each write.

My script is load balanced and can be run up to 4 times at once. What happens when they write at the same time and try to update the indexes at once?

It did seem to be a bit faster with just one running, but I need all 4 to be able to go, and run freely.

So is it not a good idea to use indexes with 4 processors having the potential to write at once?

Also, How would you suggest I optimize this?

Thanks for your time!
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: How to optimize this setup?

Post by VladSun »

Post your table CREATE scripts.
There are 10 types of people in this world, those who understand binary and those who don't
bncplix
Forum Newbie
Posts: 7
Joined: Sat Feb 07, 2009 10:14 am

Re: How to optimize this setup?

Post by bncplix »

VladSun wrote:Post your table CREATE scripts.
Is this what your talking about?:

Code: Select all

CREATE TABLE `follows` (
 `id` int(11) NOT NULL auto_increment,
 `userid` int(8) NOT NULL default '0',
 `systemid` int(8) NOT NULL default '0',
 PRIMARY KEY  (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=2951408 DEFAULT CHARSET=latin1
And:

Code: Select all

CREATE TABLE `system` (
 `id` int(10) NOT NULL auto_increment,
 `username` varchar(60) NOT NULL default '',
 `password` varchar(60) NOT NULL default '',
 `isvalid` int(2) NOT NULL default '1',
 `tweetsent` int(2) NOT NULL default '0',
 `followed` int(2) NOT NULL default '0',
 `checked` int(2) NOT NULL default '0',
 `haslist` int(2) NOT NULL default '0',
 `listid` int(11) NOT NULL default '0',
 `listfollows` int(10) NOT NULL default '0',
 PRIMARY KEY  (`id`),
 KEY `username` (`username`,`password`,`followed`,`isvalid`)
) ENGINE=MyISAM AUTO_INCREMENT=108998 DEFAULT CHARSET=latin1
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: How to optimize this setup?

Post by VladSun »

You don't have any indexes created - that's why your join query is sooo slow ;)
Create indexes for userid and systemid in your follows table.

Then run your join query with EXPLAIN clause.
There are 10 types of people in this world, those who understand binary and those who don't
bncplix
Forum Newbie
Posts: 7
Joined: Sat Feb 07, 2009 10:14 am

Re: How to optimize this setup?

Post by bncplix »

VladSun wrote:You don't have any indexes created - that's why your join query is sooo slow ;)
Create indexes for userid and systemid in your follows table.

Then run your join query with EXPLAIN clause.
I removed them as I wasnt sure if writing at the same time woulc cause a problem.


How can i join it into 1 query? Im no good at this.

I need one query that would pretty much do this

Select a random NUMBER of fields FROM system WHERE the selected system ID is not already in the FOLLOWS table with the current USERID

I was told to use this before:

Code: Select all

SELECT 
       system.id
     , system.username
     , system.password
     , system.followed
     , system.isvalid
     , follows.userid
     , follows.systemid
  FROM system
  LEFT JOIN follows
    ON system.id = follows.systemid
   AND follows.userid = 2 WHERE system.followed=0 AND system.isvalid=1
 ORDER BY RAND()
 LIMIT 10
Scratch that. The above part is not viable for what I need to do!
But this dosnt work properly or it is just EXTREMLY SLOW

Any help on this side?


So would I have a problem if I have multiple things writing data to a table that uses indexes? As each time it writes, it has to refresh the index
Last edited by bncplix on Mon Dec 21, 2009 4:33 pm, edited 1 time in total.
User avatar
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: How to optimize this setup?

Post by Eran »

ORDER BY RAND() is very slow. aside from that, you need to be hitting indexes as Vlad pointed out. Please post the EXPLAIN results on your query as he requested.
bncplix
Forum Newbie
Posts: 7
Joined: Sat Feb 07, 2009 10:14 am

Re: How to optimize this setup?

Post by bncplix »

pytrin wrote:ORDER BY RAND() is very slow. aside from that, you need to be hitting indexes as Vlad pointed out. Please post the EXPLAIN results on your query as he requested.
Heres the PHP code to basically show how I have it setup:

Code: Select all

 
$result = mysql_query("SELECT id as systemid, username as username, password as password FROM system WHERE isvalid=1 AND followed=0 ORDER BY RAND() LIMIT 5000");
  $thecount = 0;
  $count=mysql_num_rows($result);
if ($count > 0){
//there are some to process
while($accounts = mysql_fetch_assoc($result))
{
    $systemuser = $accounts['username'];
    $systempassword = $accounts['password'];
    $systemid = $accounts['systemid'];
    $resultfollows = mysql_query("SELECT id FROM follows WHERE userid='$userid' AND systemid='$systemid'");
$count=mysql_num_rows($resultfollows);
if ($count > 0){
}else{
//so if there isnt one there already, we can use it
//validate the username and password, if its not right, then we dont increment the number used, and one of then ext ones will be chosen
 
}
}
Thats pretty much how it works, but i took out some of the validation code i have

I mainly need this answered: If I have my table with over 2.5 millon records indexed, with multiple records being selected and inserted at the same time each SECOND, will I run into problems with the indexes being rebuilt?
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: How to optimize this setup?

Post by VladSun »

Your solution is bad.
Try to "fix" your DB as I said above.
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
daedalus__
DevNet Resident
Posts: 1925
Joined: Thu Feb 09, 2006 4:52 pm

Re: How to optimize this setup?

Post by daedalus__ »

i thought key was an alias for index?
User avatar
VladSun
DevNet Master
Posts: 4313
Joined: Wed Jun 27, 2007 9:44 am
Location: Sofia, Bulgaria

Re: How to optimize this setup?

Post by VladSun »

daedalus__ wrote:i thought key was an alias for index?
Yes, it is. But ...

[sql]... KEY `username` (`username`,`password`,`followed`,`isvalid`)[/sql]
there is only one INDEX defined and it's a multiple-column index - not needed in this case.
There are 10 types of people in this world, those who understand binary and those who don't
User avatar
daedalus__
DevNet Resident
Posts: 1925
Joined: Thu Feb 09, 2006 4:52 pm

Re: How to optimize this setup?

Post by daedalus__ »

i saw that but i thought a snicker would be innappropriate. :)
bncplix
Forum Newbie
Posts: 7
Joined: Sat Feb 07, 2009 10:14 am

Re: How to optimize this setup?

Post by bncplix »

I have reworked how things work and it seems MUCH faster now.


Now I added a temp table, and all the new data is inserted from here. The original one has the indexes, and it is only read from. Every day at 3AM, all the data from the temp table is dumped into the main one and the temp is wiped. Everything seems to be top notch speed right now and it is back to dumping over 100,000 a day.


Thanks for the help.
Post Reply