How to optimize this setup?
Posted: Mon Dec 21, 2009 3:05 pm
Okay, so as my MySQL database grows, my speed drastically decreases.
So I have a table called follows. This table consists of id, userid and systemid, all INT columns.
Currently, this table has 2.5 million records, gaining anywhere between 30,000 and 200,000 a day.
I also have a table named system. It contains usernames and passwords, as well as other fields such as "isvalid" and "followed".
So my "processor" script takes in the current UserID of the person being processed. It then selects 5000 random username/password rows from system WHERE isvalid=1 AND followed=0.
It then takes the ID field from each selected row and performs a SELECT id FROM follows WHERE userid=USERID AND systemid=ID.
If there is no matching record in the follows table, that account goes through for processing; otherwise it checks the next one of the 5000.
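The loop described above looks roughly like this. This is only a sketch of my understanding of the setup: an in-memory SQLite database stands in for the MySQL one, the table and column names come from the post, and the sample rows and the `next_unfollowed` helper are made up for illustration (MySQL would use ORDER BY RAND() rather than RANDOM()).

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database described in the post;
# table and column names follow the post, sample data is assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE system (id INTEGER PRIMARY KEY, username TEXT, password TEXT,
                         isvalid INTEGER, followed INTEGER);
    CREATE TABLE follows (id INTEGER PRIMARY KEY, userid INTEGER, systemid INTEGER);
    INSERT INTO system VALUES (1, 'alice', 'pw1', 1, 0), (2, 'bob', 'pw2', 1, 0),
                              (3, 'carol', 'pw3', 0, 0);
    INSERT INTO follows VALUES (1, 42, 1);   -- user 42 already follows system 1
""")

def next_unfollowed(userid, batch=5000):
    # Grab a batch of valid, unfollowed accounts in random order.
    rows = conn.execute(
        "SELECT id, username, password FROM system "
        "WHERE isvalid = 1 AND followed = 0 ORDER BY RANDOM() LIMIT ?", (batch,)
    ).fetchall()
    for sysid, user, pw in rows:
        # One extra round trip per candidate: check the follows table.
        hit = conn.execute(
            "SELECT id FROM follows WHERE userid = ? AND systemid = ?",
            (userid, sysid)
        ).fetchone()
        if hit is None:
            return sysid, user, pw   # not yet followed: process this one
    return None

print(next_unfollowed(42))   # system 1 is skipped, so system 2 comes back
```

Each candidate costs one point lookup against follows, which is why the per-write and per-read cost of that table dominates as it grows.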
You may be asking: why do it like this instead of a single joined query? The single query seemed to take about 5 minutes just to find the appropriate data, before even processing it. This way WAS faster, but as my database grows I am having a problem.
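For reference, the single joined query being compared against would be an anti-join along these lines. The exact SQL I tried may have differed; this sketch again uses SQLite in place of MySQL, with assumed sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE system (id INTEGER PRIMARY KEY, username TEXT, password TEXT,
                         isvalid INTEGER, followed INTEGER);
    CREATE TABLE follows (id INTEGER PRIMARY KEY, userid INTEGER, systemid INTEGER);
    INSERT INTO system VALUES (1, 'alice', 'pw1', 1, 0), (2, 'bob', 'pw2', 1, 0);
    INSERT INTO follows VALUES (1, 42, 1);   -- user 42 already follows system 1
""")

# One anti-join instead of up to 5000 point lookups: valid, unfollowed
# system rows for which user 42 has no follows record.
rows = conn.execute("""
    SELECT s.id, s.username, s.password
    FROM system AS s
    LEFT JOIN follows AS f ON f.systemid = s.id AND f.userid = ?
    WHERE s.isvalid = 1 AND s.followed = 0 AND f.id IS NULL
    LIMIT 1
""", (42,)).fetchall()
print(rows)   # -> [(2, 'bob', 'pw2')]
```

Without indexes on follows (userid, systemid), a join like this has to scan millions of rows, which would explain the 5-minute runtime.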
It was suggested that I add indexes on systemid and userid in follows, as well as on username, password, isvalid and followed in the system table.
I was told writes go slightly slower, but it's still faster than not using an index.
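The suggested indexes would look roughly like this. The index names are made up, and SQLite again stands in for MySQL; the one change from the suggestion above is a single composite index on (userid, systemid), since that matches the exact WHERE clause of the follows lookup and can answer it from the index alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE system (id INTEGER PRIMARY KEY, username TEXT, password TEXT,
                         isvalid INTEGER, followed INTEGER);
    CREATE TABLE follows (id INTEGER PRIMARY KEY, userid INTEGER, systemid INTEGER);

    -- Composite index matching: WHERE userid = ? AND systemid = ?
    CREATE INDEX idx_follows_user_system ON follows (userid, systemid);
    -- Index matching the filter: WHERE isvalid = 1 AND followed = 0
    CREATE INDEX idx_system_valid_followed ON system (isvalid, followed);
""")

# The query plan confirms the lookup uses the index instead of a full scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM follows WHERE userid = 42 AND systemid = 7"
).fetchall()
print(plan)
```

Indexing username and password, by contrast, only helps if queries filter on those columns; they are not needed for the lookups described here.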
The follows table gets read from and written to multiple times a second.
So with indexes, writes are slower, since each write also has to update the indexes.
My script is load balanced and can run up to 4 instances at once. What happens when they write at the same time and all try to update the indexes at once?
It did seem a bit faster with just one instance running, but I need all 4 to be able to run freely.
So is it a bad idea to use indexes when 4 processes might write at once?
Also, how would you suggest I optimize this?
Thanks for your time!