Help optimizing array comparison function...

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

Post Reply
tymlls05
Forum Commoner
Posts: 30
Joined: Tue Nov 01, 2005 1:30 pm

Help optimizing array comparison function...

Post by tymlls05 »

Each user has 2 sets of arrays with with 123 keys in each array. These arrays are stored in a table in MySQL. Each key has a numerical value ranging from 0-5. I am ranking users based on their array vs. others (each key compare to every other users). So for every user I have there are 226 keys challenging each users 226. These keys are not updated often after being set the first time so I could set certain rules like only running the challenge if the challenging user has actually updated their array.

What I am doing works fine on a small scale but I generated 100,00 users and upon generating the rankings it timed out after 30 seconds. I knew this would be a problem but i was antsy to get my algorithm for generating ranking in action. So now here I sit trying to figure out how to better store and compare the data while keeping it as close to real time as possible.
JakeJ
Forum Regular
Posts: 675
Joined: Thu Dec 10, 2009 6:27 pm

Re: Help optimizing array comparison function...

Post by JakeJ »

You might need to rethink your strategy. Talk to a statistician (maybe find a statistics geek in college) and see if there is a better way to perform the rankings than a direct comparison between each users ranking in each of the keys. I suspect there is but I couldn't even begin to tell you where to start.
User avatar
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Re: Help optimizing array comparison function...

Post by Benjamin »

Assuming the end result of your calculations simply generate an integer:

A vs B = 256
A vs C = 64
B vs C = 128

You can simply create an additional table:

Code: Select all

 
CREATE TABLE `scores` (
`user_id_a` INT UNSIGNED NOT NULL ,
`user_id_b` INT UNSIGNED NOT NULL ,
`score` INT NOT NULL
) ENGINE = InnoDB;
 
ALTER TABLE `scores` ADD UNIQUE `user_combination` ( `user_id_a` , `user_id_b` );
 
The table is InnoDB since it will have a significant number of records and so we desire row level locking. user_id_a and user_id_b are indexed together as type unique which will ensure that there is only one record for each user combination. Depending on the queries you run, you may need to index user_id_a and/or user_id_b separately as well.

The query to insert records could be something like:

Code: Select all

INSERT INTO scores (user_id_a, user_id_b, score) VALUES ('1','2','65') ON DUPLICATE KEY UPDATE score = VALUES(score);
So now to retrieve the scores of one user in relation to all of the others you can execute a simply query, which will be very fast, pretty much instantaneous.

Code: Select all

SELECT user_id_b, score FROM scores WHERE user_id_a = '1' ORDER BY score DESC
You'll need up populate this table by running comparisons each time a user saves their data. This process may take some time, however since you are only comparing a single user against every other user, rather than every user against every other user it should be somewhat quick. If in the end that does not work, you can flag accounts to be scored and score them using a cron job running in the background.

This implementation does have a few issues, ones that you'll need to sort out when creating your system.

Hope that helps.
mikosiko
Forum Regular
Posts: 757
Joined: Wed Jan 13, 2010 7:22 pm

Re: Help optimizing array comparison function...

Post by mikosiko »

without knowing how the ranking is determined is not easy to suggest some solution.
per example:
User A : 2,1,3,5,5 (just to show some of the elements)
User B : 3,0,5,5,1

in this case, how you determine which one has a higher rank?

either way, if the logic to determine the higher is clear and it is possible to implement some math function over each key to produce a factorized final number probably I'll implement something like this :
- Define a new field in the table...let say "Rank"
- Create table triggers (After Insert/update) to calculate the "Rank" over the record affected.
- The final ranking will be determined automatic without further calculations... just select the records order by "Rank"

just an idea
Miko
tymlls05
Forum Commoner
Posts: 30
Joined: Tue Nov 01, 2005 1:30 pm

Re: Help optimizing array comparison function...

Post by tymlls05 »

Thank you all for your replies. They have opened my eyes to some techniques I wouldn't have known of otherwise. The ranking is not based on score. There is no score for any one user.

Here is exactly what is happening...

Where keys 1,2,3,4,5,6 are unique id's within the array.

User One:
Array1([1]=>3 [2]=>4 [3]=>5 [4]=>2 [5]=>1 [6]=>5 )
Array2([1]=>2 [2]=>5 [3]=>5 [4]=>2 [5]=>3 [6]=>2 )

User Two:
Array1([1]=>3 [2]=>2 [3]=>1 [4]=>3 [5]=>4 [6]=>5 )
Array2([1]=>3 [2]=>4 [3]=>5 [4]=>2 [5]=>1 [6]=>5 )

The Comparison:
If User1Array1[1]==User2Array1[1] then $count++ to User1 & User2
If User1Array1[1]!=User2Array1[1] BUT if (User1Array2[1] or User2Array2[1]) plus or minus (User1Array1[1] or User2Array1[1]) then $count++ to User1 & User2
If neither are true. Regardless, count every key ($totalkeys) that was compared for the ranking.

So, after every key has been compared we simply find the percentage of User1 and User2 based on matching keys. $count/$totalkeys.

Let's say User1 & User2 came out to 12%. We have that saved, but I still have to find where user1 stands with each user registered. Then move on to user2 compared with everyone else, and so on, and so on, and yeah...

I hope this helps you all help me. I'll be more descriptive in my opening topic from now on.
Post Reply