Wanting To Optimize This Algorithm

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

Post Reply
supermike
Forum Contributor
Posts: 193
Joined: Tue Feb 28, 2006 8:30 pm
Location: Somewhere in the Desert, USA

Wanting To Optimize This Algorithm

Post by supermike »

Here's what I need to do in a nutshell. I'm under NDA and can't describe the client's request, and this is only one tiny piece of it that's bothering me -- one of the last few pieces.

Let's say you have a carton that holds 12 eggs. (T = 12, let's say.) And your job is to display as much of the different varieties in them as evenly as possible. Trouble is, you're only given a random bucket of eggs, each of them marked A, B, and C, and you may have in your bucket less than 12 eggs, more than 12 eggs, exactly 12 eggs, and also you may have like 14 As, 5 Bs and 18 Cs. Get it?

The only way I've figured out to solve this problem is to do it like so:

1. If A + B + C <= T, then show all of A, B, and C.
2. If not # 1, then do a loop:
a. Take 1 A if possible and put it in the carton.
b. Take 1 B if possible and put it in the carton.
c. Take 1 C if possible and put it in the carton.
d. Repeat.
e. Stop at any time when the carton hits T count of eggs.

Okay, but that gets to be processor expensive because let's say you have a gazillion users doing this in a minute -- wouldn't it be nice to get rid of that loop and use a few C functions and some Calculus, instead?

Here's the kludge I came up with:

Code: Select all

 
$nShowA = 0; $nShowB = 0; $nShowC = 0;
$nABucket = $nA; $nBBucket = $nB; $nCBucket = $nC;
if (($nA + $nB + $nC) <= $nTotal) {
    $nShowA = $nA;
    $nShowB = $nB;
    $nShowC = $nC;
} else {
    while (($nShowA + $nShowB + $nShowC) <= $nTotal) {
        if ($nABucket >= 1) {
            ++$nShowA;
            --$nABucket;
        }
        if (($nShowA + $nShowB + $nShowC) == $nTotal) {
            break;
        }
        if ($nBBucket >= 1) {
            ++$nShowB;
            --$nBBucket;
        }
        if (($nShowA + $nShowB + $nShowC) == $nTotal) {
            break;
        }
        if ($nCBucket >= 1) {
            ++$nShowC;
            --$nCBucket;
        }
        if (($nShowA + $nShowB + $nShowC) == $nTotal) {
            break;
        }
    }
}
 
[/size]

The end result is $nShowA, $nShowB, and $nShowC. The inputs were $nA, $nB, $nC, and $nTotal. Let's imagine that $nTotal = 12 in this case.

See how whacked out my while loop is with the if/then conditions? And you're probably thinking you can do this with some algebra and some calculus. Be my guest -- I want to learn from you.



And the only thing I can say under NDA to explain what's really going on is that A, B, and C are types of records, and Total is how many records are possible to be displayed in the dashboard gadget before someone has to click View All to see the rest of the records.
User avatar
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: Wanting To Optimize This Algorithm

Post by requinix »

So... if you wanted 12 from 14 A, 5 B, and 18 C, as evenly as possible, you'd come up with four of each, right?

12 over 3 is boring. Let's say T=29, A has 13, B has 8, C has 26, and D has 3.

Not calculus. No fancy equations or anything. If you don't have enough eggs, or just barely have enough, then the solution is obvious. Otherwise:
- The best spread would be an average of 7.25. But we can't cut eggs so we'll have to settle for 7 and deal with the remainder (1) later.
- D has to be handled immediately because it doesn't have 7 to give. So we take its 3.
- Since the data has changed (need 26 now and only 3 groups) we have to recalculate. Average is now 8 with remainder 2.
- All three groups can serve the 8 required, so now we just have 2 left.
- Since we have 3 groups and need only 2 eggs, just pull one egg from each group until we have as many as we need.
- Except we just took all of B's eggs just a minute ago so we actually only have A (5) and C (18). Take one from each.
The result is 3 (D) 8 (B) 9 (A) 9 (C).

Code: Select all

function f($need, $have) {
    $keep = array();
    
    // if we can barely serve the demand (or if we just plain can't) then
    // we already have the solution
    if ($need >= array_sum($have)) return $have;
 
    // otherwise we have to dish out what we can as evenly as possible
    $average = floor($need / count($have)); // "ideal" spread, without remainders
    asort($have); // neediest come first
    foreach ($have as $group => $number) {
        if ($number <= 0) { // they're dead, no point
        } else if ($number < $average) { // they don't ask for much: give it to them
            // we don't have to consider them anymore (they're happy now)
            // so forget about them. now our projections were slightly off -
            // recalculate (remember to actually give those guys their stuff though)
            unset($have[$group]);
            $keep = array_merge(array($group => $number), f($need - $number, $have));
            // calculations finished, we're practically done
            break;
        } else { // they want too much: give them the smallest amount and move on
            $keep[$group] = $average;
            $have[$group] -= $average;
        }
    }
    // just make sure we're not stiffing people
    $total = array_sum($keep);
    if ($total < $need) { // there's still some demand out there
        // spread the rest out over the remaining people - first come first serve
        foreach ($have as $group => $number) {
            if ($number <= 0) continue; // still dead
            $keep[$group]++;
            $total++; if ($total == $need) break;
        }
    }
    return $keep;
}
 
$need = 29;
$have = array("A" => 13, "B" => 8, "C" => 26, "D" => 3); // more entertaining
$keep = f($need, $have);
 
print_r($keep);
supermike
Forum Contributor
Posts: 193
Joined: Tue Feb 28, 2006 8:30 pm
Location: Somewhere in the Desert, USA

Re: Wanting To Optimize This Algorithm

Post by supermike »

That is simply fascinating. I'm still studying it and thinking.

Say, I'm trying $need numbers and get some odd results, but in whole the algorithm works.

$need = 2 --> you'll see that A stays 0, but then B and D get 1 each. I would have thought it would be A = 1, B = 1, C = 0, D = 0.

$need = 13 --> you'll see A = 3, B = 4, C = 3, D = 3 instead of what I might expect of A = 4, B = 3, C = 3, D = 3.

But in general, yes, this works well.

However, although it's an interesting read, there's a downside:

- 2 for loops
- recursion -- I heard that sometimes this is expensive in some cases? can anyone confirm in this case?
- several core PHP functions instead of just a few: array_sum (called twice (and then more because there's recursion)), floor, count, array_merge

If there's any opportunity to reduce loop count and less core PHP functions, and less lines, that would be even better.
User avatar
requinix
Spammer :|
Posts: 6617
Joined: Wed Oct 15, 2008 2:35 am
Location: WA, USA

Re: Wanting To Optimize This Algorithm

Post by requinix »

supermike wrote:$need = 2 --> you'll see that A stays 0, but then B and D get 1 each. I would have thought it would be A = 1, B = 1, C = 0, D = 0.
The asort in there sorts everything from least to most. So the ones with the fewest "eggs" come before those with more. D has the least, then B, then A, then C.

Say you have a very small window. If you look out, would you rather see the trees and houses and cars, or see the aliens that came to invade? It's a matter of uncommon having precedence over common.
supermike wrote:$need = 13 --> you'll see A = 3, B = 4, C = 3, D = 3 instead of what I might expect of A = 4, B = 3, C = 3, D = 3.
Same reason as above (remember: D had only 3).
supermike wrote:- 2 for loops
But they aren't nested. That's what matters.
supermike wrote:- recursion -- I heard that sometimes this is expensive in some cases? can anyone confirm in this case?
In some cases, yes. But for something as simple as this it's not a problem (the recursion level is proportional to the number of groups, not the number of "eggs").
supermike wrote:- several core PHP functions instead of just a few: array_sum (called twice (and then more because there's recursion)), floor, count, array_merge
Core functions are always faster than handmade functions (ie, the stuff we write).


If you really care about efficiency that much (which, by the way, means PHP isn't the best choice here) then it can be improved a little:
- counter on $keep instead of the final array_sum
- the recursion can be removed
but you're not gaining a lot. Really. Unless you're expecting millions of hits to this code every day it doesn't matter.

PS: My educated guess at complexity is O(n log n). The recursion would normally make it O(n^2) but the calling depth isn't directly proportional to group count.
PPS: You won't be able to make O(n) for this type of problem.
Post Reply