Page 1 of 1
How could I accomplish this? (Logical Question)
Posted: Mon Dec 22, 2003 5:55 pm
by skeptikal
Hello everyone, first of all let me introduce myself as this is my first post here. I hope to make new friends and learn a lot from you guys. I have some experience in programming, but I'm not a professional programmer, just do it for fun. I just started in PHP, and I have a quick question that's more a "logical" then a "technical" one. If this is the wrong forum, please move my message where it suits better
Ok here it goes...
I have a database that keeps track of my audio files. Basically it stores the album name and size (in mbs). I'd like to make a small PHP script that would divide my albuns in smaller lists of about 693-697mb, so that I know exactly what I should record on a CD without the need to manually think what's the best fit to get the most out of my 700mb media.
I'd like to know your ideas about how this could be done - making the script "think" the best combination of albums so that I can record my CDs using their total capacity...
So, can you help me? Please let me know if I wasn't clear enough and I'll try to explain my idea better
Thanks in advance!
SKEPTiKAL
Posted: Mon Dec 22, 2003 10:52 pm
by uberpolak
Cool idea, I hope someone knows how, now I want to know too!
Posted: Tue Dec 23, 2003 1:59 am
by lazy_yogi
seach the web for the knapsack problem
Posted: Tue Dec 23, 2003 2:03 am
by microthick
yep, this is a typical kind of problem we learn in school in our data algorithms class. it's all about optimization (by weight, or space, etc.)
like just mentioned, knapsack problem is a good example and google should give you plenty of results.
OoooH
Posted: Tue Dec 30, 2003 3:46 pm
by mwong
That IS very interesting!
When your list is created.....I"m guessing you have tons of songs....how will you gather the files to burn them? I'm assuming your not going to want to hunt and pick out each and every one of them.
Cd1- "hmm....track one is.......here......track two is.............."
*shrug* maybe you could have the script seperate them into folders..... that would be cool.
---btw----
You songs are in Mp3 format....are you going to somehow calculate the WAV format? Or are they going to stay in the MP3 format.....
Posted: Mon Jan 05, 2004 3:39 pm
by skeptikal
Hello everyone, thanks for the answers. As I said, I'm not a pro in programming so I didn't have the opportunity to go to classes and study the "knapsack" problem

Thanks for the tip, I'll search for it as soon as I get settled (just got back from a trip). I'll let you now if I was successfull or not!
And some of you asked me questions, so here are the answers...
-- When your list is created.....I"m guessing you have tons of songs....how will you gather the files to burn them?
I separate the audio files in folders by album in the "Artist - Album" format, so when I add an album on my database it checks for the folder size in mbs, etc... The sorting part itself is done manually right now, maybe I'll think about how to do that after I get this one solved
-- You songs are in Mp3 format....are you going to somehow calculate the WAV format? Or are they going to stay in the MP3 format.....
I'm going to stick with the MP3 format for now... These albums will be burnt on data cds. Yes, I do have a lot of music stored on my computer!
Thanks for all your help and see ya soon!
SKEPTiKAL
Posted: Thu Jan 15, 2004 8:47 pm
by ilovetoast
The knapsack problem is over-complicated for this scenario.
Briefly, the knapsack has a capacity in units of weight. Each item has a weight AND a benefit different from that weight. The basic idea is to maximize the sum of benefits while maintaining a sum of weights less than the capacity of the knapsack. This requires to an iterative solution.
His situation is much simpler than the knapsack problem. In his case, the CD is the knapsack and has a capacity of 700Mb. The items are the individual songs which have a size (wieght). There is no benefit measure for each song separate from their size.
Furthermore his target capaacity is not written in stone, he specifies a range of 693-697 Mb as a target capacity, rather than the theoretical target of 700.
Just write a select (with or without additional non-size parameters). Keep track of the running total in a variable. Loop until you get into the target range. You will likely need a separate query to get the last couple songs.
To be resource efficient you'll want to limit the number of DB selects.
Start with a check of song size. Look up a max and a min size if you want or hard code a guess. If you look up the max and min, you can actually average them and then recheck to see if the mean is in fact the median. If its way off the median (caused by a few very large songs) you could adjust downward and check again...if you want. If you really want, you could get all of the sizes and actually get the true median.
Once you have a reasonable mean or median size (or a hardcoded guess), time to grab a hunk of songs. Shoot for say 650mb to start and grab some songs.
Total up the size and get the capacity range left. Then loop through a series of selects with successively lower max sizes. As long as your target capcity is a range and your songs have a variety of standard song sizes, the chance this will fail is small. If it does, just have it start over -- grab a different hunk to start with of course.
Depending on how complicated you want to get, you could put together a GUI that allows you to specify say a genre or group. It then spits back all the songs available. It makes an initial album guess using the method described. Then, it presents you the song list and a number of alternatives based on your genre/group parameters.
With a little javascript and some checkboxes/listboxes/textboxes, you could have it allow you to modify the intial album reccomendations to replace (while staying inside the size guide) and specify a track order.
peace
Posted: Thu Jan 15, 2004 9:07 pm
by lazy_yogi
"loop through a series of selects with successively lower max sizes"
The algorithm you're describing is a 'greedy solution'.
The knapsack one finds the optimal solution. And I'm sure it's a very short amount of coded needed.
Posted: Thu Jan 22, 2004 3:49 pm
by skeptikal
Hi everyone, thanks for keeping the answers coming! After searching the Internet during some days and finding only theorical resources about the knapsack problem (I was expecting to find at least some programmed examples to take a look at), I pretty much gave up on that idea because it seemed overcomplicated for a humble "one-hour-a-week" programmer
ilovetoast's solution seemed very doable to me, and I think I'll give a go at his idea on this next weekend. My only concern is about getting into an endless loop trying to find the best solution to my problem, but I believe that would happen in a minor number of times (at least I hope so!), so I guess it will be my way to go. If you have any other ideas about how I could accomplish this goal, please keep them coming, since I can try merging the best parts of your ideas into a nice solution!
Once again, thanks for the info!
PS: If any of you guys know of a site where I can find the source code of a knapsack-type problem, please let me know. That might be just what I need to pull this off. Thanks!
Best Regards,
SKEPTiKAL
Posted: Fri Jan 23, 2004 3:46 pm
by skeptikal
Hey everyone, I just did it! Using ilovetoast's idea with some changes, I managed to create 700mb packages quickly and painlessly. System resources are almost not used, and since I'll be running that at my own home server it's not a concern to me anymore.
Now I've been thinking, and it would be cool to set an MP3 file's ID3 tag when adding an entry to my database. Is that interaction between PHP and a particular MP3 file possible? I think it probably isn't, but who knows...
Best Regards,
SKEPTiKAL
Posted: Sun Jan 25, 2004 12:22 pm
by werlop
hey, wanna share your code with us?
we can all learn by others examples
Posted: Sun Jan 25, 2004 1:09 pm
by microthick
Posted: Sun Jan 25, 2004 9:16 pm
by skeptikal
microthick, thaks for that class! However, I already found one that I think it's better since it has the ability to both read and write id3v1, id3v2 and ape tags. In case anyone else wants it, here goes the URL:
http://www.getid3.org
As for the code, I'm posting it here... I decided to implement an all-random code - I know most of you will think this isn't the best solution, but it works and since a 700mb cd can hold only a few albums (around 10), the random loop won't be too long. This is an adapted code that picks the cd information from an array instead of a database. Also, I'm brazilian so the prints and variable names will probably be uninteligible to you

It may not be 100% functional, but there goes!
Code: Select all
<?php
$cds = array("101","71","105","49","52","65","95","101","209","86","60","56","54","35","70","101","94","73","64","69");
rsort($cds);
reset($cds);
$total_albums = count($cds);
$satisfeito = 0;
/* uncomment to output information about all albums... a simple array contents display loop :)
for ($i = 0; $i < $total_albums; $i++) {
print("#".$i." - ".$cdsї$i]." mb<br>");
}
*/
do {
$sorteados = "-";
$ts = 0;
for ($i = 0; $i < $total_albums; $i++) {
$sorteado = rand(0,($total_albums-1));
$valor = $cdsї$sorteado];
if (substr_count($sorteados,"-".$sorteado."-") == 0 && $ts + $valor < 698) {
$sorteados .= $sorteado . "-";
$ts += $valor;
}
}
if ($ts > 690 && $ts < 698) {
$satisfeito = 1;
}
} while ($satisfeito == 0);
print("<br>Total: ".$ts."<br>Álbuns: ".$sorteados);
?>
If you have any questions about this code, just ask and I'll be glad to help!!
Best Regards,
SKEPTiKAL[/php_man]