mp3 filename separators

Posted: Fri Dec 16, 2011 7:47 pm
by egg82
I'm building an mp3 player addon that will play any mp3 file that you find on the web (no uploads).
Part of this is showing the title, artist, and album name for each file. Usually files like this carry ID3 tags that hold the track information.
Unfortunately a few things can happen:
1. The ID3 tags could be wrong
2. The ID3 tags could be missing

Sometimes the track information is in the file name, separated by some kind of symbol or symbols (e.g. "01 - title - artist.mp3").
I can break up the file name (or use the tags) and look up the information on a website called http://www.findthecd.com/ (which is easy with PHP).

So - in short - What I need are common separators (like " - ") used in mp3 filenames to denote the track information. I can't break a string with no point to break it at.
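
A minimal sketch of the kind of split being asked about, assuming a small set of whitespace-padded separators (dash, tilde, pipe) and underscores used in place of spaces. The function name `split_mp3_filename` and the separator list are illustrative assumptions, not a surveyed list of what real files use.

```php
<?php
// Hypothetical helper: split an mp3 file name into candidate fields.
// The separator set here is an assumption, not an exhaustive list.
function split_mp3_filename($filename) {
    // Drop any directory part and the .mp3 extension.
    $name = preg_replace('/\.mp3$/i', '', basename($filename));
    // Underscores often stand in for spaces.
    $name = str_replace('_', ' ', $name);
    // Split on a dash run, tilde, or pipe that has whitespace on both sides.
    $parts = preg_split('/\s+(?:-+|~|\|)\s+/', $name);
    // Trim each piece and discard empty ones.
    return array_values(array_filter(array_map('trim', $parts), 'strlen'));
}

print_r(split_mp3_filename("01 - title - artist.mp3"));
```

Requiring whitespace around the separator is a deliberate choice in this sketch: it leaves names such as "jay-z" intact, at the cost of missing files that use a bare "-" between fields. It also does not decide which piece is the artist and which is the title.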

Re: mp3 filename separators

Posted: Sat Dec 17, 2011 12:01 am
by Christopher
What is the question? You need common separators?

Re: mp3 filename separators

Posted: Sat Dec 17, 2011 1:02 am
by egg82
Yes. I just need to figure out where to split the file name (because the mp3 could have literally any title).
Obviously it's not possible to get them all, but if I could get 90-ish% of them, I can add some more as time goes on.

edit: Got the "search for correct metadata" done. Here's the PHP if anyone's interested. Not the greatest work in the world, but I had to re-invent the wheel while I was getting it to do what it's supposed to (their search engine SUCKS. It only really works with one field at a time)

<?php
// Reverse of san(): turn the !SERVER_...! placeholders back into the characters they stand for.
function unsan($string){
	$temparray = explode("!SERVER_QUESTION!", $string);
	$temptext = implode("?", $temparray);
	$temparray = explode("!SERVER_AMPERSAN!", $temptext);
	$temptext = implode("&", $temparray);
	$temparray = explode("!SERVER_HASH!", $temptext);
	$temptext = implode("#", $temparray);
	$temparray = explode("!SERVER_PERCENT!", $temptext);
	$temptext = implode("%", $temparray);
	$temparray = explode("!SERVER_PLUS!", $temptext);
	$temptext = implode("+", $temparray);
	$temparray = explode("!SERVER_OPEN!", $temptext);
	$temptext = implode("<", $temparray);
	$temparray = explode("!SERVER_CLOSE!", $temptext);
	return(implode(">", $temparray));
}
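// Join an array into one string and replace characters that would break the
// name=value&name=value output with placeholders; carriage returns are stripped.
function san($array){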
	$temptext = implode("!SERVER_SEPERATOR!", $array);
	$temparray = explode("?", $temptext);
	$temptext = implode("!SERVER_QUESTION!", $temparray);
	$temparray = explode("&", $temptext);
	$temptext = implode("!SERVER_AMPERSAN!", $temparray);
	$temparray = explode("#", $temptext);
	$temptext = implode("!SERVER_HASH!", $temparray);
	$temparray = explode("%", $temptext);
	$temptext = implode("!SERVER_PERCENT!", $temparray);
	$temparray = explode("+", $temptext);
	$temptext = implode("!SERVER_PLUS!", $temparray);
	$temparray = explode("<", $temptext);
	$temptext = implode("!SERVER_OPEN!", $temparray);
	$temparray = explode(">", $temptext);
	$temptext = implode("!SERVER_CLOSE!", $temparray);
	$temparray = explode("\r", $temptext);
	return(implode("", $temparray));
}

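// $artist, $title and $album are expected to be set before this point
// (the post does not show where they come from).
$artist = strtolower(strip_tags(unsan($artist)));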
$artist = preg_replace('/\s*\([^)]*\)/', '', $artist);
$artist = trim(preg_replace("/[^a-z0-9\s]/", "", $artist));
$title = strtolower(strip_tags(unsan($title)));
$title = preg_replace('/\s*\([^)]*\)/', '', $title);
$title = trim(preg_replace("/[^a-z0-9\s]/", "", $title));
$album = strtolower(strip_tags(unsan($album)));
$album = preg_replace('/\s*\([^)]*\)/', '', $album);
$album = trim(preg_replace("/[^a-z0-9\s]/", "", $album));

$artist_array = array();
$title_array = array();
$album_array = array();

$song_artist = "";
$song_title = "";
$song_album = "";
if(isset($artist) and $artist != ""){
	$continue = true;
	$skip=1;
	while($continue == true){
		$readfile = file_get_contents("http://www.findthecd.com/?artist=".str_replace(" ", "+", $artist)."&skip=".$skip);
		if($readfile != false){
			$readarray = explode("</tr>", $readfile);
			if(count($readarray) > 1){
				for($i=1;$i<count($readarray)-1;$i++){
					$songarray = explode("</td>", $readarray[$i]);
					$song_artist = strtolower(strip_tags($songarray[0]));
					$song_artist = preg_replace('/\s*\([^)]*\)/', '', $song_artist);
					$song_artist = trim(preg_replace("/[^a-z0-9\s]/", "", $song_artist));
					$song_title = strtolower(strip_tags($songarray[1]));
					$song_title = preg_replace('/\s*\([^)]*\)/', '', $song_title);
					$song_title = trim(preg_replace("/[^a-z0-9\s]/", "", $song_title));
					if(isset($song_artist) and $song_artist != "" and isset($song_title) and $song_title != ""){
						if(isset($title) and $title != ""){
							if(strpos($song_title, $title) !== false){
								array_push($artist_array, $song_artist);
								array_push($title_array, $song_title);
								array_push($album_array, "");
							}
						}else{
							array_push($artist_array, $song_artist);
							array_push($title_array, $song_title);
							array_push($album_array, "");
						}
					}
				}
			}else{
				$continue = false;
			}
		}
		if($skip<100){
			$skip += 50;
		}else{
			$continue = false;
		}
	}
}

$song_artist = "";
$song_title = "";
$song_album = "";
if(isset($title) and $title != ""){
	$continue = true;
	$skip=1;
	while($continue == true){
		$readfile = file_get_contents("http://www.findthecd.com/?track=".str_replace(" ", "+", $title)."&skip=".$skip);
		if($readfile != false){
			$readarray = explode("</tr>", $readfile);
			if(count($readarray) > 1){
				for($i=1;$i<count($readarray)-1;$i++){
					$songarray = explode("</td>", $readarray[$i]);
					$song_artist = strtolower(strip_tags($songarray[0]));
					$song_artist = preg_replace('/\s*\([^)]*\)/', '', $song_artist);
					$song_artist = trim(preg_replace("/[^a-z0-9\s]/", "", $song_artist));
					$song_album = strtolower(strip_tags($songarray[1]));
					$song_album = preg_replace('/\s*\([^)]*\)/', '', $song_album);
					$song_album = trim(preg_replace("/[^a-z0-9\s]/", "", $song_album)); // keep digits, matching the filter applied to $album
					if(isset($song_artist) and $song_artist != "" and isset($song_album) and $song_album != ""){
						if((isset($artist) and $artist != "") and (isset($album) and $album != "")){
							if((strpos($song_artist, $artist) !== false) and (strpos($song_album, $album) !== false)){
								array_push($artist_array, $song_artist);
								array_push($title_array, $title);
								array_push($album_array, $song_album);
							}
						}elseif(isset($artist) and $artist != ""){
							if(strpos($song_artist, $artist) !== false){
								array_push($artist_array, $song_artist);
								array_push($title_array, $title);
								array_push($album_array, $song_album);
							}
						}elseif(isset($album) and $album != ""){
							if(strpos($song_album, $album) !== false){
								array_push($artist_array, $song_artist);
								array_push($title_array, $title);
								array_push($album_array, $song_album);
							}
						}else{
							array_push($artist_array, $song_artist);
							array_push($title_array, $title);
							array_push($album_array, $song_album);
						}
					}
				}
			}else{
				$continue = false;
			}
		}
		if($skip<100){
			$skip += 50;
		}else{
			$continue = false;
		}
	}
}

$song_artist = "";
$song_title = "";
$song_album = "";
if(isset($album) and $album != ""){
	$continue = true;
	$skip=1;
	while($continue == true){
		$readfile = file_get_contents("http://www.findthecd.com/?album=".str_replace(" ", "+", $album)."&skip=".$skip);
		if($readfile != false){
			$readarray = explode("</tr>", $readfile);
			if(count($readarray) > 1){
				for($i=1;$i<count($readarray)-1;$i++){
					$songarray = explode("</td>", $readarray[$i]);
					$song_artist = strtolower(strip_tags($songarray[0]));
					$song_artist = preg_replace('/\s*\([^)]*\)/', '', $song_artist);
					$song_artist = trim(preg_replace("/[^a-z0-9\s]/", "", $song_artist));
					$song_album = strtolower(strip_tags($songarray[1]));
					$song_album = preg_replace('/\s*\([^)]*\)/', '', $song_album);
					$song_album = trim(preg_replace("/[^a-z0-9\s]/", "", $song_album)); // keep digits, matching the filter applied to $album
					if(isset($song_artist) and $song_artist != "" and isset($song_album) and $song_album != ""){
						if(isset($artist) and $artist != ""){
							if(strpos($song_artist, $artist) !== false){
								array_push($artist_array, $song_artist);
								array_push($title_array, "");
								array_push($album_array, $song_album);
							}
						}else{
							array_push($artist_array, $song_artist);
							array_push($title_array, "");
							array_push($album_array, $song_album);
						}
					}
				}
			}else{
				$continue = false;
			}
		}
		if($skip<100){
			$skip += 50;
		}else{
			$continue = false;
		}
	}
}

$artist_unique = array_unique($artist_array);
$title_unique = array_unique($title_array);
$album_unique = array_unique($album_array);

echo("artist=".san($artist_array));
echo("&title=".san($title_array));
echo("&album=".san($album_array));
echo("&artist_unique=".san($artist_unique));
echo("&title_unique=".san($title_unique));
echo("&album_unique=".san($album_unique));

echo("&returned=true");
?>
Another edit: All this does is return two sets of Flash-friendly arrays. One set holds all of the information gathered from the website; the other has the duplicates removed. Flash will do this:
1. Take the no-duplicate arrays and count each value's occurrences in the "duplicate" arrays
2. Compare the counts and take the value that pops up the most
3. Divide that occurrence count by the total and multiply by 100 (getting a "confidence" percentage)
4. Display all of the above and then play the song
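
The counting described above happens in Flash, but the same tally can be sketched in PHP for illustration. The helper name `most_likely` is hypothetical. Leaving the blank placeholder entries out of the total is an assumption, since the post does not say whether they count.

```php
<?php
// Hypothetical helper: pick the most frequent non-empty value and report
// its share of all non-empty values as a percentage.
function most_likely($values) {
    // Assumption: blank placeholders ("") do not count toward the total.
    $values = array_filter($values, 'strlen');
    if (count($values) == 0) {
        return array('value' => '', 'confidence' => 0.0);
    }
    $counts = array_count_values($values);        // value => occurrences
    $best = array_search(max($counts), $counts);  // first value with the top count
    return array(
        'value' => (string)$best,
        'confidence' => $counts[$best] / count($values) * 100.0,
    );
}

$result = most_likely(array("abbey road", "abbey road", "let it be", ""));
echo $result['value'] . " (" . round($result['confidence']) . "%)\n";
```

When two values tie for the top count, this sketch returns whichever appears first in the input; a real player might want to show both.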

Re: mp3 filename separators

Posted: Tue Dec 20, 2011 9:35 am
by egg82
*bump*
If anyone has a folder of mp3 files and could look through them to find some common ones, that would be great :)

Re: mp3 filename separators

Posted: Tue Dec 20, 2011 12:27 pm
by Weirdan
Individual folder contents won't necessarily help you. For example, I use '%artist%/[%year%] %album%/%track% - %title%' format, where slashes represent folder levels.
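
For a layout like this one, the fields come from the whole path rather than the file name alone. A rough sketch, assuming a four-digit year, a numeric track number, and an `.mp3` extension; the function name `parse_tagged_path` and the sample path in the usage line are made up for illustration.

```php
<?php
// Hypothetical helper: parse "artist/[year] album/track - title.mp3".
// Returns an array of fields, or null when the path has a different layout.
function parse_tagged_path($path) {
    // Assumes a 4-digit year and a numeric track number.
    $pattern = '#([^/]+)/\[(\d{4})\] ([^/]+)/(\d+) - ([^/]+)\.mp3$#i';
    if (!preg_match($pattern, $path, $m)) {
        return null;
    }
    return array(
        'artist' => $m[1],
        'year'   => $m[2],
        'album'  => $m[3],
        'track'  => $m[4],
        'title'  => $m[5],
    );
}

print_r(parse_tagged_path("Some Artist/[1999] Some Album/03 - Some Title.mp3"));
```

This only recognises one layout. Other directory schemes would each need their own pattern, which is the difficulty the post points at.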

Re: mp3 filename separators

Posted: Tue Dec 20, 2011 1:35 pm
by egg82
That's where I simply hope for the best. The application streams the mp3 from online sources instead of your computer, so I just hope there is some kind of logic in either the file name or the ID3 tags. If either exists, I can pull the rest of the information with the PHP file above. If not, the fields remain blank until the user manually enters the information (it's then saved in the DB, so later lookups pull from there first).