Sorting an array alphabetically, human-style

CobraCards
Forum Newbie
Posts: 13
Joined: Fri Feb 03, 2006 1:40 pm

Post by CobraCards »

I have some code that pulls records from a MySQL database and displays them in alphabetical order. This works just fine as far as a computer is concerned, but to a human the results look wrong. For example, take these four arbitrary phrases:

Lois Lane
Superman Robots
Superman, Man of Steel
The Fortress of Solitude

This is correct character-by-character alphabetical order, but (most) humans don't sort that way. I would say that articles (a, an, the) and punctuation are generally ignored, so the order I want is this:

The Fortress of Solitude
Lois Lane
Superman, Man of Steel
Superman Robots

How can I reorder an array like this?

Thanks!
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

You probably need to make a separate column in the table for the sort order. It could contain alternate text or simply a number, but you would sort on that column rather than the actual displayed text.
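
A minimal sketch of how the text for such a sort column might be generated in PHP when a record is saved. The function name make_sort_key() and the column name sort_title are hypothetical, not from the original post, and the rules (lowercase, drop one leading article, drop punctuation) are an assumption based on what the question asks for:

```php
<?php
// Hypothetical helper: builds the text to store in a separate sort column
// (for example a column named sort_title, which is an assumed name).
function make_sort_key($title)
{
    $key = strtolower($title);
    // Drop one leading article ("a", "an", "the") followed by whitespace.
    $key = preg_replace('/^(a|an|the)\s+/', '', $key);
    // Remove punctuation, keeping word characters and whitespace.
    $key = preg_replace('/[^\w\s]/', '', $key);
    // Collapse runs of whitespace left behind by the removals.
    $key = preg_replace('/\s+/', ' ', $key);
    return trim($key);
}

echo make_sort_key('The Fortress of Solitude'), "\n"; // fortress of solitude
echo make_sort_key('Superman, Man of Steel'), "\n";   // superman man of steel
```

The query would then order by the stored key (something like ORDER BY sort_title) while still displaying the original title.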
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

Alternatively, in PHP you could build a second array holding the text stripped of punctuation, pass it through natcasesort(), and then rearrange the records using the resulting key order. That said, performing the sort in the database will in all likelihood be the fastest option.
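
A rough sketch of that approach, using the titles from the question. Stripping a leading article as well as punctuation is an assumption added here to match the order the question asks for; the variable names are illustrative only:

```php
<?php
$titles = array('Superman Robots', 'Lois Lane', 'Superman, Man of Steel', 'The Fortress of Solitude');

// Build a stripped copy that keeps the original keys.
$stripped = array();
foreach ($titles as $k => $title) {
    // Assumption: also drop one leading article, not only punctuation.
    $noArticle = preg_replace('/^(a|an|the)\s+/i', '', $title);
    $stripped[$k] = trim(preg_replace('/[^\w\s]/', '', $noArticle));
}

// natcasesort() sorts case-insensitively in "natural" order and preserves keys.
natcasesort($stripped);

// Rebuild the original records in the sorted key order.
$sorted = array();
foreach (array_keys($stripped) as $k) {
    $sorted[] = $titles[$k];
}

print_r($sorted);
```

Because natcasesort() keeps the keys, the original records never have to be modified; only their order changes.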
bokehman
Forum Regular
Posts: 509
Joined: Wed May 11, 2005 2:33 am
Location: Alicante (Spain)

Post by bokehman »

Here's one possible method:

<?php 

$titles = array('Superman Robots', 'Lois Lane', 'Superman, Man of Steel', 'The Fortress of Solitude');
$noise_words = array('a', 'an', 'the');

order($titles, $noise_words);

function order(&$titles, $noise_words)
{
	$temp = $titles;
	foreach($noise_words as $k => $v)
	{
		$noise_words[$k] = preg_quote($v, '/');
	}
	$exp = '/\b('.implode('|', $noise_words).')\b/';
	foreach($temp as $k => $v)
	{
		$temp[$k] = trim(preg_replace('/[^\w\s]/', '', preg_replace($exp, '', strtolower($v))));
	}
	// Sort the stripped copies as strings, ascending; $titles is reordered to match.
	array_multisort($temp, SORT_ASC, SORT_STRING, $titles);
}

# test it
print_r($titles);

# prints: Array ( [0] => The Fortress of Solitude [1] => Lois Lane [2] => Superman, Man of Steel [3] => Superman Robots )

?>
timvw
DevNet Master
Posts: 4897
Joined: Mon Jan 19, 2004 11:11 pm
Location: Leuven, Belgium

Post by timvw »

AFAIK, 'alphabetical' sorting is locale-dependent. E.g., in Finnish the characters V and W are treated the same.
bokehman
Forum Regular
Posts: 509
Joined: Wed May 11, 2005 2:33 am
Location: Alicante (Spain)

Post by bokehman »

timvw wrote: AFAIK, 'alphabetical' sorting is locale-dependent. E.g., in Finnish the characters V and W are treated the same.
I didn't see anything about Finnish in the original post; the examples were in English. Nevertheless the following takes the locale into account:

function order(&$titles, $noise_words, $locale = array('esp'))
{
	setlocale(LC_ALL, $locale);
	$temp = $titles;
	foreach($noise_words as $k => $v)
	{
		$noise_words[$k] = preg_quote($v, '/');
	}
	$exp = '/\b('.implode('|', $noise_words).')\b/';
	foreach($temp as $k => $v)
	{
		$temp[$k] = trim(preg_replace('/[^\w\s]/', '', preg_replace($exp, '', strtolower($v))));
	}
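	// Sort the stripped copies using the current locale's collation, keeping keys.
	asort($temp, SORT_LOCALE_STRING);
	// Rebuild $titles in the sorted key order.
	$rtn = array();
	foreach(array_keys($temp) as $k)
	{
		$rtn[] = $titles[$k];
	}
	$titles = $rtn;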
}