Making human readable, keyword rich strings for PHP 5
Posted: Mon Aug 06, 2007 3:16 am
Okay, so this is the new and improved PHP 5 version of the class I made for PHP 4.
I mainly use this class to make nice, human-readable, keyword-rich URLs (which the search engines love). However, it has other uses too, so I've dropped the class name safeurl and changed it to safe_string.
I've also added a _delimiter property, so you don't have to use hyphens (-) as the delimiter; you could use underscores, periods, or any other character.
This class turns user-generated strings, or strings pulled from a database, into strings stripped of everything but alphanumeric characters (plus any underscores and hyphens already in the text), with the words separated by the delimiter.
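To make the idea concrete before the full class, here is a minimal standalone sketch of the same kind of clean-up. The function name slug_sketch() and its parameters are made up for illustration and are not part of the class. It is also simplified: it assumes UTF-8 input and a non-empty delimiter, and it drops everything except letters, digits, and whitespace.

```php
<?php
// Hypothetical helper for illustration only; not part of the safe_string class.
function slug_sketch($string, $delimiter = '-', $maxlength = 50)
{
    // decode entities, lowercase, and strip HTML tags
    $s = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
    $s = strip_tags(strtolower($s));
    // spell out ampersands and drop apostrophes
    $s = str_replace(array('&', "'"), array(' and ', ''), $s);
    // remove everything that is not a letter, digit, or whitespace
    $s = preg_replace('/[^a-z0-9\s]/', '', $s);
    // reduce each run of whitespace to one space, then swap in the delimiter
    $s = preg_replace('/\s+/', ' ', trim($s));
    $s = str_replace(' ', $delimiter, $s);
    // chop to the maximum length, then back to the last whole word
    if (strlen($s) > $maxlength) {
        $s = substr($s, 0, $maxlength);
        $cut = strrpos($s, $delimiter);
        if ($cut !== false) {
            $s = substr($s, 0, $cut);
        }
    }
    return $s;
}

echo slug_sketch("Tom & Jerry's <b>Big</b> Day!"), "\n"; // tom-and-jerrys-big-day
echo slug_sketch('Hello, World', '_'), "\n";             // hello_world
```

The class below does the same job in separate steps, each of which can be switched on or off through a property.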
I also went ahead and ran the tests from the other topic, to ensure that this class produces the exact same results. The two tests and their results follow the class code, then an example of real world project usage. That example needs a mod_rewrite rule, and I find the two together help my search engine ranking positions.
Class safe_string Code
Code: Select all
<?php
/*
** This PHP5 class will turn strings that may be user generated,
** or pulled from a database, containing HTML or other special
** characters into strings that are human readable, keyword rich, and safe
** for passing as URLs. Very useful in combination with Apache's mod_rewrite module.
**
** --------------------------------------------------------------------------------
** This is updated from the PHP4 safeurl() class.
** Changes: All object properties are set to private.
** Delimiter can be chosen, instead of being forced to be a hyphen (-).
** The main function name make_safe_url() has been changed to just make_safe(), because
** there are many other valid uses for this class other than just using the strings as
** URLs.
** Class has been broken up into many methods each performing a specific task, rather
** than being thrown into one method.
** --------------------------------------------------------------------------------
**
** Author - << smp_info _at_ yahoo _dot_ com >>
** Date - Monday, August 6th, 2007
*/
class safe_string
{
/*
** Set this to false if your string has already been cleaned of entities
** @access private
** @bool $_decode
*/
private $_decode = true;
/*
** If $_decode is set to true, this will be the character encoding set that
** will be used to decode strings. Defaults to ISO-8859-1, which was PHP's default
** when this was written (PHP 5.4 and later default to UTF-8).
** @access private
** @str $_decode_charset
*/
private $_decode_charset = 'ISO-8859-1';
/*
** Decides whether or not to leave the string how it is, or to lowercase all letters.
** Defaults to true, lowercasing of all letters.
** @access private
** @bool $_lowercase
*/
private $_lowercase = true;
/*
** If your string has HTML in it, this will strip it out. true = strip html,
** false = don't strip html
** @access private
** @bool $_strip
*/
private $_strip = true;
/*
** Sets the maximum length of characters in the returned string.
** @access private
** @int $_maxlength
*/
private $_maxlength = 50;
/*
** Decides whether or not to chop the result string at the last whole word separated by
** $this->_delimiter.
** @access private
** @bool $_whole_word
*/
private $_whole_word = true;
/*
** Used as a delimiter between words. Can be any character.
** @access private
** @str $_delimiter
*/
private $_delimiter = '-';
/*
** Default string to use if no alphanumeric characters can be found in the string
** @access private
** @str $_blank
*/
private $_blank = 'no-title';
/*
** Container for our output string
** @access private
** @str $_output
*/
private $_output;
/*
** Method to decode the given string of entities
** @access private
*/
private function _decode_string()
{
if($this->_decode)
{
$this->_output = html_entity_decode($this->_output, ENT_QUOTES, $this->_decode_charset);
}
}
/*
** Method to lowercase the string
** @access private
*/
private function _lowercase_string()
{
if($this->_lowercase)
{
$this->_output = strtolower($this->_output);
}
}
/*
** Method to strip the string of html tags
** @access private
*/
private function _strip_string()
{
if($this->_strip)
{
$this->_output = strip_tags($this->_output);
}
}
/*
** Method to filter the string of invalid characters, replace &, whitespace, and apostrophes,
** and to collapse multiple occurrences of $this->_delimiter.
** @access private
*/
private function _filter_string()
{
//filter out invalid characters; the hyphen sits last in the class so it is read literally, not as a range
$this->_output = preg_replace("/[^&a-z0-9_\s'-]/i", '', $this->_output);
//replace & with ' and ', remove apostrophes, and reduce each run of whitespace to one space
$this->_output = str_replace(array('&', '\''), array(' and ', ''), $this->_output);
$this->_output = preg_replace('/\s+/', ' ', $this->_output);
//replace spaces with $this->_delimiter
$this->_output = str_replace(' ', $this->_delimiter, $this->_output);
//replace multiple occurrences of $this->_delimiter, then trim it from both ends
//(preg_quote() is told about the "/" pattern delimiter, so a "/" delimiter is escaped too)
$this->_output = trim(preg_replace("/(?:" . preg_quote($this->_delimiter, '/') . "){2,}/", $this->_delimiter, $this->_output), $this->_delimiter);
}
/*
** Method to chop the string to $this->_maxlength characters
** @access private
*/
private function _chop_string()
{
if(strlen($this->_output) > $this->_maxlength)
{
$this->_output = substr($this->_output, 0, $this->_maxlength);
$this->_whole_word_string();
}
}
/*
** Method to chop the string at the last whole word separated by $this->_delimiter
** @access private
*/
private function _whole_word_string()
{
if($this->_whole_word)
{
$words = explode($this->_delimiter, $this->_output);
//drop only the last, possibly cut-off word; array_diff() would also remove earlier words equal to it
if(count($words) > 1)
{
array_pop($words);
}
$this->_output = implode($this->_delimiter, $words);
}
}
/*
** Method that simply runs through the list of methods to prepare $this->_output
** @access private
** @param str $string
** @return str $this->_output
*/
private function _run($string)
{
$this->_output = $string;
$this->_decode_string();
$this->_lowercase_string();
$this->_strip_string();
$this->_filter_string();
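$this->_chop_string();
//fall back to $this->_blank if nothing usable is left in the string
if($this->_output === '')
{
$this->_output = $this->_blank;
}
return $this->_output;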
}
/*
** Method to call the _run() method, and return $this->_output string
** @access public
** @param str $string
** @return string $this->_output
*/
public function make_safe($string)
{
return $this->_run($string);
}
/*
** Method to allow changing of private properties
** @access public
** @param str $property
** @param mixed $value
*/
public function __set($property, $value)
{
$this->$property = $value;
}
}
Test One
Code: Select all
$safe_string = new safe_string();
$tests = array(
'i\'m a test string!! do u like me. or not......., billy bob!!@#',
'<b>some HTML</b> in <i>here</i>!!~',
'i!@#*#@ l#*(*(#**$*o**(*^v^*(e d//////e\\\\\\\\v,,,,,,,,,,n%$#@!~e*(+=t',
'A lOng String wiTh a buNchess of words thats! should be -chopped- at the last whole word'
);
foreach($tests AS $test)
{
echo $safe_string->make_safe($test) . '<br />';
}
Test One Result
Code: Select all
im-a-test-string-do-u-like-me-or-not-billy-bob
some-html-in-here
i-love-devnet
a-long-string-with-a-bunchess-of-words-thats
Test Two
Code: Select all
$safe_string = new safe_string();
//we'll change a few object properties
$safe_string->_lowercase = false;
$safe_string->_whole_word = false;
$tests = array(
'i\'m a test string!! do u like me. or not......., billy bob!!@#',
'<b>some HTML</b> in <i>here</i>!!~',
'i!@#*#@ l#*(*(#**$*o**(*^v^*(e d//////e\\\\\\\\v,,,,,,,,,,n%$#@!~e*(+=t',
'A lOng String wiTh a buNchess of words thats! should be -chopped- at the last whole word'
);
foreach($tests AS $test)
{
echo $safe_string->make_safe($test) . '<br />';
}
Test Two Result
Code: Select all
im-a-test-string-do-u-like-me-or-not-billy-bob
some-HTML-in-here
i-love-devnet
A-lOng-String-wiTh-a-buNchess-of-words-thats-shoul
Real world project usage
Code: Select all
$safe_string = new safe_string();
echo '<a href="blog/jimbob/12/' . $safe_string->make_safe($dba['blog_title']) . '.html">' . $dba['blog_title'] . '</a>';
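On the receiving end, the keyword part of the URL is only there for readers and search engines; the script still needs the ID to find the post. Below is a rough sketch of how that could look, assuming (my guess, not stated above) that "jimbob" is the blog owner and "12" is the post ID. The function name parse_blog_path() and the script name blog.php in the comment are hypothetical.

```php
<?php
// Hypothetical helper; the URL layout follows the link built in the example above.
// An Apache mod_rewrite rule along these lines (blog.php is an assumed script name)
// could route such URLs to one script:
//   RewriteRule ^blog/([^/]+)/([0-9]+)/[^/]+\.html$ blog.php?user=$1&id=$2 [L]
function parse_blog_path($path)
{
    // owner name, numeric ID, then the keyword string, which is not needed for the lookup
    if (preg_match('#^blog/([^/]+)/([0-9]+)/([^/]+)\.html$#', $path, $m)) {
        return array('user' => $m[1], 'id' => (int) $m[2], 'slug' => $m[3]);
    }
    // anything else is not a blog post URL
    return null;
}

print_r(parse_blog_path('blog/jimbob/12/my-first-post.html'));
```

Because the lookup uses only the ID, an old link keeps working even if the title, and therefore the keyword string, changes later.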