Paste the following into a file named: class.csv.php or whatever you want.
Code:
<?php
/**
*
* This class provides a high level interface for managing records stored in CSV format.
*
* This class is capable of parsing CSV files with an arbitrary number of fields as long
* as the number of fields matches the number of column names. This class is basically
* a wrapper around an associative array, with accessor/mutator functions as
* well as serialization and paginated listing functionality.
*
* Conventions used throughout this source text are as follows:
* - $args is a common parameter name which is an abbreviation for arguments. The parameter
* name typically is an associative array in the form: array('name'=>'value');
* where 'name' is the name of the column you wish to index and 'value' is the value you
* want to use, either in creating new records, matching existing records, etc.
*
* TODO:
* - Support for Unicode?
* - Column support, like SUM(column_name), ADD, DROP, INSERT column, etc...
*
* Requirements:
* - PHP 5.0.0 is required unless you provide an implementation of array_combine()
*
* @version 1.0.0
* @author Alex <nuweb1@hotmail.com>
* @license LGPL <http://www.gnu.org/licenses/lgpl.html>
* @copyright PCSpectra Apr.28.06
*
*/
class CxMyCSV{
/**
*
* The constructor is called to initialize the object.
*
* You instantiate this object to begin working with CSV files. Any single object is bound
* to a specific CSV format, therefore you should not load two files of differing column names
* using the same object instance. You can load more than one file with any object instance,
* just not two files which differ in column structure, names or count.
*
* @access public
*
* @param delim The string delimiter used to separate fields - default is comma
* @param arr_colnames An array used to explicitly assign field names to given columns
*
* @return void Constructors don't return values.
*
*/
function CxMyCSV($delim = ',', $arr_colnames = null) /* public */
{
// Check if user has explicitly defined column names - notify object if TRUE
if($arr_colnames !== null && count($arr_colnames)){
// The following member tells the class two things:
// 1) TRUE indicates we should check first line against names supplied in $arr_colnames
// 2) TRUE indicates we should *not* save column names when serializing array
$this->m_bExplicit = true;
$this->m_arrColNames = $arr_colnames;
}
$this->m_strCharDelim = $delim; // Standard CSV delimiter is comma
}
/**
*
* The load function is called to begin parsing a CSV file.
*
* You must call this function before calling any mutators and accessors
*
* @access public
*
* @param file_name The file name of the CSV file you wish to load.
*
* @return integer Returns the number of CSV records which were loaded
*
*/
function csvLoad($file_name) /* public */
{
$arr_records = null; // Associative array of CSV values
$fp = fopen($file_name, 'r');
flock($fp, LOCK_SH);
$buff = fread($fp, filesize($file_name));
flock($fp, LOCK_UN);
fclose($fp);
$arr = explode("\n", $buff); // We assume LF line endings, not CRLF
$arr_colnames = explode($this->m_strCharDelim, $arr[0]); // Store first line in temp buffer
// Strip whitespace from associative names caused by delimiter spacing
array_walk($arr_colnames, array('CxMyCSV', '_trim'));
// Current row index - start at zero, skipping the first line only when it holds the column names
$idx_row = 0;
if(!$this->m_bExplicit){ // Column names weren't supplied, so use first line as definition
$this->m_arrColNames = $arr_colnames;
$idx_row++; // Ignore first line when iterating
}
else{ // Column names were supplied by client programmer
if(count(array_diff($arr_colnames, $this->m_arrColNames)) == 0)
$idx_row++; // Ignore first line when iterating as it's identical to column names provided by client
}
//
// Iterate array of records returned from loading file and convert into associative array
$cnt = count($arr);
for($i=$idx_row; $i<$cnt; $i++){
$arr_tmp = explode($this->m_strCharDelim, $arr[$i]);
// Ignore records whose field count is not equal to column name count
if(count($arr_tmp) != count($this->m_arrColNames)) continue;
// Strip whitespace from associative values caused by delimiter spacing
array_walk($arr_tmp, array('CxMyCSV', '_trim'));
$arr_records[] = array_combine($this->m_arrColNames, $arr_tmp);
}
$this->m_arrValues = $arr_records;
return count($arr_records);
}
/**
*
* The save function is called to serialize any changes to the object.
*
* You must call this function to persist any changes you have made to the
* object using its mutators or accessors - or directly via its member variable (BAD)!!!
*
* You MUST supply the file name as the object does not store the original in a data
* member for design reasons. I also found having to explicitly set the output file
* handy during debugging and considered the functionality possibly handy during
* production development.
*
* @access public
*
* @param file_name The file name of the CSV file you wish to save to.
*
* @return bool This function should always return TRUE.
*
*/
function csvSave($file_name) /* public */
{
//
// Initialize buffer with column names
$buff = implode($this->m_strCharDelim, $this->m_arrColNames)."\n";
//
// Iterate array of records and write them to disk formatted as CSV
$cnt = count($this->m_arrValues);
for($i=0; $i<$cnt; $i++){
$arr_tmp = array_values($this->m_arrValues[$i]);
//
// Ignore records whose field count isn't equal to column name count
if(count($arr_tmp) != count($this->m_arrColNames)) continue;
$buff .= implode($this->m_strCharDelim, $arr_tmp)."\n";
}
$buff = rtrim($buff); // Chop off the trailing newline
$fp = fopen($file_name, 'w');
flock($fp, LOCK_EX);
fwrite($fp, $buff);
flock($fp, LOCK_UN);
fclose($fp);
return true;
}
/**
*
* The appendRow function is called to add a new record to the end of the array.
*
* This function is called when you wish to add a record onto the end of the array
* of existing records.
* The record information is passed in the form: $obj->appendRow(array('name' => 'value'));
* where name is the column name and value is the column value. You do not have control over
* where in the associative array your record is inserted, thus it is appended instead.
*
* @access public
*
* @param args An associative array specifying name/value pairs of record you wish to add
*
* @return bool This function returns TRUE on success or FALSE on failure.
*
*/
function appendRow($args) /* public */
{
return $this->_setRow($args);
}
/**
*
* The updateRow function is called to update an existing record.
*
* This function works like appendRow() but instead of adding a new record
* to the end of the array, you MUST supply a valid index for a record you wish to
* update.
*
* @access public
*
* @param idx The numeric index of the record you wish to update.
* @param args An associative array specifying name/value pairs of the record you wish to update
*
* @return bool This function returns TRUE on success or FALSE on failure.
*
*/
function updateRow($idx, $args) /* public */
{
if($idx < 0 || $idx > count($this->m_arrValues)-1) return false; // Make sure we're within array bounds
return $this->_setRow($args, $idx);
}
/**
*
* The deleteRow function is called to delete an existing record.
*
* This function deletes only a single record at a time and re-calculates the array
* indices, so our internal array can be indexed sequentially. The function accepts
* a single numeric index parameter which, if not known directly, can be discovered
* by calling findIndex() with your record match criteria.
*
* @access public
*
* @param idx The numeric index of the record you wish to remove.
*
* @return integer This function returns the index of the record removed, or -1 if the index is out of bounds.
*
*/
function deleteRow($idx) /* public */
{
if($idx < 0 || $idx > count($this->m_arrValues)-1) return -1; // Make sure we're within array bounds
array_splice($this->m_arrValues, $idx, 1); // Remove the record and re-index the array
return $idx;
}
/**
*
* The deleteAllRows function is called to remove a series of records.
*
* This function allows you to remove more than one record at a time and is supplied
* an associative array of column names and values to match against. Any records
* which match the search criteria are passed to deleteRow() which removes the record
* and re-calculates array indices.
*
* @access public
*
* @param args An associative array specifying name/value pairs of the records you wish to delete
*
* @return void This function does not return a value.
*
*/
function deleteAllRows($args) /* public */
{
/*
We cannot use a greedy search with findIndex because it returns
a cache of matching record indices and deleteRow() re-calculates
array indices, so the cached array of indices would lead us to
delete incorrect records. So instead we just loop until
findIndex returns -1 (no matching records).
*/
if(empty($args)) return; // Empty criteria make findIndex() return an array, never -1
$idx = $this->findIndex($args);
while($idx !== -1){
$this->deleteRow($idx); // Remove record and re-calculate indices
$idx = $this->findIndex($args); // Calculate next matching index - if any...
}
}
/**
*
* The swapRows function is called to swap two rows in the array.
*
* This function is called when you want to swap two record locations internally.
* The usefulness of this function is arguably limited, as listRows(), the primary
* function by which you will retrieve any data, is often passed additional filtering
* criteria such as sorting, which renders any swapping functionality somewhat
* useless. However this may be handy in situations where you wish to swap records
* physically on disk only, as this function directly manipulates the internal array,
* whereas listRows() returns a *copy* of the internal array filtered down according
* to criteria.
*
* @access public
*
* @param idx1 A numeric index of one of the records you wish to swap.
* @param idx2 A numeric index of the other record you wish to swap.
*
* @return bool This function returns TRUE on success or FALSE if indices are out of bounds
*
*/
function swapRows($idx1, $idx2) /* public */
{
// Make sure both indices are within array boundaries
$cnt = count($this->m_arrValues) - 1;
if($idx1 > $cnt || $idx2 > $cnt) return false;
$arr1 = $this->m_arrValues[$idx1];
$arr2 = $this->m_arrValues[$idx2];
// Swap records and we should be good to go :)
$this->m_arrValues[$idx1] = $arr2;
$this->m_arrValues[$idx2] = $arr1;
return true;
}
/**
*
* The listRows function is called to return a copy of the internal array.
*
* This function is called when you want to return a copy of the internal array and
* specify additional filtering to narrow your results. You can call this function
* without any filtering and it should return an exact copy of what currently exists
* in serialized state. Aside from the typical name/value arguments parameter, you can
* also optionally specify 2 more parameters which are useful in further filtering your results.
*
* These are:
* - filter is an associative array with special names to indicate additional filtering.
* + column_name: The name of the column which you wish to sort by
* + sort_order: Indicates sorting direction (ASC or DESC)
* + pull_limit: Limits the number of records returned
* + page_index: Zero-based index of the page from which to begin returning records
* - cb_func is a callback function, which when supplied offers you even greater control
* over which records are returned. You define the callback with a single parameter ($record)
* and you can then test that given record for any condition PHP allows.
* ie: if($record['column_name'] > 10000) return false;
*
* @access public
*
* @param args An associative array of arguments.
* @param filter An associative array containing specially named elements.
* @param cb_func A callback function which is passed the record currently being examined.
*
* @return array This function returns an array of the matching records
*
*/
function listRows($args = null, $filter = null, $cb_func = null) /* public */
{
$arr_idx = $this->findIndex($args, true); // Find all matches as array of indices
if(!is_array($arr_idx)) return array(); // findIndex() returns -1 when nothing matches
$arr_records = array(); // Records which pass callback test
//
// Iterate array of indices for matched records - applying optional callback
$cnt = count($arr_idx);
for($i=0; $i<$cnt; $i++){
$arr_tmp = &$this->m_arrValues[$arr_idx[$i]]; // Alias for easier handling
//
// Pass record to callback - if TRUE then add to list
if($cb_func !== null){
if($cb_func($arr_tmp))
$arr_records[] = $arr_tmp; // We add only those callback says are good to go
}
else
$arr_records[] = $arr_tmp; // We add every record
}
//
// Pass array of matching records as reference to sortRows()
if($filter !== null)
$this->sortRows($arr_records, $filter);
return $arr_records; // Return array of filtered records
}
/**
*
* The sortRows function is called to sort an associative array.
*
* You can call this function when you wish to sort any array which is in the format
* this class understands.
*
* Format:
* array(
* array('fname' => 'Alex', 'lname' => 'Barylski'),
* array('fname' => 'Keith', 'lname' => 'Anderson'),
* );
*
* @access public
*
* @param arr An associative array containing a list of records
* @param filter An associative array specifying name/value pairs of how to filter
*
* @return void This function does not return any value.
*
*/
function sortRows(&$arr, $filter) /* public */
{
//
// Initialize filter variables locally for easier handling - assigning defaults if required
// Column name which to sort by - first column by default
$column_name = !empty($filter['column_name']) ? $filter['column_name'] : $this->m_arrColNames[0];
// Direction of sort (ASC|DESC) - ASC by default
$sort_order = !empty($filter['sort_order']) ? $filter['sort_order'] : 'ASC';
// Number of pulled results to list - 10 by default, 0 disables pagination
$pull_limit = isset($filter['pull_limit']) ? $filter['pull_limit'] : 10;
// Page index - zero by default
$page_index = isset($filter['page_index']) ? $filter['page_index'] : 0;
// Sort array using quick sort algorithm, keeping indices sequential :)
$arr = $this->_quickSort($arr, $column_name, $sort_order);
//
// Implement trivial pagination
$idx = ($page_index*$pull_limit); // Zero equals no pagination support required...
if($pull_limit != 0){ // If limit is ZERO don't bother paginating
$pull_limit = (($pull_limit+$idx) > count($arr)) ? count($arr)-$idx : $pull_limit;
$arr = array_splice($arr, $idx, $pull_limit);
}
}
/**
*
* The findIndex function is called to determine if any given record already exists.
*
* You can call this function directly to determine if any given record exists based on
* basic name/value comparisons. If all the name/value pairs you provide in $args are matched
* when iterating records, you will be returned the matching record's index or an array of
* indices, depending on the value of $greedy.
*
* Notes: This function is a virtual function which can be overridden in derived classes which
* desire to offer more flexible searching/matching algorithms. For instance you could
* perhaps emulate a pseudo-SQL using nested arrays and thus conduct a far more comprehensive search.
*
* @access public
*
* @param args An associative array specifying name/value pairs of what to match
* @param greedy TRUE indicates function should keep searching even after a match, otherwise
* it returns immediately after the first match is made.
*
* @return array/scalar This function returns either an array of matched indices or the index
* of a single matched record, depending on $greedy. If $greedy is TRUE
* the function will match every record possible and return an array of matched
* indices. If $greedy is FALSE the function will return a scalar index
* of the first matched record. On failure this function always returns -1
*
*/
function findIndex($args, $greedy = false) /* public virtual */
{
//
// If no arguments are supplied return indices of all records
// this *REQUIRES* m_arrValues indices being kept sequential so we
// need to make sure updating, deleting, etc recalculates indices...
if(empty($args)) return array_keys((array)$this->m_arrValues);
$idx = -1; // No index(es) found by default
$arr = &$this->m_arrValues; // Alias member as local for easier handling
$cnt = count($arr);
for($i=0; $i<$cnt; $i++){ // Iterate array of records
$flag = true; // Assume a match until one pair fails
foreach($args as $key => $value){ // Iterate each client supplied name/value pair
if(!array_key_exists($key, $arr[$i]) || $arr[$i][$key] != $value){
$flag = false; // A single mismatch rules this record out
break;
}
}
if($flag){ // TRUE if above comparison of arbitrary name/value pairs were all matched
if($greedy){ // We keep checking array, otherwise stop after first match
if(!is_array($idx))
$idx = array(); // We need to explicitly tell PHP this is now an array
$idx[] = $i; // Push index of matched record onto stack
}
else{
$idx = $i; // We only need to record a scalar
break;
}
}
}
return $idx;
}
/**
*
* The _quickSort function is called by sortRows() to return an ordered array.
*
* This function implements the quick sort algorithm and is a slightly modified version
* of that found at the Wikibooks web site: http://en.wikibooks.org/wiki/Transwiki: ... ations#PHP
*
* Our current implementation is not as efficient as it could be, as the pivot is never
* randomized, but is always index: 0 which leads to unnecessary sorting work when
* the array is already sorted in the fashion you specify.
*
* @access protected
*
* @param arr An associative array.
* @param column_name Column name which we wish to sort by
* @param sort_order Direction in which we wish to sort by.
*
* @return array This function returns the sorted array
*
*/
function _quickSort($arr, $column_name, $sort_order) /* protected */
{
$cnt = count($arr);
//
// Make sure there is even a point to sorting
if($cnt > 1){
$pp = $arr[0]; // Initialize pivot point - we could calculate a median for better overall performance...
$ll = array(); // List left of pivot
$lr = array(); // List right of pivot
//
// Iterate array partitioning into left/right arrays or lists
for($i=1; $i<$cnt; $i++){
if($sort_order == 'ASC'){ // Sort ascending
if($arr[$i][$column_name] <= $pp[$column_name])
$ll[] = $arr[$i];
else
$lr[] = $arr[$i];
}
else{
if($arr[$i][$column_name] >= $pp[$column_name])
$ll[] = $arr[$i];
else
$lr[] = $arr[$i];
}
}
// Recursively sort left/right lists
$ll = $this->_quickSort($ll, $column_name, $sort_order);
$lr = $this->_quickSort($lr, $column_name, $sort_order);
return array_merge($ll, array($pp), $lr);
}
else
return $arr;
}
/**
*
* The _setRow function is called by updateRow() and appendRow()
*
* This function accepts an arbitrary number of parameters in args which are
* name/value pairs which correspond exactly to column names in array.
* This function does NOT validate the numeric indices against the internal array;
* it is left up to the caller to ensure valid indices are given.
*
* @access protected
*
* @param args An associative array of name/value pairs for record.
* @param idx Index, when specified indicates which record to update - otherwise append
*
* @return bool This function returns TRUE on success or FALSE on failure
*
*/
function _setRow($args, $idx = -1) /* protected */
{
if(count($args) == 0 || !$this->_checkColumnNames($args))
return false; // No data provided or user supplied an invalid column name
/*
We use a *temp arguments* array because the order of serialized arguments is
important (parsing relies on fixed layout) and if the user doesn't define each
argument as is physically defined on disk, the class will serialize in incorrect
order as well, as the class simply iterates each field and outputs them as is.
Because of this, we need to take the assigned arguments and reconstruct the record
based on the column names array, as it's initialized at load() time using the column names line.
*/
$tmp_args = null;
//
// Assign arguments default NULL values when they are not set explicitly
// or if we're updating an existing entry, use its previous data instead.
$arr = &$this->m_arrColNames; // Alias member for easier handling
$cnt = count($arr);
for($i=0; $i<$cnt; $i++){
$key = $arr[$i];
if(!array_key_exists($key, $args)){
if($idx == -1)
$tmp_args[$key] = null; // Assign NULL to missing argument
else{
$value = $this->m_arrValues[$idx][$key]; // Get existing data
$tmp_args[$key] = $value; // Use existing value
}
}
else
$tmp_args[$key] = $args[$key];
}
//
// Determine if we're creating a new entry or updating an existing one
if($idx == -1) // Create a new entry...
$this->m_arrValues[] = $tmp_args; // Assign arguments/values to array
else{ // Update an existing entry...
if($idx > count($this->m_arrValues)-1) return false; // Make sure we're within array bounds
$this->m_arrValues[$idx] = $tmp_args; // Re-Assign arguments/values to array
}
return true;
}
/**
*
* The _checkColumnNames function is called internally.
*
* This function is called when we need to verify the caller hasn't supplied any
* invalid or non-existent column names in the name index of an associative array.
* Because our internal array has a one to one relationship with our CSV file
* keeping this integrity is important.
*
* @access protected
*
* @param args An associative array of name/value pairs to validate
*
* @return bool This function returns TRUE on success or FALSE on failure
*
*/
function _checkColumnNames($args) /* protected */
{
//
// Make sure arguments don't use any column name not defined in column names array
foreach($args as $key => $value){
if(!in_array($key, $this->m_arrColNames)) return false;
}
return true;
}
//
// Quick and dirty static callback used in cleaning name/value pairs brought in from a file.
// This function is not worthy of further documentation, because if you can't figure this out
// you shouldn't be allowed near a computer without assistance - or your helmet :p
function _trim(&$value) /* static protected */
{ $value = trim($value); }
/* private */ var $m_bExplicit; // Indicates we should treat first line as field data, not column names
/* private */ var $m_arrValues; // Array of fields values
/* private */ var $m_arrColNames; // Array of fields names
/* private */ var $m_strCharDelim; // Character used to separate values
}
?>
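The matching rule that the findIndex() comments describe (a record counts as a match only when every name/value pair in $args matches) can be sketched on its own. This is a minimal standalone sketch rather than the class's code: record_matches() and find_indices() are hypothetical helper names, and the three records are borrowed from the test data further down.

```php
<?php
// Hypothetical helper: TRUE only if every name/value pair in $args
// is present in $record with a loosely equal (==) value.
function record_matches($record, $args)
{
    foreach ($args as $key => $value) {
        if (!array_key_exists($key, $record) || $record[$key] != $value) {
            return false; // one mismatch rules the record out
        }
    }
    return true;
}

// Hypothetical helper: indices of all records matching $args.
// Empty $args matches everything, like findIndex() with no criteria.
function find_indices($records, $args)
{
    $found = array();
    foreach ($records as $i => $record) {
        if (record_matches($record, $args)) {
            $found[] = $i;
        }
    }
    return $found;
}

$records = array(
    array('fname' => 'Keith', 'lname' => 'Anderson', 'sex' => 'Male'),
    array('fname' => 'Mike',  'lname' => 'Hettland', 'sex' => 'Male'),
    array('fname' => 'Mike',  'lname' => 'Anderson', 'sex' => 'Male'),
);

// Both pairs must match, so only the last record qualifies.
$matches = find_indices($records, array('fname' => 'Mike', 'lname' => 'Anderson'));
print_r($matches);
```

Checking every pair matters as soon as you pass more than one criterion: a check that only remembers the result for the last pair would also return Keith Anderson here.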
Test data, saved as testdata.csv:
Alex, Barylski, Male
Keith, Anderson, Male
Shawn, Maxwell, Male
Robert, Sokolewsky, Male
Jennifer, Brant, Female
Jamie, Alexiuk, Male
Mike, Anderson, Male
Wade, Lake, Male
John, Zytnyk, Male
Al, Barton, Male
Pete, McLean, Male
Brian, LaChance, Male
Dave, Beulieu, Male
Mike, Hettland, Male
Kristina, Creasey, Female
Brianne, Lane, Female
Sara, Golden, Female
Neil, Anderson, Male
Brian, Rarama, Male
Code:
<?php
include('class.csv.php');
$csv = new CxMyCSV(',', array('fname', 'lname', 'sex'));
$csv->csvLoad('testdata.csv');
$filter = array(
'column_name' => 'lname',
'sort_order' => 'ASC',
'pull_limit' => 0,
'page_index' => 1
);
echo '<pre>';
print_r($csv->listRows(null, $filter));
echo '</pre><br><Br>';
$csv->deleteAllRows(array('lname' => 'Anderson'));
$filter = array(
'column_name' => 'lname',
'sort_order' => 'ASC',
'pull_limit' => 3,
'page_index' => 6
);
echo '<pre>';
print_r($csv->listRows(null, $filter));
echo '</pre>';
$csv->csvSave('testdata2.csv');
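The pull_limit and page_index entries in $filter boil down to an offset of page_index * pull_limit into the sorted list. Here is a minimal standalone sketch of that arithmetic using array_slice(); page_of() is a hypothetical helper, not a method of the class, and the numbers stand in for sorted records.

```php
<?php
// Hypothetical helper: return one page of $rows.
// A $pull_limit of 0 means "no pagination", mirroring the first filter above.
function page_of($rows, $pull_limit, $page_index)
{
    if ($pull_limit == 0) {
        return $rows;
    }
    $offset = $page_index * $pull_limit; // page_index is zero-based
    return array_slice($rows, $offset, $pull_limit);
}

$rows = range(1, 16); // stand-in for 16 sorted records

$first = page_of($rows, 3, 0); // rows 1 to 3
$last  = page_of($rows, 3, 5); // only row 16 is left on this page
$past  = page_of($rows, 3, 6); // offset 18 is past the end
$all   = page_of($rows, 0, 1); // limit 0: everything, page_index ignored

print_r($first);
print_r($last);
print_r($past);
```

With the test data, removing the three Andersons should leave 16 of the 19 records, so a page_index of 6 with a pull_limit of 3 starts at offset 18 and the second listing comes back empty. Use a page_index between 0 and 5 to see actual rows.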
There is a feature I did not implement: this CSV class assumes LF ("\n") line endings when loading and writes them when outputting...however the code responsible for that is fairly easy to find, so if you need native Windows (CRLF) support...add it yourself until I get around to it...
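For the loading side, one way to stop caring which convention a file uses is to split on any of the three common line endings. This is a minimal sketch, not a patch to the class; split_lines() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: split a buffer into lines whether it uses
// LF (Unix), CRLF (Windows) or bare CR (old Mac) line endings.
// CRLF is listed first so it is consumed as one separator, not two.
function split_lines($buff)
{
    return preg_split('/\r\n|\r|\n/', $buff);
}

// A buffer that mixes all three conventions.
$lines = split_lines("fname,lname\r\nAlex,Barylski\nKeith,Anderson\rShawn,Maxwell");
print_r($lines);
```

In csvLoad() a call like this would stand in for explode("\n", $buff). On the save side you would pick the line ending you want when joining the rows, for example "\r\n" for Windows.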
The comments are a little messy and convoluted and the code could use some minor cleaning...
There is a TODO list in the module comment stub...but I'd also like to use Arborint's Tiny Unit Test script...so if anyone cares to set that up for me...as I've grown tired of this class already...
I would appreciate your time and efforts
One more thing in case you don't read the module comments...
This class depends on array_combine() so PHP 5 is required...unless you implement your own copy...
I am willing to bet someone has already written a PHP4 implementation of array_combine(), so if you check the comments, I'm sure you can find one
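For reference, a fallback along these lines is only a few lines long. This is a minimal sketch, assuming that returning FALSE for empty or mismatched input is acceptable for this class (PHP's own behaviour in those cases has varied between versions). compat_array_combine() is a hypothetical name; the guard at the bottom only defines array_combine() when PHP does not already provide it.

```php
<?php
// Hypothetical fallback: build an array using $keys for the keys
// and $values for the values, pairing them up by position.
function compat_array_combine($keys, $values)
{
    $keys   = array_values($keys);   // re-index so positions line up
    $values = array_values($values);
    $cnt    = count($keys);
    if ($cnt == 0 || $cnt != count($values)) {
        return false; // assumption: treat empty or mismatched input as failure
    }
    $out = array();
    for ($i = 0; $i < $cnt; $i++) {
        $out[$keys[$i]] = $values[$i];
    }
    return $out;
}

// Only step in when the built-in is missing (PHP 4).
if (!function_exists('array_combine')) {
    function array_combine($keys, $values)
    {
        return compat_array_combine($keys, $values);
    }
}

$rec = compat_array_combine(array('fname', 'lname'), array('Alex', 'Barylski'));
print_r($rec);
```

Including a file with this in it before class.csv.php should be enough, since the class only ever calls array_combine() with two arrays of equal length.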
Cheers