csv library - a project I may actually finish

Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

I'm putting together a library to read and write CSV files. I realize this is a pretty trivial task, but I've added some neat features. As far as design goes, I borrowed heavily from Python's csv module. Here are some things that are already possible with the code I've written:

The simplest possible way to read a CSV file:

Code: Select all

$reader = new Csv_Reader('/path/to/myfile.csv');
foreach ($reader as $row) {
    // do something.. $row now contains an array of the current row
}
A few other examples

Code: Select all

$reader = new Csv_Reader('/path/to/myfile.csv');
echo count($reader); // outputs total lines
echo $reader->count(); // does the same thing
$header = $reader->current(); // grabs first row
while ($row = $reader->current()) { // starts from second row
    // do something
}
 
$dialect = new Csv_Dialect_Excel;
$reader = new Csv_Reader('/path/to/file', $dialect); // now reader will read excel csv file properly (this does not mean xls)
foreach ($reader as $row) {
    // do something
}
 
$dialect->delimiter = "\t";
$writer = new Csv_Writer('/path/to/file', $dialect);
foreach ($reader as $row) {
    $writer->writeRow($row);
}
// or you can do this
// $writer->writeRows($reader);
$writer->close(); // writes csv file but now it's tab-delimited
 
try {
    $reader = new Csv_Reader('/path/to/file');
} catch (Exception $e) {
    // could not open file
}
 
$dialect = new Csv_Dialect;
$dialect->quoting = Csv_Dialect::QUOTE_NONNUMERIC; // options are listed here: http://docs.python.org/lib/csv-contents ... v-contents
$dialect->delimiter = ";";
$dialect->lineterminator = "\r";
$dialect->quotechar = "'";
$dialect->escapechar = "\\";
$dialect->skipblanklines = false;
 
$reader = new Csv_Reader('/path/to/file', $dialect); // now this reader will read according to $dialect
$writer = new Csv_Writer('/path/to/file', $dialect); // now this writer will write according to $dialect
I'm just looking for advice on how to improve the API, what features would be useful, etc. If you guys would like to see the code, I'll open the svn repo to the public.

A few things I'm considering:
- providing a $reader->output() method to force a download (it would send the correct headers)
- allowing $dialect->quoting = Csv_Dialect::QUOTING_CUSTOM and then Csv_Reader::$quotecolumns = array(0, 3, 5, 6)
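
A minimal sketch of what the force-download idea could look like. The function names csvDownloadHeaders() and outputCsvDownload() are hypothetical and not part of the library. The sketch assumes PHP 7.1 or later and that nothing has been sent to the browser before header() is called.

```php
<?php
// Hypothetical helper (not part of the library discussed here): builds the
// HTTP header lines that make a browser download the response as a CSV file.
function csvDownloadHeaders(string $filename): array
{
    // Keep only the file name and replace characters that could break the
    // quoted filename or smuggle in an extra header.
    $safe = preg_replace('/[^A-Za-z0-9._-]/', '_', basename($filename));

    return array(
        'Content-Type: text/csv; charset=utf-8',
        'Content-Disposition: attachment; filename="' . $safe . '"',
    );
}

// Hypothetical stand-in for the proposed output() method: send the headers,
// then stream every row to the response body.
function outputCsvDownload(iterable $rows, string $filename): void
{
    foreach (csvDownloadHeaders($filename) as $line) {
        header($line);
    }

    $out = fopen('php://output', 'w');
    foreach ($rows as $row) {
        fputcsv($out, $row, ',', '"', '\\');
    }
    fclose($out);
}
```

Keeping the header construction in its own function makes it testable without a web server; only outputCsvDownload() touches the response.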

I'd also like to allow the user to provide a mapping class to the reader object so that you could map columns to names and callback functions:

Code: Select all

 
$mapper = new Csv_Mapper();
$mapper->map(0, 'id'); // maps first column to id
$mapper->map(2, 'name');
 
$filter = new Csv_Filter();
$filter->addFilter('id', new Csv_Filter_digits());
$filter->addFilter('name', new Csv_Filter_ucWords());
$filter->addFilter('name', new Csv_Filter_alnum(true)); // argument == allowWhitespace
 
$reader = new Csv_Reader('/path/to/file');
$reader->addMapper($mapper);
$reader->addFilter($filter);
foreach($reader as $row) {
    printf("Your name is %s and your id is %d", $row['name'], $row['id']);
}
But I can't find a syntax I like.
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

From my experience with importing data, it seems like some of the settings you need are:

- fields terminated with
- fields enclosed with
- fields escaped with
- lines terminated with
- array of field names (if not provided or different than names in first row)
- first row contains field names
- empty fields are to be filled with
- per field translation table or algorithm

It seems like those are what your Mapper contains/does. Sounds like the Mapper is a combination of setting properties and filter methods.
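
As a rough illustration of how most of these settings map onto PHP's built-in fgetcsv(), here is a sketch. The readCsvRows() function and its option names are made up for this example and are not the library's API. It leaves line terminators to fgetcsv() and does not cover per-field translation. It assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper: the option names are illustrative only. The actual
// parsing is delegated to PHP's built-in fgetcsv().
function readCsvRows($handle, array $options = array()): array
{
    // Fill in defaults for any option the caller did not pass.
    $options += array(
        'delimiter'     => ',',   // fields terminated with
        'enclosure'     => '"',   // fields enclosed with
        'escape'        => '\\',  // fields escaped with
        'fieldNames'    => null,  // explicit array of field names, if any
        'firstRowNames' => true,  // first row contains field names
        'emptyValue'    => null,  // empty fields are filled with this
    );

    $names = $options['fieldNames'];
    $skipHeader = $options['firstRowNames'];
    $rows = array();

    while (($fields = fgetcsv($handle, 0, $options['delimiter'], $options['enclosure'], $options['escape'])) !== false) {
        if ($fields === array(null)) {
            continue; // fgetcsv() reports a blank line as array(null)
        }
        if ($skipHeader) {
            // Consume the header row; use it as names unless names were given.
            $skipHeader = false;
            if ($names === null) {
                $names = $fields;
            }
            continue;
        }
        foreach ($fields as $i => $value) {
            if ($value === '') {
                $fields[$i] = $options['emptyValue'];
            }
        }
        // Key the row by name when the counts line up, else keep it numeric.
        $rows[] = ($names !== null && count($names) === count($fields))
            ? array_combine($names, $fields)
            : $fields;
    }

    return $rows;
}
```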

Can I have the code? ;)
(#10850)
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

I'd love to see the code for this too :) Striving for a better API seems like a never-ending cycle for me.... but you'll always be more critical of your own code than most other people will. Call it "programmer's OCD" :P

EDIT | Although you refer to this project as trivial I think it could be a great demonstration of good OO design if done well.
Post by Luke »

Alright I am stepping out the door to go to dinner right now, but I will set up public svn access the moment I get home. Glad you guys are interested :)

When I post it, it'll be available here: http://svn.pleaseproof.com/csv-utils/trunk/

I have written unit tests for it (tests/index.php), but I'm still pretty green when it comes to unit testing and SimpleTest. I could use help there too.
Post by Luke »

I PMed you two a login... sorry, I was having trouble making it public and I need to go... I'll make it public later. Anybody else who wants the login, just PM me. I am not 100% happy with my design, but it's in its infancy at the moment. I'd love any input you guys can provide. Thanks guys!
Post by Luke »

I have attached a tar file of a checkout of the latest revision at http://svn.pleaseproof.com/csv-utils/trunk for those of you to whom I didn't give a username/password. Feel free to check it out! It has accompanying unit tests in the /tests/ folder. You will need SimpleTest in order for the test suite to work.
Attachments
csv-utils.tgz
This is a tar file of a checkout of the latest svn revision located at http://svn.pleaseproof.com/csv-utils/trunk/
(480 KiB) Downloaded 37 times
Post by Luke »

Here is the most updated version... mainly I just added better comments...
Attachments
csv-utils.tgz
(116.86 KiB) Downloaded 39 times
Post by Luke »

nuthin? :(
Post by Luke »

I just now posted a blog article about this library here:

http://www.mc2design.com/blog/php-csv-u ... csv-module
Post by Christopher »

Blog post not working. I have not had the time to look through it -- but I will. After your first post I coded a quick implementation to just think about the idea, so I am interested to see the direction you took.
Post by Luke »

I took down the blog post because I need to package the download better tonight. It will be back up around 8pm tonight. I am not that happy with the implementation at the moment... especially with the reader class. I'm workin' on it.
arjan.top
Forum Contributor
Posts: 305
Joined: Sun Oct 14, 2007 4:36 am
Location: Hoče, Slovenia

Post by arjan.top »

I think this is a bit confusing:

Code: Select all

 
$header = $reader->current(); // grabs first row
while ($row = $reader->current()) { // starts from second row
    // do something
}
 
Why does $reader->current() move the "pointer" to the next row?
Post by Luke »

I completely agree... It is like that because I'm using PHP5's SPL Iterator interface and I am not sure I completely understand how it's supposed to work. So that's what I'm doing right now... getting a better understanding of that. I'm going to rewrite that class tonight.
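
For reference, a small sketch of the order in which foreach drives the methods of PHP's Iterator interface. TracingIterator is a hypothetical class written only for this illustration, and the return types assume PHP 8.0 or later.

```php
<?php
// Illustrative only: records the name of each Iterator method as it is
// called, so the order foreach uses can be inspected afterwards.
class TracingIterator implements Iterator
{
    public array $calls = array();
    private array $rows;
    private int $position = 0;

    public function __construct(array $rows)
    {
        $this->rows = array_values($rows);
    }

    public function rewind(): void
    {
        $this->calls[] = 'rewind';
        $this->position = 0;
    }

    public function valid(): bool
    {
        $this->calls[] = 'valid';
        return $this->position < count($this->rows);
    }

    public function current(): mixed
    {
        // Returns the row at the current position without moving it.
        $this->calls[] = 'current';
        return $this->rows[$this->position];
    }

    public function key(): mixed
    {
        $this->calls[] = 'key';
        return $this->position;
    }

    public function next(): void
    {
        // The only method that moves the position forward.
        $this->calls[] = 'next';
        $this->position++;
    }
}
```

Looping over two rows with foreach ($it as $row) records rewind, valid, current, next, valid, current, next, valid. current() is only ever asked for the row at the existing position, and next() is the one call that moves it. key() is only called when the loop asks for the key.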
Post by Luke »

Alright, I'm on lunch and I have some time to work on this. The reason I have current() advance the pointer is that I don't want to do this:

Code: Select all

$reader = new Csv_Reader('/path/to/myfile.csv');
$row1 = $reader->current();
$reader->next();
$row2 = $reader->current();
The only other way I can think of to avoid having to make two calls like that is this:

Code: Select all

$row1 = $reader->next();
$row2 = $reader->next();
But that doesn't make any sense either. Should I have a read() method that advances the pointer AND returns the current row? What makes the most sense?
Post by Christopher »

The Ninja Space Goat wrote: Should I have a read() method that advances the pointer AND returns the current row?
Yes. current() does not advance, next() advances.
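
One way this split could look, as a sketch only: RowReader is a hypothetical in-memory stand-in, not the Csv_Reader from this thread, and it assumes PHP 8.0 or later. current() leaves the position alone, next() moves it, and read() combines the two.

```php
<?php
// Hypothetical reader over an in-memory list of rows. It only illustrates
// the current()/next()/read() split and does no CSV parsing.
class RowReader implements Iterator
{
    private array $rows;
    private int $position = 0;

    public function __construct(array $rows)
    {
        $this->rows = array_values($rows);
    }

    public function rewind(): void
    {
        $this->position = 0;
    }

    public function valid(): bool
    {
        return $this->position < count($this->rows);
    }

    public function current(): mixed
    {
        // Never advances; returns false once the rows are exhausted.
        return $this->valid() ? $this->rows[$this->position] : false;
    }

    public function key(): mixed
    {
        return $this->position;
    }

    public function next(): void
    {
        $this->position++;
    }

    // Convenience method: return the current row, then advance.
    public function read(): mixed
    {
        $row = $this->current();
        $this->next();
        return $row;
    }
}

$reader = new RowReader(array(array('id', 'name'), array('1', 'Luke')));
$header = $reader->read(); // first row; the position moves to the second row
while (($row = $reader->read()) !== false) {
    // do something with $row
}
```

Because read() is built on top of current() and next(), foreach keeps working unchanged and calling current() twice in a row returns the same row.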