flat file storing
Posted: Mon Oct 09, 2006 3:29 am
by s.dot
Are there any pros and cons to the following methods of storing and retrieving data from flat files?
Method 1
Store data line by line, use file() to get contents into an array.
Method 2
Store a serialized array of your data, use unserialize() to bring it back into an array
I like method 2 because of how easy it is to get at the right part of the file without using line numbers.
Eg:
$file[57] is harder to know (and memorize) than $file['config_setting']
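To make the comparison concrete, here is a minimal sketch of both methods side by side. The file names (lines.txt, data.ser) and the settings are made up for illustration, and the files go in the system temp directory.

```php
<?php
// Hypothetical file names and settings, for illustration only.
$dir      = sys_get_temp_dir();
$lineFile = $dir . '/lines.txt';
$serFile  = $dir . '/data.ser';

// Method 1: one value per line, read back with file().
file_put_contents($lineFile, "My Site\nblue\n");
$lines       = file($lineFile, FILE_IGNORE_NEW_LINES);
$themeByLine = $lines[1]; // you have to remember that index 1 is the theme

// Method 2: a serialized associative array, read back with unserialize().
$settings = array('site_name' => 'My Site', 'theme' => 'blue');
file_put_contents($serFile, serialize($settings));
$loaded     = unserialize(file_get_contents($serFile));
$themeByKey = $loaded['theme']; // the key documents itself

echo $themeByLine, ' ', $themeByKey, "\n";
```

Both reads give the same value; the difference is only in how you address it.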
Posted: Mon Oct 09, 2006 3:35 am
by Chris Corbyn
Line by line (or any delimited string) has a major flaw: you're restricted in what you can store, since the data can't contain the delimiter unless you escape it. A decent flat-file system will use a B-Tree with fixed block sizes to store data. The cost is less efficient use of space; the advantage is quick searching/indexing.
Serialized data solves some problems, but the whole file has to be read and unserialized at once, so memory usage may end up off the scale as the amount of data stored increases.
Posted: Mon Oct 09, 2006 3:41 am
by s.dot
Ah, I've never heard of the B-Tree system before. I will have to google it and see first hand the pros/cons of that to serializing. Thanks!
Posted: Mon Oct 09, 2006 3:44 am
by Maugrim_The_Reaper
Out of interest, have you considered the pros/cons of using XML? Not saying use it, just wondering why it wasn't mentioned.

Posted: Mon Oct 09, 2006 3:47 am
by s.dot
Inexperience in that area.

I guess that would be a good time to check that out as well.

Posted: Mon Oct 09, 2006 3:48 am
by Chris Corbyn
SimpleXML makes it a doddle

Posted: Mon Oct 09, 2006 3:56 am
by s.dot
Well, XML seems like it would be great for writing and retrieving. But what about manipulating? If the data is in an array, I could just do...
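Code:
//read the file first: unserialize() takes the serialized string, not a filename
$file = unserialize(file_get_contents('file.txt'));
//change their first name, but keep all of the rest of the data
$file['first_name'] = 'bob';
//write data
file_put_contents('file.txt', serialize($file));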
[edit] Arrays would make for easy pagination, too.
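As a rough sketch of the pagination idea, array_slice() can cut one page out of the unserialized array. The entries, the page size and the page number below are all assumed values.

```php
<?php
// Hypothetical data: 25 entries, as they might look after unserialize().
$entries = array();
for ($i = 1; $i <= 25; $i++) {
    $entries[] = 'entry ' . $i;
}

$perPage = 10; // assumed page size
$page    = 3;  // 1-based page number, e.g. taken from the request after validation

$totalPages = (int) ceil(count($entries) / $perPage);
$pageItems  = array_slice($entries, ($page - 1) * $perPage, $perPage);

echo 'Page ', $page, ' of ', $totalPages, ': ', implode(', ', $pageItems), "\n";
```

Note that the whole array still has to be loaded before it can be sliced, so this only helps with display, not with memory.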
Posted: Mon Oct 09, 2006 4:05 am
by Chris Corbyn
So is this just a small amount of data you're storing?
FYI: SimpleXML turns XML into an object you can access much like an array, and you can write that object back out as XML
B-Tree allows you to search for records without actually needing to load the entire file into memory (it seeks in fixed sizes). I wouldn't read too much into the B-Tree; it's a complicated system used by actual database engines. There is a B-Tree implementation in PHP already called PHPBTree I think. I've read over the code before and it provides the essential insert and select routines. You'd have to build a wrapper/query language around it though.
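To illustrate just the "seek in fixed sizes" part, here is a much simpler sketch. It is not a B-Tree and not PHPBTree's API: it only shows that when every record has the same size, record N starts at byte N * RECORD_SIZE, so it can be read without loading the rest of the file. RECORD_SIZE, the file name and the helper functions are all made up for this example.

```php
<?php
// Not a B-Tree: just fixed-size records, so record N lives at byte N * RECORD_SIZE.
// RECORD_SIZE, the file name and the helper names are hypothetical.
define('RECORD_SIZE', 32);

function write_records($path, array $records)
{
    $fh = fopen($path, 'wb');
    foreach ($records as $record) {
        // pad (or truncate) every record to exactly RECORD_SIZE bytes
        fwrite($fh, str_pad(substr($record, 0, RECORD_SIZE), RECORD_SIZE, "\0"));
    }
    fclose($fh);
}

function read_record($path, $index)
{
    $fh = fopen($path, 'rb');
    fseek($fh, $index * RECORD_SIZE); // jump straight to the record
    $raw = fread($fh, RECORD_SIZE);   // read only that one block
    fclose($fh);
    return rtrim($raw, "\0");         // strip the padding
}

$path = sys_get_temp_dir() . '/records.dat';
write_records($path, array('John Doe', 'Jane Doe', 'Patrick Doe'));
echo read_record($path, 2), "\n";
```

The padding is the wasted space mentioned earlier; a real B-Tree adds an index on top so you can find the right block by key rather than by position.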
Posted: Mon Oct 09, 2006 4:12 am
by s.dot
So is this just a small amount of data you're storing?
A potentially large amount of data in lots of different directories and files. Although, I don't see any one file exceeding the length of say... a good sized forum post on here.
SimpleXML turns XML into an object you can access much like an array, and you can write that object back out as XML
Ah. Well then, that doesn't make it much different from file()ing or unserialize()ing in the sense of data manipulation. I guess using XML would make it more "up to date" and universal (although I don't know of any system that doesn't have .txt files or can't be parsed by PHP). Seems like it would just add a bunch of functions to learn, with little benefit. In my scenario anyways.
I'm still leaning towards unserializing. Memory usage can be kept down by unset()ing each file's data once I'm done with it.
Unless I'm just being close-minded.

I'm going to test the different strategies tomorrow and see what I like best.
Posted: Mon Oct 09, 2006 11:07 am
by Maugrim_The_Reaper
If the data is small enough, XML is worth a shot... XML is fairly simple to learn. Of course, XML on its own, although great, can be improved on: you can use XPath to search an XML document for individual elements and nodes.
For example:
Let's say you have a simple name list. The XML root is "people", containing many "person" elements. Each "person" has a firstname and a lastname. An XML file might look like:
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<people>
    <person>
        <firstname>John</firstname>
        <lastname>Doe</lastname>
    </person>
    <person>
        <firstname>Jane</firstname>
        <lastname>Doe</lastname>
    </person>
</people>
So far it's simple enough. Reading this file is done with SimpleXML:
Code:
<?php
$xml = simplexml_load_file('people.xml');
// echo John's name
echo $xml->person[0]->firstname, ' ', $xml->person[0]->lastname, '<br/>';
// echo Jane's name
echo $xml->person[1]->firstname, ' ', $xml->person[1]->lastname, '<br/>';
// echo all names
foreach ($xml->person as $person)
{
    echo $person->firstname, '\'s last name is: ', $person->lastname, '<br/>';
}
// echo all firstnames only
foreach ($xml->xpath('/people/person/firstname') as $firstname)
{
    echo $firstname, '<br/>';
}
// make John a Patrick and display
$xml->person[0]->firstname = 'Patrick';
echo nl2br(htmlentities($xml->asXML()));
Freebie SimpleXML Tutorial 101...
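One more sketch along the same lines, since manipulating and writing back came up earlier in the thread: an XPath predicate can pick out a single person, and asXML() with a filename saves the changed document to disk. The XML is inlined here so the example runs on its own, and the names and the output path are made up.

```php
<?php
// Same structure as the people example, inlined so the sketch is self-contained.
$source = '<people>
  <person><firstname>John</firstname><lastname>Doe</lastname></person>
  <person><firstname>Jane</firstname><lastname>Roe</lastname></person>
</people>';

$xml = simplexml_load_string($source);

// XPath predicate: every person whose firstname is "Jane"
$matches = $xml->xpath('/people/person[firstname="Jane"]');
$matches[0]->lastname = 'Smith'; // changes the node inside $xml as well

// hypothetical output path; asXML($filename) writes the document to that file
$out = sys_get_temp_dir() . '/people_updated.xml';
$xml->asXML($out);

// load it again to show the change was saved
$reloaded = simplexml_load_file($out);
echo $reloaded->person[1]->firstname, ' ', $reloaded->person[1]->lastname, "\n";
```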
