flat file storing

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

flat file storing

Post by s.dot »

Are there any pros and cons to the following methods of storing and retrieving data from flat files?

Method 1
Store data line by line, use file() to get contents into an array.

Method 2
Store a serialized array of your data, use unserialize() to bring it back into an array

I like method 2 because it's easy to reach the right part of the data by key, without relying on line numbers.

Eg:

$file[57] is harder to know (and memorize) than $file['config_setting']
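
To make the comparison concrete, here is a minimal sketch of both methods side by side. The `$settings` array, its keys, and the temporary file paths are made up for illustration only.

```php
<?php
// Hypothetical settings used only for illustration.
$settings = ['site_name' => 'Example', 'config_setting' => 'on'];

// Method 1: one value per line; the line number is the only "key".
$linesPath = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($linesPath, implode("\n", $settings) . "\n");
$lines = file($linesPath, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

// Method 2: the whole array serialized; the keys survive the round trip.
$serialPath = tempnam(sys_get_temp_dir(), 'ser');
file_put_contents($serialPath, serialize($settings));
$restored = unserialize(file_get_contents($serialPath));

echo $lines[1], "\n";                   // looked up by line number
echo $restored['config_setting'], "\n"; // looked up by name

// Clean up the temporary files.
unlink($linesPath);
unlink($serialPath);
```

Both lookups return the same value; the difference is only in how you address it. Note that Method 1 as sketched here would break if a value itself contained a newline.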
Set Search Time - A google chrome extension. When you search only results from the past year (or set time period) are displayed. Helps tremendously when using new technologies to avoid outdated results.
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

Line by line (or any delimited string) has a major flaw: you're restricted in what you can store, because the data can never contain the delimiter itself. A decent flat-file system will use a B-Tree with fixed block sizes to store data. The cost is less efficient use of space; the advantage is quick searching and indexing.

Serialized data solves some problems, but memory usage may end up off the scale as the amount of data grows, since the whole file has to be loaded and unserialized at once.
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

Ah, I've never heard of the B-Tree system before. I'll have to google it and see first hand how its pros and cons compare with serializing. Thanks!
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

Have you considered the pros/cons of using XML, out of interest? Not saying use it, just wondering why it wasn't mentioned. ;)
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

Maugrim_The_Reaper wrote:Have you considered the pros/cons of using XML out of interest? Not saying use it, just wondering why it wasnt mentioned. ;)
Inexperience in that area. :oops: I guess that would be a good time to check that out as well. ;)
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

scottayy wrote:
Maugrim_The_Reaper wrote:Have you considered the pros/cons of using XML out of interest? Not saying use it, just wondering why it wasnt mentioned. ;)
Inexperience in that area. :oops: I guess that would be a good time to check that out as well. ;)
SimpleXML makes it a doddle ;)
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

Well, XML seems like it would be great for writing and retrieving. But what about manipulating? If the data is in an array, I could just do...

Code: Select all

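// unserialize() expects the serialized string itself, not a filename
$file = unserialize(file_get_contents('file.txt'));

//change their first name, but keep all of the rest of the data
$file['first_name'] = 'bob';

//write data
file_put_contents('file.txt', serialize($file));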
[edit] Arrays would make for easy pagination, too.
Chris Corbyn
Breakbeat Nuttzer
Posts: 13098
Joined: Wed Mar 24, 2004 7:57 am
Location: Melbourne, Australia

Post by Chris Corbyn »

scottayy wrote:Well, XML seems like it would be great for writing and retrieving. But what about manipulating? If the data is in an array, I could just do...

Code: Select all

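// unserialize() expects the serialized string itself, not a filename
$file = unserialize(file_get_contents('file.txt'));

//change their first name, but keep all of the rest of the data
$file['first_name'] = 'bob';

//write data
file_put_contents('file.txt', serialize($file));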
[edit] Arrays would make for easy pagination, too.
So is this just a small amount of data you're storing?

FYI: SimpleXML turns XML into an array, and you can write that array back out as XML ;)

B-Tree allows you to search for records without actually needing to load the entire file into memory (it seeks in fixed sizes). I wouldn't read too much into the B-Tree; it's a complicated system used by actual database engines. There is a B-Tree implementation in PHP already called PHPBTree I think. I've read over the code before and it provides the essential insert and select routines. You'd have to build a wrapper/query language around it though.
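
To illustrate just the "seeks in fixed sizes" part, here is a minimal sketch of fixed-length records read and written with fseek(). This is not a B-Tree and not how PHPBTree works; it only shows how fixed block sizes let you reach one record without loading the whole file. The `RECORD_SIZE` value and the `writeRecord()`/`readRecord()` helpers are hypothetical names invented for this example.

```php
<?php
// Hypothetical record size; every record is padded to exactly this many bytes.
const RECORD_SIZE = 64;

// Write record number $index without touching any other record.
function writeRecord($handle, int $index, string $data): void
{
    if (strlen($data) > RECORD_SIZE) {
        throw new InvalidArgumentException('Record too large for its block');
    }
    fseek($handle, $index * RECORD_SIZE);
    fwrite($handle, str_pad($data, RECORD_SIZE, "\0"));
}

// Read record number $index by seeking straight to its offset.
function readRecord($handle, int $index): string
{
    fseek($handle, $index * RECORD_SIZE);
    $block = fread($handle, RECORD_SIZE);
    return rtrim((string) $block, "\0");
}

$handle = tmpfile();
writeRecord($handle, 0, 'John Doe');
writeRecord($handle, 1, 'Jane Doe');
writeRecord($handle, 0, 'Patrick Doe'); // overwrite record 0 in place

echo readRecord($handle, 1), "\n";
echo readRecord($handle, 0), "\n";
fclose($handle);
```

The padding is the "less efficient use of space" trade-off mentioned earlier: short records still occupy a full block, but any record can be found by arithmetic on its index alone.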
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Post by s.dot »

Chris Corbyn wrote:So is this just a small amount of data you're storing?
A potentially large amount of data, spread across lots of different directories and files. Although, I don't see any one file exceeding the length of say... a good sized forum post on here.
Chris Corbyn wrote:SimpleXML turns XML into an array, and you can write that array back out as XML
Ah. Well then, that doesn't make it much different from file()ing or unserialize()ing in the sense of data manipulation. I guess using XML would make it more "up to date", and universal (although I don't know of any system that doesn't have .txt files, or of a .txt file that can't be parsed by PHP ;)). Seems like it would just add a bunch of functions to learn, with little benefit. In my scenario anyways.

I'm still leaning towards unserializing. Memory usage can be kept down by unset()ing each file's data once I'm done with it.

Unless I'm just being close-minded. ;) I'm going to test the different strategies tomorrow and see what I like best.
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

If the data is small enough, XML is worth a shot... XML is fairly simple to learn. Of course, XML on its own, although great, can be improved on: you can use XPath to search an XML document for individual elements and nodes.

For example:

Let's say you have a simple name list. The XML root is "people", containing many "person" elements. Each "person" has a firstname and a lastname. An XML file might look like:

Code: Select all

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<people>
	<person>
		<firstname>John</firstname>
		<lastname>Doe</lastname>
	</person>
	<person>
		<firstname>Jane</firstname>
		<lastname>Doe</lastname>
	</person>
</people>
So far it's simple enough. Reading this file is done with SimpleXML:

Code: Select all

<?php

$xml = simplexml_load_file('people.xml');

// echo John's name
echo $xml->person[0]->firstname, ' ', $xml->person[0]->lastname, '<br/>';

// echo Jane's name
echo $xml->person[1]->firstname, ' ', $xml->person[1]->lastname, '<br/>';

// echo all names
foreach($xml->person as $person)
{
	echo $person->firstname, '\'s last name is: ', $person->lastname, '<br/>';
}

// echo all firstnames only
foreach($xml->xpath('/people/person/firstname') as $firstname)
{
	echo $firstname, '<br/>';
}

// make John a Patrick and display
$xml->person[0]->firstname = 'Patrick';
echo nl2br(htmlentities($xml->asXML()));
Freebie SimpleXML Tutorial 101... ;)
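
On the earlier question about manipulating data rather than only reading it, here is a small follow-on sketch under the same assumptions as the people.xml example. It uses an XPath predicate to pick one record, changes it, adds another, and writes the document back to disk. The temporary path and the new names ("Smith", "Bob", "Jones") are invented for illustration.

```php
<?php
// Same structure as the people.xml example, inlined so the sketch is self-contained.
$source = <<<XML
<people>
	<person><firstname>John</firstname><lastname>Doe</lastname></person>
	<person><firstname>Jane</firstname><lastname>Doe</lastname></person>
</people>
XML;

$path = tempnam(sys_get_temp_dir(), 'people'); // hypothetical file location
file_put_contents($path, $source);

$xml = simplexml_load_file($path);

// XPath predicate: select only the person whose firstname is "Jane".
$matches = $xml->xpath('/people/person[firstname="Jane"]');
$matches[0]->lastname = 'Smith';

// Add a new person, then write the whole document back to the same file.
$new = $xml->addChild('person');
$new->addChild('firstname', 'Bob');
$new->addChild('lastname', 'Jones');
$xml->asXML($path);

// Reload from disk to confirm the changes were saved.
$reloaded = simplexml_load_file($path);
echo count($reloaded->person), "\n";
echo $reloaded->person[1]->lastname, "\n";
unlink($path);
```

Passing a filename to asXML() saves the document instead of returning it as a string, so the edit cycle is load, change, save, much like the unserialize()/serialize() version.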