Going to implement Data Mapper - looking for resources

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

Luke
The Ninja Space Mod
Posts: 6424
Joined: Fri Aug 05, 2005 1:53 pm
Location: Paradise, CA

Post by Luke »

I am going to be implementing a Data Mapper system in an application I'm putting together and I am just looking for the community's input on the subject before I do. I am also interested in any resources you guys can dig up. This is what I've found so far:
http://www.martinfowler.com/eaaCatalog/dataMapper.html
http://en.wikipedia.org/wiki/Data_mapping
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

I guess the first thing I would ask is "Do you really need to?" The reason for implementing a Mapper is that you want to create some structure of connected objects populated with data that is more complex than a list. Is that what you need to do?
(#10850)

Post by Luke »

yes.

EDIT: On a side note, I just looked through my old posts because I know I've asked about this before (several times, actually), and it has always just sort of died. Or I should say my will to build such a system died, and every time I went with something I wasn't very happy with. Right now I am looking at a bunch of different ORM solutions for PHP, and not a single one of them looks like it fits my needs. I'd have to say the closest I've seen is phpdoctrine, but I haven't had a chance to mess with it or even fully research it yet.

I've also been looking at things like ezpdo and propel, and neither seems to fit my needs at all.

My solution: build me own.
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada

Post by Kieran Huggins »

My advice? Do it!

It's cheaper than a Rubik's cube... and more rewarding!

If you want to discuss more detailed aspects of data mapping, I'm totally up for it. I'm writing one as well, and have found some pitfalls you would do well to avoid.

Cheers,
Kieran

Post by Luke »

such as...

Post by Kieran Huggins »

such as...

Using PHP objects as data mappers, I was trying to use the __destruct method to update the database... doesn't work properly (but sort of works... ugh)

Trying to use the __construct method to populate the DOMElement's own attributes would only work after it was attached to the document.

Design decisions involving whether to insist on IDs or allow a unique "token" to select a resource, how to specify your data concepts, how to allow non-standard properties, etc...

It's a BIG subject... Each one of these points could easily have its own thread!

Maybe we should start threads for each as we need to, prefixing them with [datamap]? mods?

Cheers,
Kieran

Post by Luke »

eh, I think one thread is enough. Well, care to share any design ideas?

Post by Kieran Huggins »

Love to... later - must go last minute x-mas shopping now... beating down psychotic parents for the last furby or whatever's cool these days.

Cheers,
Kieran

Post by Christopher »

I suggest building the lightest solution possible and then going from there. If you supply the mappings manually at first and do some of the processing manually then you can always add automation wrappers around that.
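
A minimal sketch of what "supplying the mappings manually" could look like, assuming a hand-written array that maps column names to property names. The Person class, the ManualPersonMapper name, and the column names are all made up for illustration. Nothing here touches a database yet, so automation wrappers could be layered on later.

```php
<?php
// Hypothetical domain class; not taken from the thread.
class Person
{
    public $id;
    public $name;
    public $birthdate;
}

// Hypothetical mapper with a hand-written column => property map.
class ManualPersonMapper
{
    private $map = array(
        'person_id'  => 'id',
        'full_name'  => 'name',
        'birth_date' => 'birthdate',
    );

    // Turn one fetched row (associative array) into a Person.
    public function toObject(array $row)
    {
        $person = new Person();
        foreach ($this->map as $column => $property) {
            if (array_key_exists($column, $row)) {
                $person->$property = $row[$column];
            }
        }
        return $person;
    }

    // Turn a Person back into a row, ready to bind to an INSERT or UPDATE.
    public function toRow(Person $person)
    {
        $row = array();
        foreach ($this->map as $column => $property) {
            $row[$column] = $person->$property;
        }
        return $row;
    }
}

$mapper = new ManualPersonMapper();
$person = $mapper->toObject(array(
    'person_id'  => 7,
    'full_name'  => 'Fred',
    'birth_date' => '1974-01-01',
));
$row = $mapper->toRow($person);
```

Because the map is plain data, a later wrapper could generate it from the schema or load it from a config file without changing the two methods.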
(#10850)
johno
Forum Commoner
Posts: 36
Joined: Fri May 05, 2006 6:54 am
Location: Bratislava/Slovakia

Post by johno »

I have been experimenting with and researching O/R mapping for nearly a year now. I have already made some attempts to code a data mapper (look at Torpedeo in my signature). Recently I wrote an article about a solution I was working on at school. It's an aspect-oriented approach in AspectJ, but never mind that. Just skip the AspectJ code and read the plain text.

If you look at how Hibernate works in Java, I think it's a really similar approach to what I have done. Look at lastcraft's Changes; there are some really good ideas. Doctrine has some nice OQL language features, but the last time I looked the source code was too messy for me. Ruby on Rails ActiveRecord has sweet mapping definitions.

In O/R mapping there are tricky things you will need to resolve. Here are my conclusions after nearly a year of passive free-time study.
  • There are basically two high-level design approaches I've seen: explicit and implicit persisting. In explicit persisting you need to call something like $person->save(); or $mapper->save($person); $transaction->commit(); to actually trigger inserts/updates to the DB.

    In implicit persisting (this is what I am trying to do) you only work with objects. Whenever a persistable object is created or updated, an INSERT/UPDATE is triggered automatically. More on this below.

    With explicit persisting you have to pollute your code with mapper stuff, but you gain greater control. (The question is whether you will ever need it.) In implicit persisting everything is done automatically, but special cases the mapper was not designed for will be hacky at best, if not impossible.
  • In OO, object identity is defined by address in memory. Objects are identical if they point to the same address. In a relational database, rows in a table are identical if they have equal primary keys. You really need to reconcile the two, and it's going to be tricky for composite primary keys. Look for Fowler's Identity Map pattern. Once you have built this, you will see a spot to skip some duplicated queries as a bonus. A tip: a transaction is a sweet spot for identity maps.
  • Object associations and collections are easy once you understand partially loaded objects (look in my article). There is a small problem with adding to a collection without loading the full collection first, but it's pretty easy to fix.
  • Class inheritance is a little tricky, but it can be done with partially loaded objects and one class type column. You probably won't need inheritance at first, but it's good to know how to do it.
  • There are two types of object construction in the context of O/R mapping: new object creation and object load from the DB. You absolutely need to distinguish between these two. In PHP this is tricky because you cannot have two different constructors. You will have to go for a custom method that's called instead of the constructor.
  • Persistent objects will need to have a common superclass from your mapper library. The first reason is the point above, the second is that you will need to catch access to unloaded properties with some __get/__set magic, and the third is that you can't set values of private properties from outside. You will definitely need that. In Java it's different because you can access/modify private fields through reflection.
  • There are two ways to delete objects from the database. Yes, you have to delete them manually because you don't have a garbage collector. You can load them and mark them as deleted. Some mappers do it that way, but I think it's crazy. The second way is not to load them, but just to specify which ones you want to delete. Look at lastcraft's Description class in the Changes library.
  • Class-to-table mapping definitions. Two types: definition inside the persistent class or outside it. Many variants: custom definition methods/classes (Doctrine, Torpedeo), annotations (Torpedeo, Hibernate, EJB3), tags (EZPDO), RoR ActiveRecord style, ...
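
The Identity Map point in the list above can be sketched roughly as follows. This is an illustration under assumptions, not Fowler's reference code and not how Torpedeo does it: the IdentityMap class, the findOnce() helper, and the loader callback are hypothetical names, and the loader stands in for a real SELECT. Composite keys are passed as arrays.

```php
<?php
// Hypothetical IdentityMap: at most one in-memory object per (class, primary key).
class IdentityMap
{
    private $objects = array();

    // Build a lookup key; $id may be a scalar or an array (composite key).
    private function key($class, $id)
    {
        // Cast every key part to string so 5 and "5" (many database drivers
        // return strings) resolve to the same entry.
        $parts = array_map('strval', array_values((array) $id));
        return $class . ':' . serialize($parts);
    }

    public function get($class, $id)
    {
        $key = $this->key($class, $id);
        return isset($this->objects[$key]) ? $this->objects[$key] : null;
    }

    public function set($class, $id, $object)
    {
        $this->objects[$this->key($class, $id)] = $object;
    }
}

// Consult the map first; only call the loader when the object is not known yet.
function findOnce(IdentityMap $map, $class, $id, $loader)
{
    $object = $map->get($class, $id);
    if ($object === null) {
        $object = call_user_func($loader, $id);
        $map->set($class, $id, $object);
    }
    return $object;
}

$queries = 0;
$loader = function ($id) use (&$queries) {
    $queries++; // stands in for running a SELECT
    $object = new stdClass();
    $object->id = $id;
    return $object;
};

$map = new IdentityMap();
$first = findOnce($map, 'Person', 5, $loader);
$second = findOnce($map, 'Person', '5', $loader);        // same row, no second query
$composite = findOnce($map, 'Enrollment', array(5, 12), $loader);
```

Here $first and $second are the same object, which is also where the "skip some duplicated queries" bonus comes from. Creating one map per transaction is one way to follow the tip about transactions.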
There are also problems you will need to keep in mind.
  • First of all, make sure that the mapper is able to execute raw handmade SQL. Sometimes you just need raw power.
  • Self-referencing tables (parent<->child) and circular references are tricky, so keep these cases in mind when coding.
  • It's good to accumulate updates and maybe inserts in a sort of Unit of Work (Fowler again) and commit them at transaction end. You will save some excessive query sending. Look in my article again for details.
  • Look out for concurrency. Just use transactions and you will be fine.
What else to say? Good luck with coding and keep asking.
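
The Unit of Work idea from the second list can be sketched like this. It is a simplified illustration, not the implementation from the article: the UnitOfWork class and the writer callback are hypothetical, and the callback stands in for the code that would send INSERT/UPDATE statements inside a database transaction.

```php
<?php
// Hypothetical UnitOfWork: collects pending writes and sends them in one batch.
class UnitOfWork
{
    private $pendingInserts = array();
    private $pendingUpdates = array();
    private $writer;

    // $writer is a callable taking ('insert'|'update', $object).
    public function __construct($writer)
    {
        $this->writer = $writer;
    }

    public function registerNew($object)
    {
        $this->pendingInserts[spl_object_hash($object)] = $object;
    }

    public function registerDirty($object)
    {
        $hash = spl_object_hash($object);
        // An object still waiting for its INSERT does not need an UPDATE too.
        if (!isset($this->pendingInserts[$hash])) {
            $this->pendingUpdates[$hash] = $object;
        }
    }

    // Flush everything once; a real mapper would wrap this in a transaction.
    public function commit()
    {
        foreach ($this->pendingInserts as $object) {
            call_user_func($this->writer, 'insert', $object);
        }
        foreach ($this->pendingUpdates as $object) {
            call_user_func($this->writer, 'update', $object);
        }
        $this->pendingInserts = array();
        $this->pendingUpdates = array();
    }
}

$log = array();
$writer = function ($operation, $object) use (&$log) {
    $log[] = $operation . ' ' . $object->name;
};

$uow = new UnitOfWork($writer);

$fred = new stdClass();
$fred->name = 'Fred';
$wendy = new stdClass();
$wendy->name = 'Wendy';

$uow->registerNew($fred);
$uow->registerDirty($fred);   // already pending as an insert
$uow->registerDirty($wendy);
$wendy->name = 'Wendy B.';
$uow->registerDirty($wendy);  // changed twice, still a single update
$uow->commit();
```

Four registrations end up as two statements, which is the saving on "excessive query sending" mentioned above.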

Post by Christopher »

As you can see from johno's comments, people take off at a run pretty quickly when it comes to O/RM. There is a lot of good information there and a lot of implementation-specific stuff too. It can be difficult to tell the difference unless you know a little about the subject. Here the implementation choice of tracking which fields/properties have been loaded (Partially Loaded Objects) drives the design.
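
For anyone unfamiliar with the term, a partially loaded object can be sketched with PHP's __get/__set magic methods. This is only one possible reading of the idea, with hypothetical names (PartiallyLoadedObject, the loader callback); the callback stands in for a query that fetches a single missing column.

```php
<?php
// Hypothetical sketch: an object that knows which fields it has loaded
// and fetches the rest on first access.
class PartiallyLoadedObject
{
    private $values;
    private $loader;

    // $values holds the fields loaded so far; $loader is a callable
    // that returns the value for one field name.
    public function __construct(array $values, $loader)
    {
        $this->values = $values;
        $this->loader = $loader;
    }

    public function isLoaded($name)
    {
        return array_key_exists($name, $this->values);
    }

    // Called by PHP when an inaccessible or undefined property is read.
    public function __get($name)
    {
        if (!$this->isLoaded($name)) {
            $this->values[$name] = call_user_func($this->loader, $name);
        }
        return $this->values[$name];
    }

    // Called by PHP when an inaccessible or undefined property is written.
    public function __set($name, $value)
    {
        $this->values[$name] = $value;
    }
}

$fetches = array();
$loader = function ($field) use (&$fetches) {
    $fetches[] = $field; // stands in for "SELECT <field> ... WHERE id = ?"
    $row = array('name' => 'Fred', 'birthdate' => '1974-01-01');
    return $row[$field];
};

$person = new PartiallyLoadedObject(array('id' => 1), $loader);
$name = $person->name;   // triggers one fetch
$again = $person->name;  // served from memory
```

The design consequence is visible even in this small version: every persistent object needs the tracking array and the magic methods, which is why this approach tends to require a common base class.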

I am not sure the explicit vs. implicit distinction given is very accurate either, because there are a number of points within such a system where operations can be automatic or manual and, for that matter, immediate or deferred. I don't think a choice necessarily needs to be made, because many of those decisions are ones you want the application to be able to make at run time, with the Mapper providing several behavior options.
(#10850)
Begby
Forum Regular
Posts: 575
Joined: Wed Dec 13, 2006 10:28 am

Post by Begby »

I wrote one of these a while back. I don't use it all that much as it's not all that flexible, but it was a really good learning experience. I made it so it automatically retrieves the database schema and creates the object, does data validation, and draws forms.

The base class follows a composite pattern. Each class represents a table, and when you extend a class it means the classes are related in a one to one hierarchy. Each field is represented as an object that you can modify parameters on field by field.

For instance, let's say you have a person, and it's represented as follows in the database, along with some other tables that build on person....

Person
- personID - key
- Name - varchar
- Birthdate - date

Employee
- empID - key
- personID - Foreign key
- hireDate - date
- position - varchar

Supervisor
- empID - foreign key
- supID - key
- ManagerLevel - int

Student
- studentID - key
- personID - Foreign key
- grade - int


To use it in php you do the following

Code:

// There are 3 lines you setup before this to configure the database URL, db login, and db password

class PersonModel extends LGDBModel {}
class EmployeeModel extends PersonModel {}
class SuperVisorModel extends EmployeeModel {}
class StudentModel extends PersonModel {}

// Create a new employee and insert it
$emp = new EmployeeModel() ;
$emp->hireDate = '12/10/06' ;
$emp->position = 'code monkey' ;
$emp->name = 'Fred' ;
$emp->birthdate = '1/1/74' ;

// This will create a record in both the Person table and the Employee table and link them
$emp->insert() ;


// Edit a student record - This shows an alternate way to access fields in the hierarchy to circumvent extended tables with the same field names

// Fetch the record with ID of 10
$student = StudentModel::fetch(10) ;

// This will change the person record and is more verbose
$student->person()->name = 'Wendy' ;
$student->update() ;

// Delete a record
$student->delete() ;

// or
StudentModel::delete(10) ;

// Delete deletes all joined records in the hierarchy, so the above deletes the student record and also the person record
That's the simplest use of it. The above code is 100% complete and will work as long as you include the files and set up the database info.

You can also set it up to validate fields, add custom fields, join other models in one to one, many to one, and many to many relationships, and also automatically generate forms with it.

If you want the code, let me know. It's kind of messy, pretty damn complex, and has a lot of comments. Again, it was a learning experience so it's not the best, but it might give you some insight.
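
To make the "one insert creates a record in both tables and links them" behaviour concrete, here is a rough sketch of the idea. It is not the LGDBModel code, which the thread does not show: the $tables layout is written by hand rather than read from the schema, and splitByTable() and the $insert callback are hypothetical stand-ins for the real SQL.

```php
<?php
// Hypothetical, hand-written layout: which fields live in which table.
$tables = array(
    'Person'   => array('name', 'birthdate'),
    'Employee' => array('hireDate', 'position'),
);

// Split one flat set of field values into one row per table.
function splitByTable(array $tables, array $values)
{
    $rows = array();
    foreach ($tables as $table => $columns) {
        $rows[$table] = array_intersect_key($values, array_flip($columns));
    }
    return $rows;
}

// Stand-in for an INSERT that returns the new primary key.
$stored = array();
$nextId = 1;
$insert = function ($table, array $row) use (&$stored, &$nextId) {
    $id = $nextId++;
    $stored[] = array('table' => $table, 'id' => $id, 'row' => $row);
    return $id;
};

$rows = splitByTable($tables, array(
    'hireDate'  => '2006-12-10',
    'position'  => 'code monkey',
    'name'      => 'Fred',
    'birthdate' => '1974-01-01',
));

// Parent table first, then carry its key into the child row as the foreign key.
$personId = call_user_func($insert, 'Person', $rows['Person']);
$rows['Employee']['personID'] = $personId;
$employeeId = call_user_func($insert, 'Employee', $rows['Employee']);
```

The ordering matters: the child row cannot be written until the parent's key exists, which is also why a delete has to walk the hierarchy in the opposite direction.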

Post by Kieran Huggins »

The theory behind mine is to essentially have a database-saveable XML document, with custom element classes so I can assign things like field validation and property generation.

I save attributes, elements, text and cdata in separate tables for better database performance and easier searching.

The reason I chose this model is that by default it's completely non-restrictive, but it can be extended to restrict anything/everything based on your own custom element class.
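
A small sketch of what a custom element class can look like with PHP's DOM extension, assuming DOMDocument::registerNodeClass() is used to swap in the subclass. The ValidatedElement name and its validation rule are made up for illustration and are not the actual code being described. Note that the element is created through the document and attached before its attributes are set, which sidesteps the constructor pitfall mentioned earlier in the thread.

```php
<?php
// Hypothetical element class; the validation rule is only an example.
class ValidatedElement extends DOMElement
{
    public function isValid()
    {
        return $this->hasAttribute('email')
            && strpos($this->getAttribute('email'), '@') !== false;
    }
}

$doc = new DOMDocument('1.0', 'UTF-8');

// From here on, elements created by this document are ValidatedElement instances.
$doc->registerNodeClass('DOMElement', 'ValidatedElement');

$user = $doc->createElement('user');
$doc->appendChild($user);
$user->setAttribute('email', 'fred@example.com');

$valid = $user->isValid();
$xml = $doc->saveXML($user);
```

With no custom class registered the document accepts anything, and each subclass can add as much restriction as a given element type needs.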

The usage of mine closely resembles Begby's syntax (which isn't all that surprising really) with the exception that I usually attach my nodes to a DOM tree for transformation / output.

I was planning to leave most of the field creation / management in the input form templates, though this is not a requirement.

Any thoughts on l10n? Right now I'm just treating l10n as an attribute so it can be dealt with by XSLT, but I get the nagging feeling that I'm missing something...

Cheers,
Kieran

Post by Christopher »

It is interesting to see the different approaches. We have one system mentioned that uses extending a base class and another with an XML map. It would be interesting to develop a layered system that might support a number of options (e.g. extension or composition) and allow wrappers that could do things like put configuration data in XML maps (or PHP arrays, or INI files, etc.).
(#10850)

Post by Luke »

For now, I am only interested in creating the simplest possible implementation of the Data Mapper pattern. However, I would like to build it in such a way that adding the flexibility (configuration options) arborint is talking about would not be too difficult in the future. Here's a really simplified example for us to pick apart and build upon. This class was pulled, slightly modified, from a book I am reading, and a lot of it seems very stupid to me:

Code:

<?php
abstract class DNS_Db_DataMapper
{
    /**
     * Contains simple database connection object
     * @var DNS_Db_Interface
     */
    protected static $_db = null;
    
    /**
     * Class constructor
     *
     */
    public function __construct()
    {
        if (is_null(self::$_db))
        {
            if(DNS_Registry::getInstance()->isRegistered('database'))
            {
                self::setAdapter(DNS_Registry::getInstance()->get('database'));
                return;
            }
            throw new DNS_Db_DataMapper_Exception(
                'You must set a database adapter before you can instantiate a mapper'
			);
        }
    }
    
    public static function setAdapter(DNS_Db_Interface $db)
    {
        self::$_db = $db;
    }
    
    public function load(DNS_Db_Result $result)
    {
        $array = $result->fetchRow(DNS_Db_Result::DB_FETCHMODE_ASSOC);
        if(empty($array))
        {
            return null;
        }
        return $this->loadArray($array);
    }
    
    /**
     * Load an array into object form
     *
     * @param array $array
     * @return DNS_Db_DataMapper
     */
    public function loadArray(Array $array)
    {
        return $this->_doLoad($array);
    }
    
    public function find($id)
    {
        return $this->_doFind($id);
    }
    
    /**
     * Executes a query and returns the resulting object
     *
     * @param string $sth
     * @param array $values
     * @return DNS_Db_Result
     */
    protected function _doStatement($sth, Array $values)
    {
        $result = self::$_db->execute($sth, $values);
        if(self::$_db->isError())
        {
            throw new DNS_Db_DataMapper_Exception(self::$_db->getError());
        }
        return $result;
    }
    
    abstract public function insert(DNS_Db_DomainObject $object);
    
    abstract public function update(DNS_Db_DomainObject $object);
    
    abstract protected function _doLoad(Array $array);
    
    abstract protected function _doFind($id);
}

/**
 * This class would be for working with a table called 'users'
 */
class Model_User extends DNS_Db_DataMapper
{
    private $_selectStmt;
    
    private $_updateStmt;
    
    private $_insertStmt;
    
    private $_table = 'users';
    
    public function __construct()
    {
        parent::__construct();
        
        $this->_selectStmt = self::$_db->prepare('
			SELECT * FROM `' . $this->_table . '`
			WHERE id=?
		');
        
        // Three placeholders: the two columns to set, then the id for WHERE
        $this->_updateStmt = self::$_db->prepare('
			UPDATE `' . $this->_table . '` SET
			`username` = ?,
			`password` = ?
			WHERE `id` = ?
		');
        
        $this->_insertStmt = self::$_db->prepare('
			INSERT INTO `' . $this->_table . '`
			(id, username, password)
			VALUES
			(?, ?, ?)
		');
    }
    
    // Implements the abstract _doFind() declared in DNS_Db_DataMapper
    protected function _doFind($id)
    {
        $result = $this->_doStatement($this->_selectStmt, array($id));
        return $this->load($result);
    }
    
    // Implements the abstract _doLoad() declared in DNS_Db_DataMapper
    protected function _doLoad(Array $array)
    {
        $object = new DNS_Db_Domain_User($array['id']);
        $object->setUsername($array['username']);
        $object->setPassword($array['password']);
        return $object;
    }
    
    public function insert(DNS_Db_DomainObject $object)
    {
        // A null id leaves it to the database to assign the key
        $values = array(null, $object->getUsername(), $object->getPassword());
        $this->_doStatement($this->_insertStmt, $values);
    }
    
    public function update(DNS_Db_DomainObject $object)
    {
        // Values must follow the placeholder order: username, password, id.
        // getId() is assumed to exist on the domain object; it is not shown here.
        $values = array(
            $object->getUsername(),
            $object->getPassword(),
            $object->getId()
        );
        $this->_doStatement($this->_updateStmt, $values);
    }
    // ...and so on
}
?>
I don't care for the level of automation johno was talking about. I would prefer to have as much control as possible. I do not mind doing things like this:

Code:

<?php
$User = new Model_User;
$User->set('username', 'Steve');
$User->set('password', sha1('lollylee'));

$mapper = new DNS_Db_DataMapper;
$mapper->insert($User);

$User2 = new Model_User;
$User2->loadId(2);
$User2->set('username', 'Bobby');

$mapper->update($User2);
?>
As opposed to:

Code:

<?php
$User = new Model_User;
$User->username = 'Steve';
$User->password = sha1('poololly');

$User2 = new Model_User;
$User2->loadId(2);
$User2->username = 'Stevey2';
?>
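
The explicit style preferred above (change the object, then hand it to a mapper) can be sketched end to end without a database. Since DNS_Db_DataMapper is abstract in the earlier listing and cannot be instantiated directly, this sketch uses its own concrete mapper. UserRecord and InMemoryUserMapper are hypothetical names, and the array inside the mapper stands in for the users table.

```php
<?php
// Hypothetical domain object: holds data, knows nothing about storage.
class UserRecord
{
    private $id = null;
    private $fields = array();

    public function getId() { return $this->id; }
    public function setId($id) { $this->id = $id; }
    public function set($name, $value) { $this->fields[$name] = $value; }
    public function get($name)
    {
        return isset($this->fields[$name]) ? $this->fields[$name] : null;
    }
    public function toArray() { return $this->fields; }
}

// Hypothetical mapper: the only place that touches "storage".
class InMemoryUserMapper
{
    private $rows = array();
    private $nextId = 1;

    public function insert(UserRecord $user)
    {
        $user->setId($this->nextId++);
        $this->rows[$user->getId()] = $user->toArray();
    }

    public function update(UserRecord $user)
    {
        $id = $user->getId();
        if ($id === null || !isset($this->rows[$id])) {
            throw new RuntimeException('Cannot update a user that was never inserted');
        }
        $this->rows[$id] = $user->toArray();
    }

    // Returns a fresh UserRecord built from the stored row, or null.
    public function find($id)
    {
        if (!isset($this->rows[$id])) {
            return null;
        }
        $user = new UserRecord();
        $user->setId($id);
        foreach ($this->rows[$id] as $name => $value) {
            $user->set($name, $value);
        }
        return $user;
    }
}

$mapper = new InMemoryUserMapper();

$user = new UserRecord();
$user->set('username', 'Steve');
$mapper->insert($user);

$user->set('username', 'Bobby');
$beforeUpdate = $mapper->find($user->getId())->get('username'); // storage not touched yet
$mapper->update($user);
$afterUpdate = $mapper->find($user->getId())->get('username');
```

Nothing is written until insert() or update() is called, which is the control being asked for. The cost is that forgetting the update() call silently loses the change.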