Page 1 of 2

Object Relational Mapping

Posted: Wed Feb 14, 2007 10:29 am
by Jenk
I've never been one to fully click on with terminology, so that leaves me behind sometimes, irritating others by repeating/regurgitating what others have been harping on about for ages.

So; I would like to ask if someone could explain, in laymans terms, what ORM is.

My current understanding, is that it is as 'simple' as extracting data to suit the needs of your Object(s.)

Ala JOIN's, nested SELECT's and so forth in SQL queries, for one example.

Code: Select all

SELECT `users`.`username`, `privileges`.`access_level` FROM `users` CROSS JOIN `privileges` ON `users`.`userid` = `privileges`.`userid`;
For an object containing Users details and their access levels, as an example.

However - I have seen examples posts on these forums (and others,) where they specify that using __get()/__set() is an almost necessity (i.e. it would be difficult to accomodate without them) for ORM - so have I missed out on something? Is ORM more to do with the ability for your objects to polymorph to your data requirements? Either the shape of your DB, or the shape of your required data? Or is it the ability for your DBMS to polymorph to your Object(s) [as per my example above]?

Yours confuzzled,
Jenk.

Posted: Wed Feb 14, 2007 11:43 am
by Maugrim_The_Reaper
I suppose the key term in ORM is "object persistence". Rather than considering things as rows, records, and tables (RDBMS) you want to work solely in terms of objects, with all the OOP goodness that implies. So for every dataset, you have a specific object each of which can reference other objects (e.g. a Book Object holding references to individual Chapter objects).

If you were designing a space strategy game, you would require Fleet, Ship and Location objects. Each Ship would carry a reference to its member Fleet, etc. The question becomes how to take those objects and store them between requests - i.e. persistence. The obvious solution is to a database. Of course a database (unless a newfangled OD) is typically relational - it stores scalar values not objects. So ORM is the method used to map Objects to record(s) on a database (being the main method of object persistence) and back again.

If you take a simple example (let's presume a 1-1 Object->DB Row mapping) - you have a User Object. This object has id, name, email and website properties. In the simplest type of mapping, each User refers to a database table record with its properties mapping to fields on that record. Creating and storing a User includes a few steps:

1. Assigning property values
2. Generating INSERT sql
3. Executing the INSERT
4. Assigning the id (a table primary key with auto increments) on the origin User Object.

Steps 2-4 are common to the similar Data Access Object pattern - it's a simple CRUD operation.

Updating a User follows the same pattern - optionally a User object could track which fields change to limit the UPDATE statement.
Fetching a User should rely on the primary key with a fetch method like:

Code: Select all

$user->fetchByPk(1);
Where things get complicated is when you depart a strictly 1-1 mapping. Say we decide our User object will hold a reference to a Settings object. So we can grab settings using $user->settings->somekey or $user->getSettings()->get('key') whichever is preferable. This is perfectly acceptable in OOP but our ORM system now needs to recognise that a User object can map to 2 tables - the object is persisted on two rows - a user record and a settings record, likely linked by a foreign key (user id probably).

Still. Creating our object should stay simple.

Code: Select all

$user = new User;
$user->name = 'Jenk';
$user->email = 'jenk@example.com';
$user->website = 'http://www.example.com';

$settings = new Settings;
$settings->somekey = 'somevalue';

$user->settings = $settings;
$user->save(); //also saves Settings automatically and handles assigning the user's id primary/foreign key
Back to some of the syntax. __get and __set (or named accessors) are usually needed for these kinds of objects since setting a value can require some additional work - also it makes it simpler to refactor later. You might want to track which fields are changed, or handle values from any assigned objects. You'll also likely have other methods in such objects to check existence (is the object new, already persisted, in need of an update?) among other things.

Here my ramble ends...:). Not sure if that was an okay explanation or if it's still a bit vague for what you wanted...

Posted: Wed Feb 14, 2007 1:39 pm
by Jenk
That's great, thank you. :)

I shall add more questions/discussion when I get more time!

Posted: Wed Feb 14, 2007 3:03 pm
by johno
JENK: This older thread might help you.

Posted: Wed Feb 14, 2007 4:03 pm
by Christopher
Which reminds me that I still need to add clean/dirty, insert/update and delete to my example.

Posted: Sun Feb 25, 2007 10:57 pm
by Jenk
Right.. pretty sure I have got it now, thanks to the old thread provided by johno, and Maugrim's explanation.

So, for an example, would this be an ORM?

Code: Select all

<?php

class Book // as just about every example I see uses "Book"
{
    private $_db;
    private $_data;

    public function __construct ($db, $id)
    {
        $this->_data = $db->exec('SELECT * FROM `books` WHERE `id` = ' . intval($id));
        $this->_db = $db;
    }

    public function __set ($name, $val)
    {
        if (array_key_exists($name, $this->_data) && $name) $this->_data[$name] = $val;
        else throw new Exception('Column "' . $name . '" does not exist.');
    }

    public function __get ($name)
    {
        if (!isset($this->_data[$name])) 
                throw new Exception('Data item "' . $name . '" does not exist.');

        return $this->_data[$name];
    }

    public function save ()
    {
        $i = 0;
        
        foreach ($this->_data as $key => $val)
        {
            $set .= ($i > 0 ? ', ' : '') 
                . mysql_real_escape_string($key) 
                . "='" . mysql_real_escape_string($val) . '"';
            $i++;
        }

        $this->_db->exec("UPDATE `books` SET {$set} WHERE `id` = '" 
                . mysql_real_escape_string($this->_data['id']) . "'");
    }
}

$book = new Book(new DB_Connection(/* blah .. */), 1);
$book->title = "Jenk's daily digest!";
$book->save();

?>
(ignoring any syntactical or minor mistakes)

Posted: Mon Feb 26, 2007 12:57 am
by Christopher
No, that's Active Record. With O/RM you would do:

Code: Select all

$mapper = new DataMapper(new DB_Connection(/* blah .. */), 1);

$book = $mapper->getInstance('Book');
$book->title = "Jenk's daily digest!";

$magazine = $mapper->getInstance('Magazine');
$magazine>title = "Jenk's a star!";

$mapper->save();     // saves both book and magazine objects
The key to O/RM is that the objects have no persistence logic. The mapper handles that.

Posted: Mon Feb 26, 2007 4:29 am
by Jenk
Right.. I'm getting there I think..

so when $mapper->save(); is called, it will collect the details from the $book and $magazine object, then save to the DB (or other source should it be so - which is half the point, the $book and $magazine objects don't know, nor care, if it is an INI nor SQL data source.) So an almost observable like collection is needed.

Posted: Mon Feb 26, 2007 5:28 am
by Jenk
I've made this very simple case.. I believe it to be correct with ORM? (obviously the save() method would actually save, rather than dump.)

Code: Select all

<?php

class Mapper
{
    private $_objects = array();

    public function getInstance ($name)
    {
        if (!isset($this->_objects[$name])) $this->_objects[$name] = new stdClass;

        return $this->_objects[$name] = new stdClass;
    }
    
    public function save ()
    {
        foreach ($this->_objects as $name => $obj)
        {
            echo "<pre>{$name}\n";
            foreach ($obj as $foo => $bar)
            {
                echo "\t{$foo} = {$bar}\n";
            }
            echo "</pre>\n";
        }
    }
}

$map = new Mapper;

$bleh = $map->getInstance('woo');
$blah = $map->getInstance('foo');

$bleh->foo = 'bar';
$blah->bar = 'foo';

$map->save();

?>
I'm getting confused now though, because I just spotted a PHPBuilder tutorial that shows my first example to be more correct than the latter :\

http://www.phpbuilder.com/columns/mathi ... 40309.php3

(Though I am more inclined to believe the tutorial is mistaken (or rather my take on it) as it makes perfect sense that my first example was/is an active record.. it's a single record, and it's active.. tada!)

Posted: Mon Feb 26, 2007 8:06 am
by Maugrim_The_Reaper
The key to remember is that ORM involves Mapping. So any set of objects, can map to any set of database tables and not necessarily on a 1 to 1 basis. This is where the ActiveRecord differs - it's a 1 to 1 mapping where the object is fully aware of the datasource. So you have two basic properties to watch for:

Does my object know how to save itself?
Does my object always map to a single table row?

Answering Yes to the first means its not ORM. The second is more difficult, but it's more a general pointer that an ORM should allow M to M style mapping, where 1 to 1 is valid but not necessarily the best solution. Taking a Book, assuming you can store one Book to a database is ridiculous - you would want to store Book, Chapter, Page and other related details (Author, Publisher). So while your base object is Book, this maps to a far more than 1 table. Of course you can assume a Book is just an object collection of Table objects but that defeats the point - the Objects are *not* supposed to know that. ;) That thinking is a developer working from the Database Schema up, and ignoring how the objects should be organised to make for better design. In a perfect world you'd have an upfront idea of what objects are needed (Domain), before you jump into database design.

Working in objects only:

Code: Select all

$book = new Book;
$book->title = 'ORM Ramblings or How To Confuse Jenk!';

$publisher = new Publisher;
$publisher->name = 'Maugrim Inc.';
$publisher->type = 'Printed Media';

$book->publisher = $publisher;
$book->save();
Note: Assume the Mapper is called by an object's save() method - just a shortcut proxy passing in a "self" instance.

In the above, it's perfectly good OOP. We have a book, the book has a publisher. Both are representative objects. When the save() method is called, the Mapper will kick in. It will know exactly how to save Books. If you think from the Schema up you might be led towards to a more detailed ActiveRecord/Data Access Object style of:

Code: Select all

$book = new Book;
$book->title = 'ORM Ramblings or How To Confuse Jenk!';

$publisher = new Publisher;
$publisher->name = 'Maugrim Inc.';
$publisher->type = 'Printed Media';
$publisher->save(); // need to save so the primary key is set

$book->publisher_id = $publisher->id;
$book->save();
If you check, the main difference is that the mapping just shifted. Instead of Book mapping to a _book and _publisher tables (1 to many), we now have each object needing to be saved as an individual (1 to 1). The first is ORM, the second is more likely A/R or Data Access Object (assuming the DAO is also just proxied by save() ;)). One other point is that the publisher is no longer a property of a Book - they are now separated objects. Book only holds the Publisher ID. It's clearly more tied in to the database schema than any OOP practice.

Posted: Mon Feb 26, 2007 8:23 am
by Jenk
Gotcha..

So a 'real world' example (which is more prevailent in my php history) would be users and their associated permissions, posts, threads, buddy's etc..

Code: Select all

$settings = Registry::get('SETTINGS');
$jenk = new User('Jenk');
$jenk->setPermissions(new Permissions($settings));
$jenk->createThread('This is the title..');
$jenk->subsribeToThread(123);
$jenk->addBuddy(new User("Jenk's little helper"));
$jenk->save();

Posted: Mon Feb 26, 2007 8:51 am
by Maugrim_The_Reaper
As a complex example - pretty much. Just remember that mapping all object to a single root object doesn't always work or is even desireable. Keep it focused on what an object definitely needs (think the 80/20 rule). In an ORM, calling the getbyPk() method should be capable of reconstructing the originally persisted object to its previous state (i.e. with any Permission/User objects which are children of the main User object).

This is where you see how ORM differs - it's persisting an OO structure composed of related objects to the database, via a Mapper of some sort, not simply a list of scalars.

All this said, it also shows how complex it can get which is why in small applications the A/R pattern and other similer Patterns can be more practical for smaller sets of data. It's usually the simplest and most direct form of object to database abstraction since the object itself is aware of the database schema and often the access methods (i.e. little need for a Mapper).

Posted: Mon Feb 26, 2007 8:58 am
by Jenk
Indeed, it's also become apparent why there are quite large frameworks available for ORM! (It was mostly this that had me intrigued - "why would something like that need a large framework?")

Posted: Mon Feb 26, 2007 9:05 am
by Maugrim_The_Reaper
Reading your example - differentiate between the model and controller code.

A User object containing a Permission object makes sense - both are related sets of data with Permission belonging to a single User. Composition works since this is a model. createThread() is not specific to a user - it's not data, and likely belongs in the controller layer of your overall application. If you consider it another way - adding a createThread() method to User doesn't seem to promote reuse and likely adds coupling since you'll need a Thread object along the way from somewhere. subscribeToThread() is also more C than M, where is the subscription persisted as data?

Posted: Mon Feb 26, 2007 9:13 am
by Jenk
I did have that as "addThread()" but changed it when I added "subscribeToThread()"

Yes, I agree, they would be seperated. subscribeToThread() is simply the thread subscription for notification and so forth. The data itself would be on it's own table, which would have probably nothing more than 3 columns (user id, thread id, notification [y/n])