Creating an ORM for PHP and would like some input


m4rw3r
Forum Commoner
Posts: 33
Joined: Mon Aug 03, 2009 4:19 pm
Location: Sweden

Creating an ORM for PHP and would like some input

Post by m4rw3r »

Hi!

(This is my first post here, so bear with me :P)

I'm creating an Object-Relational-Mapper for PHP as a school project (it affects my grade etc. but it is permitted to be open source) and I've come quite a long way on it.
But I'm still not sure I'm going in the correct direction with it, as there is so much which can be done (Just look at Doctrine or something similar).

The reason I'm creating my own ORM is because I want something which is easy to use, fast (both in terms of developing with it and in terms of performance),
secure (ie. escape and the usual stuff) and flexible.
I've decided to use PHP 5.2, as 5.3 is going to take some time (I really want to use 5.3, but I think it may be overkill in most areas
(I know ... closures are really nice)), and the ORM also shouldn't require any external dependencies except for the database extension for the server the user is going to use.

Current status:
- DataMapper which maps plain PHP objects (only requires public properties)
- Database Abstraction (I'm not using PDO, as it is harder to cache and also slower if I try to add my own methods to the result objects, objections with motivations welcome ;))
- Multiple database connections (with support for write rerouting, eg. for master-slave databases)
- Query builder (for SELECT, INSERT, UPDATE and DELETE)
- Code generator for specialized mappers (for speed and to avoid complex dynamic methods), transparently generates the mapper
- Relations (Has One, Has Many, Belongs To, Has And Belongs To Many (HABTM))
- Eager loading (explicitly defined/called)
- Related objects aren't loaded automatically, instead it forces the user to utilize a method call to load them (this is good, because the user realizes that it issues a query)
- JOINs (eager loading) of related records can be performed in almost infinite levels (wonder when it is becoming too much for PHP to process, though)
- Configuration written in PHPdoc (will support other configurations as well, but I like the PHPdoc because it stores the configuration in the mapped object without interfering)

Things I will add later:
- Table builder
- Mapping with getters/setters
- Transactions (needed to finish the save() method, which will use transactions because it saves relations too)
- Unit of Work

I've been developing this for a few months, and when I started I did not have a "great" knowledge of Design Patterns
(That means I will rewrite a lot of the base, which I wrote early on).

So what I would like is some feedback on my decisions and also some input about what you think makes a great Object-Relational-Mapper.

Thanks!

PS. I will hopefully have a website with a forum, manual, download etc. up quite soon (a month or two, I hope it won't be longer).
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Creating an ORM for PHP and would like some input

Post by Christopher »

This is a huge topic. Perhaps you could get a little more specific about the parts you have questions about. Each point in your "current status" section could be a full discussion.
(#10850)
m4rw3r
Forum Commoner
Posts: 33
Joined: Mon Aug 03, 2009 4:19 pm
Location: Sweden

Re: Creating an ORM for PHP and would like some input

Post by m4rw3r »

Ok, I can start by showing some usage examples, where I use plain PHP objects:

Code: Select all

 
/**
 * @ot
 */
class Track
{
    /**
     * @ot id
     */
    public $id;
    
    /**
     * @ot
     */
    public $name;
    
    /**
     * @ot
     */
    public $artist;
}
 
$track = new Track();
 
$track->name = 'A Scenery of Loss';
$track->artist = 'Draconian';
 
Db::save($track);
 
echo $track->id;
 
$tracks = Db::find('track')->where('artist', 'Draconian')->get();
 
foreach($tracks as $t)
{
    echo $t->name;
}
 
The @ot doc comment tells my ORM that it should map the class and properties (so if you omit a comment, it won't be mapped).

What do you think about this?
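
A minimal sketch of how the @ot lookup described above could work, using PHP's Reflection API (ReflectionClass::getProperties() and ReflectionProperty::getDocComment()). The helper name ot_mapped_properties(), the extra $scratch property and the returned array shape are assumptions made for illustration; this is not the ORM's actual code.

```php
<?php
// Example class using the @ot annotations from the post above.
class Track
{
    /** @ot id */
    public $id;

    /** @ot */
    public $name;

    // No @ot doc comment, so this property is not mapped.
    public $scratch;
}

/**
 * Hypothetical helper: returns array(propertyName => annotation argument)
 * for every public property whose doc comment contains @ot.
 */
function ot_mapped_properties($className)
{
    $class  = new ReflectionClass($className);
    $mapped = array();

    foreach ($class->getProperties(ReflectionProperty::IS_PUBLIC) as $property) {
        $doc = $property->getDocComment(); // false when there is no doc comment

        // Capture whatever follows @ot on the same line (may be empty).
        if ($doc !== false && preg_match('/@ot\b[ \t]*([^\r\n*]*)/', $doc, $match)) {
            $mapped[$property->getName()] = trim($match[1]);
        }
    }

    return $mapped;
}

print_r(ot_mapped_properties('Track'));
```

One thing to keep in mind with this approach: getDocComment() only works when doc comments are retained at runtime, so an opcode cache configured to strip comments (for example opcache.save_comments=0) would silently disable the mapping.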
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Re: Creating an ORM for PHP and would like some input

Post by Jenk »

I think you should not use comments for functionality. :) How about having a method that returns an array of mappable items? Or better yet, have descriptors similar to something like this:

Code: Select all

public function describeTracks ($table) {
  $table->mapTo('Track'); // class name as a string
  $table->addColumn("id", "id", "Integer"); //column name, property name, type
  $table->addColumn("name", "name", "String");
  $table->addColumn("artist", "artist", "String");
  $table->addIndex(new PrimaryKey("id"));
  // or ...
  $table->setPrimaryKey("id");
}
You could also do joins, if for example the $artist was in fact an object of type Artist, and not just the name as a string.

Code: Select all

$table->addColumn("artist", "artist", new ForeignKey("artists", "id")); // on table "artists" column "id"
Last edited by Jenk on Tue Aug 04, 2009 9:29 am, edited 1 time in total.
m4rw3r
Forum Commoner
Posts: 33
Joined: Mon Aug 03, 2009 4:19 pm
Location: Sweden

Re: Creating an ORM for PHP and would like some input

Post by m4rw3r »

Seems like a good idea, I like it when it is separated from the object like that.
I've seen it before, but then it was in Python, which enables the user to make a few calls.

The problem is that I want the mapped classes to be (almost) completely independent of the mapper, ie. no extra methods etc.
And since the configuration is then decoupled from the mapped class/object, this poses a problem: How can I get the configuration of a related class?
I need the configuration for the related class as it automatically guesses the foreign key names from the singular (usually class name) and the primary key.
Maybe register a static method / function for the configuration? But I'm not sure that is the best way.

Currently I do it like this:
1. Load the class settings from PHPdoc
2. Normalize settings for that class (including relation data, but do not guess any columns/tables yet)
3. Load all the settings for related classes
4. Normalize those
5. Start the relationship guessing for the first class
6. Save that configuration
7. Create the mapper for the first class
(The mappers for the other classes are created when they are requested)

I'm currently doing relations like this:

Code: Select all

// in eg. class group:
/**
 * This property contains the related users.
 * Use the default columns:
 * @ot relates to_many user objects
 */
public $users = array();
That will add a group_id property to the user object if it doesn't exist (dynamically, so it becomes a public property which is added during execution).
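
The foreign key guessing mentioned above (singular class name plus primary key) could be sketched roughly like this. The function name guess_foreign_key() and the CamelCase handling are assumptions for illustration, not the ORM's real logic.

```php
<?php
/**
 * Hypothetical helper: guesses the foreign key column that points at
 * $className, e.g. class "group" with primary key "id" gives "group_id".
 */
function guess_foreign_key($className, $primaryKey = 'id')
{
    // Turn CamelCase into snake_case: "UserGroup" -> "user_group".
    $singular = strtolower(preg_replace('/(?<=[a-z0-9])(?=[A-Z])/', '_', $className));

    return $singular . '_' . $primaryKey;
}

echo guess_foreign_key('group'), "\n";     // group_id
echo guess_foreign_key('UserGroup'), "\n"; // user_group_id
```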

One final thing (and a bit off topic), I don't like the camelCase function/method names, but it seems like a lot of people do, so what is your recommendation there?
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Re: Creating an ORM for PHP and would like some input

Post by Jenk »

You'll simply not cover all the possibilities if you do not have some form of description, be it a comment, or a separate object, or an abstract method that must be implemented. It would not be possible to tell which property is persistent, which isn't, which is a composite, which is a literal, etc. etc. :)

As for camel casing or not.. it's purely preference, but I think the majority dislike underscore separation.
m4rw3r
Forum Commoner
Posts: 33
Joined: Mon Aug 03, 2009 4:19 pm
Location: Sweden

Re: Creating an ORM for PHP and would like some input

Post by m4rw3r »

I know I need some description, but the problem is where to place it.
I want my ORM to be able to be used in any framework / PHP script, and therefore it needs to be adaptable.
That, in turn, means that it should not impose any restriction on where the classes to map are located, nor where their descriptions are either.
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Re: Creating an ORM for PHP and would like some input

Post by Jenk »

I don't see why it would be a restriction to just map the objects in a descriptor? Other ORMs manage this just fine. You don't even need to have a particular interface implemented if you go this route. :)

Code: Select all

// this is the users object, note how it doesn't need any descriptions attached to it..
class Person {
  public $id;
  public $name; 
  public $age;
}

Code: Select all

// .. because they are all encapsulated in their own object here..
// where Descriptor is from your ORM library
class DescriptorForPeople extends Descriptor {
  public function describePerson($table) {
    $table->setClass('Person'); // class name passed as a string
    $table->setTableName("People");
    $table->addColumn("id", "id", "Integer")->bePrimaryKey();
    $table->addColumn("name", "name", "String");
    $table->addColumn("age", "age", "Integer");
  }
}
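
The descriptor above relies on a $table object supplied by the ORM. A rough sketch of what that object might look like follows; the Table and Column classes and their public properties are hypothetical, and only the method names (setClass, setTableName, addColumn, bePrimaryKey) come from the example.

```php
<?php
// Hypothetical column definition; bePrimaryKey() mirrors the call in the descriptor.
class Column
{
    public $name;
    public $property;
    public $type;
    public $primaryKey = false;

    public function __construct($name, $property, $type)
    {
        $this->name     = $name;
        $this->property = $property;
        $this->type     = $type;
    }

    public function bePrimaryKey()
    {
        $this->primaryKey = true;
        return $this; // allow further chaining
    }
}

// Hypothetical table description that a descriptor method fills in.
class Table
{
    public $className;
    public $tableName;
    public $columns = array();

    public function setClass($className) { $this->className = $className; }
    public function setTableName($name)  { $this->tableName = $name; }

    public function addColumn($name, $property, $type)
    {
        // Return the column so calls like ->bePrimaryKey() can be chained.
        return $this->columns[$name] = new Column($name, $property, $type);
    }
}

// Same calls as in describePerson(), made directly on a Table instance.
$table = new Table();
$table->setClass('Person');
$table->setTableName('People');
$table->addColumn('id', 'id', 'Integer')->bePrimaryKey();
$table->addColumn('name', 'name', 'String');
$table->addColumn('age', 'age', 'Integer');
```

Because the mapped class never sees any of this, Person stays a plain object with public properties.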
m4rw3r
Forum Commoner
Posts: 33
Joined: Mon Aug 03, 2009 4:19 pm
Location: Sweden

Re: Creating an ORM for PHP and would like some input

Post by m4rw3r »

Well, the thing is: how to make it easy to include? Autoloader in an assigned directory?
(I use the autoloader with the "standard" conversion of foo_bar -> foo/bar.php for my ORM currently)
Because I don't think it is good practice to store two classes in the same file.

Something like this?

Code: Select all

require 'db.php';
 
Db::setDescriptorDirectory('./descriptors');
 
// add a certain descriptor class, not located in the descriptor directory:
Db::addDescriptor('foobar');
// should I have descriptor objects too? ie. you can create a descriptor object on the fly?
// I see a certain need of it, but as all mappers are compiled for speed, I'm not sure that is a good idea
 
require 'user.php';
 
$u = new User();
$u->name = 'Foobar';
 
Db::save($u);
I also guess I have to rename the Db class (all other classes are namespaced (not PHP 5.3 namespaces, but just prefixed with a "foo_")), because it is too common, am I right?
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Re: Creating an ORM for PHP and would like some input

Post by Jenk »

I never suggested you have both in the same file :)

You could have it stored in a specified location, then the ORM will require it based on its name.. so if it needs a descriptor for Person, the relevant name would be DescriptorForPerson (which is contrary to my previous example, of course)

Then it'd be a case of require/include "DescriptorFor{$className}";
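
As a sketch of that naming convention: the helper names descriptor_file() and load_descriptor(), the ".php" extension and the directory layout below are all assumptions, since the thread only fixes the "DescriptorFor" + class name part.

```php
<?php
/**
 * Hypothetical helper: builds the file path for a class's descriptor,
 * following the "DescriptorFor" + class name convention.
 */
function descriptor_file($directory, $className)
{
    $descriptorClass = 'DescriptorFor' . $className;

    return rtrim($directory, '/') . '/' . $descriptorClass . '.php';
}

/**
 * Loads the descriptor file if it exists; returns the descriptor class
 * name on success or null when no descriptor file was found.
 */
function load_descriptor($directory, $className)
{
    $file = descriptor_file($directory, $className);

    if (!is_file($file)) {
        return null;
    }

    require_once $file;

    return 'DescriptorFor' . $className;
}

echo descriptor_file('./descriptors', 'Person'), "\n"; // ./descriptors/DescriptorForPerson.php
```

Returning null instead of failing leaves room for a fallback, such as a descriptor registered by hand for classes that live outside the descriptor directory.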
m4rw3r
Forum Commoner
Posts: 33
Joined: Mon Aug 03, 2009 4:19 pm
Location: Sweden

Re: Creating an ORM for PHP and would like some input

Post by m4rw3r »

I never said you did, but it is the logical conclusion to store the descriptor with the class if the library doesn't have a default location for it.

Anyway, the descriptors sound like a good way of storing the configuration (which also means that the user can override a lot of the logic generating special things, like foreign keys etc., in a "descriptor-template" for the project).
But I wonder about the people who like YAML and XML for their configurations, maybe I should make it possible to instantiate descriptors on-the-fly, so that an XML/YAML adapter can create them?
Or should I just ignore them?
(but it seems like many like XML/YAML for configurations)

Also, I'm soon going to implement the identity map (which is a map containing all the mapped object instances linked with their primary key, so there won't be two objects referencing the same row during a request), and I wonder about this:
How should it behave?

Two scenarios (simplified code, $map is the identity map, $obj is the mapped object and $data is the result row):
1:

Code: Select all

$id = $data->id; // create the unique pk
 
if(isset($map[$id]))
{
    return $map[$id];
}
else
{
    return $map[$id] = $this->createObject($data); // overly simplified, the createObject() method is normally compiled into $obj->foo = $data->foo; for speed
}
2:

Code: Select all

$id = $data->id;
 
if(isset($map[$id]))
{
     $this->assignData($map[$id], $data);   // update the data in the object
     return $map[$id];
}
else
{
     return $map[$id] = $this->createObject($data);
}
The first approach means that code like the block below only executes one query, because they both reference the same pk (2), and the mapper can check the $map before issuing the query:

Code: Select all

$o = Db::find('user', 2);
$o = Db::find('user', 2);
But imagine that the user ran a custom SQL query which changed the user row.
I don't want to parse every SQL query to update the objects, as the chance of a custom query affecting an already instantiated object is quite small and the performance impact of parsing would be huge.

Instead, the second approach runs the query again, but only updates the existing object instances (which means that they don't get out of sync with each other).
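
To make the difference between the two scenarios concrete, here is a small self-contained sketch. The IdentityMap class, its $refresh flag and the use of plain arrays as result rows are assumptions for illustration (the snippets above use $data objects and compiled mappers).

```php
<?php
class User
{
    public $id;
    public $name;
}

// Hypothetical identity map showing both strategies from the post.
class IdentityMap
{
    private $objects = array();
    private $refresh;

    // $refresh = false: approach 1 (keep the loaded object untouched).
    // $refresh = true:  approach 2 (overwrite its properties with the new row).
    public function __construct($refresh)
    {
        $this->refresh = $refresh;
    }

    public function resolve(array $row)
    {
        $id = $row['id'];

        if (!isset($this->objects[$id])) {
            // First time this primary key is seen: create and remember the object.
            $this->objects[$id] = new User();
            $this->assign($this->objects[$id], $row);
        } elseif ($this->refresh) {
            // Already loaded: update the existing instance in place.
            $this->assign($this->objects[$id], $row);
        }

        return $this->objects[$id];
    }

    private function assign(User $user, array $row)
    {
        $user->id   = $row['id'];
        $user->name = $row['name'];
    }
}

$keep = new IdentityMap(false);
$a = $keep->resolve(array('id' => 2, 'name' => 'Foobar'));
$b = $keep->resolve(array('id' => 2, 'name' => 'Changed'));
var_dump($a === $b, $a->name); // same instance, name is still "Foobar"

$sync = new IdentityMap(true);
$c = $sync->resolve(array('id' => 2, 'name' => 'Foobar'));
$d = $sync->resolve(array('id' => 2, 'name' => 'Changed'));
var_dump($c === $d, $c->name); // same instance, name is now "Changed"
```

In both modes there is only ever one object per primary key; the flag only decides whether a fresh row overwrites data that is already in memory, which would also discard unsaved changes made to that object.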

PS. Looks like I'm going to rewrite about 70% of the code, because I've been reading the book Design Patterns: Elements of Reusable Object-Oriented Software.
I don't mind, that is usually how I work, write a lot of code, test it, then refactor/rewrite it into less code and smaller chunks of code.
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Re: Creating an ORM for PHP and would like some input

Post by Jenk »

You can't cater for everyone :) Personally I hate YAML/XML/SOAP/etc. I'm writing PHP, and don't want to use a DSL. :)

But having said that, it wouldn't be much trouble to write a *ML parser(s) for the descriptors, that could parse something like:

Code: Select all

<table name="People" class="Person">
  <column name="id" type="Integer" property="id" bePrimaryKey="true" />
  <column name="name" type="String" property="name" />
  <column name="age" type="Integer" property="age" />
</table>
or similar.
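
A sketch of such a parser using PHP's SimpleXML extension, assuming the XML layout shown above. The function name parse_table_xml() and the shape of the resulting array are hypothetical.

```php
<?php
$xml = <<<XML
<table name="People" class="Person">
  <column name="id" type="Integer" property="id" bePrimaryKey="true" />
  <column name="name" type="String" property="name" />
  <column name="age" type="Integer" property="age" />
</table>
XML;

/**
 * Hypothetical parser: turns the XML above into a plain array.
 * Requires the SimpleXML extension.
 */
function parse_table_xml($xml)
{
    $table = simplexml_load_string($xml);

    if ($table === false) {
        throw new InvalidArgumentException('Descriptor XML could not be parsed');
    }

    $map = array(
        'table'   => (string) $table['name'],
        'class'   => (string) $table['class'],
        'columns' => array(),
    );

    foreach ($table->column as $column) {
        $map['columns'][(string) $column['name']] = array(
            'property'   => (string) $column['property'],
            'type'       => (string) $column['type'],
            // A missing attribute casts to '', so it counts as "not a primary key".
            'primaryKey' => (string) $column['bePrimaryKey'] === 'true',
        );
    }

    return $map;
}

print_r(parse_table_xml($xml));
```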
m4rw3r
Forum Commoner
Posts: 33
Joined: Mon Aug 03, 2009 4:19 pm
Location: Sweden

Re: Creating an ORM for PHP and would like some input

Post by m4rw3r »

Yeah, figured as much. So I will probably start the rewrite tomorrow; first starting from scratch and building the basic structure, and then I will try to reuse the code from earlier (but if that isn't possible, I'll just reuse the logic).

But what about the identity map question, what do you think? Only load the new records? Or also update already loaded records?
(The first means a teeny bit fewer queries, while the other is probably less error-prone. From what I've seen, most use the first approach by default.)
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Creating an ORM for PHP and would like some input

Post by Christopher »

For the different format maps, I would recommend having all formats generate a standard array based map. Then you can support any format you want.
m4rw3r wrote:But what about the identity map question, what do you think? only load the new records? or also update already loaded records?
(First means a teeny bit less queries, while the other probably is less error prone. From what I've seen, most use the first approach by default)
The only difficult thing about Identity Map is deciding what to use as the key. That is usually the criterion for uniqueness. Simple for records with just a PK, but how do you describe more complex keys?
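
One possible way to handle composite keys, offered only as a sketch: build the map key from all primary key columns in a fixed order. The function name identity_key() and the choice of json_encode() as the encoding are assumptions, not something proposed in the thread.

```php
<?php
/**
 * Hypothetical helper: builds an identity map key from one or more
 * primary key columns, in the order the columns are listed.
 */
function identity_key(array $row, array $primaryKeyColumns)
{
    $values = array();

    foreach ($primaryKeyColumns as $column) {
        // Cast to string so 2 and "2" produce the same key.
        $values[] = (string) $row[$column];
    }

    // json_encode keeps the values unambiguous: ("1", "2|3") and
    // ("1|2", "3") would collide with a plain implode('|', ...).
    return json_encode($values);
}

echo identity_key(array('id' => 2), array('id')), "\n";
echo identity_key(array('order_id' => 7, 'line_no' => 3), array('order_id', 'line_no')), "\n";
```

A single-column primary key is then just the one-element case, so the identity map does not need two code paths.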
(#10850)
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Re: Creating an ORM for PHP and would like some input

Post by josh »

Template method on the mapper. Abstract mapper attempts to discover the schema and inflect methods / properties from field names, user can override and manually tell it what to map. I would focus on this, then you can always write a class that implements the builder pattern to read XML. Don't get hung up on XML syntax; make your API in code first.
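
A rough sketch of the template method idea. All class and method names are hypothetical, and for brevity discovery here reads the class's declared properties instead of the database schema.

```php
<?php
class Person
{
    public $id;
    public $name;
    public $passwordHash;
}

// Hypothetical abstract mapper: describe() is the template method,
// discoverFields() and columnFor() are the steps subclasses may override.
abstract class AbstractMapper
{
    abstract protected function className();

    // Template method: the overall algorithm is fixed here.
    public function describe()
    {
        $columns = array();

        foreach ($this->discoverFields() as $property) {
            $columns[$this->columnFor($property)] = $property;
        }

        return $columns;
    }

    // Default step: map every public property declared on the class.
    protected function discoverFields()
    {
        return array_keys(get_class_vars($this->className()));
    }

    // Default inflection: "passwordHash" -> "password_hash".
    protected function columnFor($property)
    {
        return strtolower(preg_replace('/(?<=[a-z0-9])(?=[A-Z])/', '_', $property));
    }
}

// Relies entirely on automatic discovery.
class PersonMapper extends AbstractMapper
{
    protected function className() { return 'Person'; }
}

// Manual override: the user tells the mapper exactly what to map.
class PublicPersonMapper extends PersonMapper
{
    protected function discoverFields() { return array('id', 'name'); }
}

$auto   = new PersonMapper();
$manual = new PublicPersonMapper();
print_r($auto->describe());
print_r($manual->describe());
```

An XML or YAML reader could later produce the same column array without touching the mapper, which is why the code-level API comes first.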