Page 1 of 1

Rolling Your Own ORM

Posted: Thu Dec 04, 2008 12:01 pm
by volomike
You know, it dawned on me the other day, using PHP5's OOP features with: __construct(), __call(), call_user_func_array(), __get(), and __set(), one could build their own ORM class in a really cool way. It would look like:

Example # 1

Code: Select all

$oRecord = new ORM();
$oRecord->FirstName = 'Volo';
$oRecord->LastName = 'Mike';
$oRecord->ID = 'GENERATE_ID';
$oRecord->Unquote('ID');
$oRecord->Save('users','id = ' . $oRecord->ID);
This would automatically build an empty record object with no properties, and then fill it with 3 columns, convert the column names from property-style to db-style, determine if we need to use an INSERT or an UPDATE based on the where clause we pass in Save(), and then run the SQL. It could also handle checks on the database field values to ensure they are not catching any SQL injection. Also, it would ensure that the UPDATE or INSERT only used applicable columns that it had been set -- a key point in the design. The Unquote() call could ensure that ID is not enclosed with single quotes around it (where, by default, everything would get quotes unless you did this). The 'GENERATE_ID' could be automatically detected by the __set() call and it could replace it with a unique primary key value pulled from a PostgreSQL Sequence, or whatever you want on MySQL, or whatever -- a unique ID.

The __set() routine would also have to be smart and throw an error if one tries to set a property that is being used as a reserved class method like Save(). Although I haven't tested -- perhaps PHP handles that on its own.

or

Example # 2

Code: Select all

$oRecord = new ORM('users','id = 433');
This would automatically build a users record for the row with ID = 433. One could then either update this with Save() or just roll through the properties on it with a foreach($oRecord as $sKey => $sVal), and doing an if/then check on whether the value is an array (such as for a property collection variable that we don't want to misinterpret as a column name).

or

Example # 3

Code: Select all

$oRecord = new ORM('users', 'id = 433');
$oRecord->FirstName = 'Mike';
$oRecord->Save('users','id = 433');
This would automatically build a users record for the row with ID = 433, overwrite the FirstName with a new value, and save it back to the same record in the database table. However, since we created a new ORM object with a table designation and a row, the UPDATE statement in this case has every column and is unlike the first example.

You could also add class methods like Remove(), Update(), Insert(), and Select() to provide even more control.

Another opportunity could be to handle multiple records like so:

Example # 4

Code: Select all

$oRecords = new ORM('users',"last_name = 'Smith'", MULTIPLE);
foreach ($oRecords as $oRecord) {
  // do something with properties or class methods of $oRecord
}
Some other things to think about -- with PostgreSQL and MySQL, there are tables you can inspect, or PHP functions you can call, to determine whether columns are going to need a single quote or not. Therefore, in examples 2, 3, and 4, you wouldn't need to run Unquote() because it would already know. Of course, that has a performance hit, so you could perhaps have a parameter you could pass to turn that off, or you could cache that information in shared memory.

And here's some handy functions I wrote that convert object Property Names to/from DB Column Names, and which are used when calling __construct(), Save(), Remove(), Update(), Insert(), or Select() (-- another key point). Note these change the original value and don't return a value, saving you a step.

Code: Select all

public function MakeObjectKeyName(&$sKey) {
    $sKey = strtolower($sKey);
    $sKey = str_replace('fkey_','fk_',$sKey);
    $sKey = str_replace('dt_','Date_',$sKey);
    $sKey = str_replace('_',' ',$sKey);
    $sKey = ucwords($sKey);
    $sKey = str_replace('Ip ','IP ',$sKey);
    $sKey = str_replace(' Ip',' IP',$sKey);
    $sKey = str_replace('name','Name',$sKey); //not a bug -- I conjoin name this way usually
    $sKey = str_replace(' Id',' ID',$sKey);
    $sKey = str_replace('Id ','ID ',$sKey);
    $sKey = str_replace(' ','',$sKey);
    $sKey = ($sKey == 'Id') ? 'ID' : $sKey;
    $sKey = str_replace('Fk','FK',$sKey);
}
 
public function MakeDBKeyName(&$sKey) {
    foreach (range('A','Z') as $c) {
        $sKey = str_replace($c, "_$c",$sKey);
    }
    $sKey = str_replace('_F_K_','_fkey_',$sKey);
    $sKey = str_replace('_I_D_','_id_',$sKey);
    $sKey = str_replace('_I_D','_id',$sKey);
    $sKey = str_replace('_I_P_','_ip_',$sKey);
    $sKey = str_replace('_I_P','_ip',$sKey);
    $sKey = str_replace('_Name','name',$sKey); //not a bug -- I conjoin name this way usually
    $sKey = str_replace('_Date_','_dt_',$sKey);
    $sKey = str_replace('_Date','_Date',$sKey);
    if (strpos($sKey, '_') == 0) {
        $sKey = substr($sKey, 1);
    }
    $sLast = substr($sKey, -1);
    if ($sLast == '_') {
        $sKey = strrev($sKey);
        if (strpos($sKey, '_') == 0) {
            $sKey = substr($sKey, 1);
        }
        $sKey = strrev($sKey);
    }
    $sKey = strtolower($sKey);
}
You could play with those column translation functions as you see fit, using your own tastes.

P.S. Also, some newbies who read this might not know you can usually convert a database record or an array into an object like so:

$oRecord = (object) array('FirstName' => 'Volo', 'LastName' => 'Mike');

and

$oRecord = (object) $rwRow;

..which might be useful in your ORM design.

Re: Rolling Your Own ORM

Posted: Thu Dec 04, 2008 12:18 pm
by volomike
I'd also like to mention why I think an ORM is important. Some people like to store all their SQL as parameterized SQL in a central conf file, then call it, set the parameters, and run it. Some people may want to spray their SQL all throughout their project. However, let's look at that.

When you use parameterized SQL, you have to spend several hours on projects in the process of building SQL statements, and then running a routine to load that statement, set parameters, and then run it. Also, when you use SQL to get a result back, you then have to set variables to store column values unless you like playing with the actual record row as it comes in record resource from the database. So, wouldn't it be nice to have much of this busy work done for you? That's where an ORM can help.

When you spray your SQL all throughout the project, it just looks messy and future programmers who inherit your work will find it hard to interpret SQL statements like this:

Code: Select all

$sSQL = "SELECT " . $sTable . " WHERE " . $sCol[45] . " = '" . Filter::CleanMyValue($sVal[45]) . "';";
Of course, that's an oversimplification of what I've seen -- most projects I've seen that do that sort of monkey-business have far longer SQL statements that wrap several lines, and it's a madhouse to have to piece through this to determine what the heck is going on. Oh, and I love projects that do this craziness:

Code: Select all

 
$sSQL="SELECT ".$sTable." WHERE ".$sCol[45]." = '".Filter::CleanMyValue($sVal[45])."';";
I wish people would put spaces before and after their dot concats because it makes otherwise it makes it hard to read.

So, this too is also where an ORM can help and make things more intuitive for future programmers.

Then comes the case of buy vs build vs look for F/OSS for your ORM. In my case, I like to build one, but keep it like the simple, straight-forward example I had in my first post in this thread. That way, everyone coming to it can pretty much understand what I'm doing. I also like to build it because then I don't have to wait for someone else to fix their ORM framework bugs.

I may change my thoughts on this as I move along in time, but right now, this is how I feel.

Re: Rolling Your Own ORM

Posted: Thu Dec 04, 2008 3:51 pm
by volomike
Someone mentioned to me, yes, but what if I want to select from multiple tables? Hey, that's perfectly fine. That's why I was going to propose a Select() method, and it would work sort of like this:

Code: Select all

$oRecords = new ORM(FALSE,FALSE,MULTIPLE);
$sSQL = "
SELECT
  u.first_name,
  u.last_name,
  ind.label
FROM
  users AS u
JOIN
  industries AS ind
ON
  u.fkey_industry_id = ind.id
WHERE
  u.dt_created >= '2008-01-01 00:00:00'
ORDER BY
  u.last_name, u.first_name;
";
$oRecords.Select($sSQL);
foreach ($oRecords as $oRecord) {
  //do something with $oRecord, which would contain properties FirstName, LastName, and Label
}
 

Re: Rolling Your Own ORM

Posted: Thu Dec 04, 2008 4:48 pm
by jayshields
Sorry if this is going to drag the thread off-topic, but I've got a few questions about ORMs in general, which your previous post sort of spouted, because it contradicts any plus points for ORMs that you've previously made.

I've got a huge project right now, which works with a MySQL database from both Java and PHP, and most of the queries are as long and complex as your previous post example, if not more so. (side topic to this would be "is it worth learning/writing 2 ORMs for both the languages or just doing it as normal?")

More on-topic...as I understand, ORMs are used to grab objects from a non-ORDBMS, eg. a RDBMS. So, straight away, you can't just grab partial objects - because if I've got a Waiter class, I'm not going to let you initialise an instance of it with just the Waiters age. You need to grab the whole row each time - performance hit. Another thing is if I create a class for, say, a Waiter, it won't have an ID in it, but in my database I've got an ID. So, if I want to get a Waiter ID from my database how do I do it? Another thing related to partial objects - what if I want every Menu Item from my database, along with how many times each one has ever been ordered? What object do I put that data into?

To me ORMs seem like you're grabbing the data from a RDBMS, putting it into an object just to get it straight back out of the object again. What advantage does having it in an object ever give you?

If I wanted persistent objects you'd use serialization or an ORDBMS, right?

Re: Rolling Your Own ORM

Posted: Thu Dec 04, 2008 5:53 pm
by Eran
For web applications, I feel that a DataGateway is a better pattern than a full-blown ORM. An ORM seems more appropriate for persistent applications.

Re: Rolling Your Own ORM

Posted: Fri Dec 05, 2008 12:41 am
by Christopher
We've had a long discussion elsewhere about Gateways and Mappers, and yes pytrin is right that a Gateway may be more what jayshields is looking for. A key feature of DataMappers and OR/M is that they do mapping between objects and tables. They are a solution to the problem of when either the objects or the tables can change -- especially if those changes are beyond your control. So the first question you need to ask is whether you need that solution. If you don't then Gateways can provide much the same functionality without the complexity and overhead of doing the mappings.

Re: Rolling Your Own ORM

Posted: Fri Dec 05, 2008 1:33 am
by volomike
I use an ORM because I realized in a sense I was making one on each page when I used my DB abstraction class and was putting data in or out of the database. I am constantly looking at ways to type less, have leaner code, work faster for my clients, and so an ORM can do that for you. It can cut the "busy work" out of interacting with your database.

You might want to ask yourself -- in a sense are you like building an ORM with every page you work on as you work to get stuff in and out of your database? If you are, then you probably need to just get an ORM or roll your own.

However, typically I find that ORM frameworks have reams and reams of code behind them, or require a lot of special files to be edited before they can be used, or have heavy dependency installation issues -- what a mess. Outlet ORM was a step in the right direction, but then I thought I could do it even leaner than that as I mentioned in the first post of this thread.

I think I have very strong skills in SQL, so using an ORM to get away from that wasn't my goal. It just was that I was tired of typing all that out. For instance, who wants to type out all those UPDATE and INSERT statements all the time when you can have a tool do it for you? It saves a tremendous amount of time.

Re: Rolling Your Own ORM

Posted: Fri Dec 05, 2008 3:05 am
by Christopher
I am wondering from your code and descriptions whether you are actually implementing Gateways and not doing OR/M.

Re: Rolling Your Own ORM

Posted: Fri Dec 05, 2008 3:40 am
by VladSun
I have some questions and comments on implementation:
- if you create an ORM object with MULTIPLE param set to true, will the Save() method work?
- if you create an ORM object from a SQL query with JOINS, will the Save() method work?
- I think it will be better if you precheck what fields exists in the DB table, so you will be able to raise an error when a non existing field is assigned a value. This way, there is no need to use __set()/__get() methods;

Re: Rolling Your Own ORM

Posted: Fri Dec 05, 2008 9:34 am
by volomike
VladSun wrote:I have some questions and comments on implementation:
1. if you create an ORM object with MULTIPLE param set to true, will the Save() method work?
2. if you create an ORM object from a SQL query with JOINS, will the Save() method work?
3. I think it will be better if you precheck what fields exists in the DB table, so you will be able to raise an error when a non existing field is assigned a value. This way, there is no need to use __set()/__get() methods;
Of course I'm speaking figuratively here, because I haven't actually written this thing. I'm working with a variation of it on a large project, but then I got to thinking how it could be even better, and so I started this thread with that thought.

1. No, not on the collection it returns, but on the individual objects that collection contains.

2. No, in fact, it won't even have Save() and other class methods if it detects multiple tables. It would just have properties if it's one row returned, or a collection of oRecord if several rows returned.

3. That pre-check has a performance hit if you do that every time. I suppose one could have an optional parameter to do that pre-check if one wants it.

Re: Rolling Your Own ORM

Posted: Fri Dec 05, 2008 10:06 am
by Christopher
In looking at your code above, I think instead of the parameter-y:

Code: Select all

$oRecords = new ORM('users',"last_name = 'Smith'", MULTIPLE);
I would rather see something like:

Code: Select all

classs users extends ORM {
    function init() {
        $this->where = "last_name = 'Smith'"
        $this->whatever = MULTIPLE;
    }
}
That way you could just load a "users" object and use it without configuration about it leaking into a higher layer.

Re: Rolling Your Own ORM

Posted: Fri Dec 05, 2008 10:46 am
by volomike
arborint wrote:That way you could just load a "users" object and use it without configuration about it leaking into a higher layer.
Fascinating. I didn't think about it that way, but what you're saying here is that you use ORM as your tool, and then people would interact with the abstraction of that tool. Now, the ORM is an abstraction on the swappable DB class, so this is an abstraction to the abstraction. And essentially when ever you want to deal with the users table in the database, you would use its own variation of class methods like Add(), Remove(), Find(), etc. and it in turn would use the ORM class with Save(), Delete(), Select(), and so on to accomplish its goals. I can see its advantages and disadvantages.

To me, raising up to this level of abstraction introduces a longer timeline on completing projects. I mean, I'd have to create a class for every meaningful, frequently-used table in my database design, and make it extend the ORM class. New developers coming to the project would then not only have to learn the ORM class, but learn the methods of that abstracted class. If they are in the mindset of getting certain data in and out of the database, they may be frustrated on a new project, where we may see like 50 of these abstracted classes, to have to learn the ins and outs of them instead of jumping right into the ORM class to get what they want.

On the other hand, if you are picky about people interacting directly with the database without following a certain set of ground rules, having this abstraction class on top of ORM might fit the bill.

So, as a freelancer working alone, trying to meet tough deadlines and beat the competition, I can see myself appreciating just a few small layers of abstraction with the DB and ORM class. But on a large project with several teammates, where it's already established that there's going to be a learning curve for new devs because this is a large project, then I can see the advantage of your abstraction suggestion to extend the ORM class.

Re: Rolling Your Own ORM

Posted: Fri Dec 05, 2008 11:02 am
by Christopher
volomike wrote:To me, raising up to this level of abstraction introduces a longer timeline on completing projects.
If you consider typing your one line much more time consuming that my four, then yes. But if you put your line more than one place in the application -- and need to change it -- now I am saving time over you. In my experience, changes are what cut into my profit.

Re: Rolling Your Own ORM

Posted: Fri Dec 05, 2008 11:21 am
by Eran
To me, raising up to this level of abstraction introduces a longer timeline on completing projects. I mean, I'd have to create a class for every meaningful, frequently-used table in my database design, and make it extend the ORM class.
This is actually the most common usage of ORM and other database table abstractions. This kind of class is called a Domain model, and repesents an entity in your application domain. Models become containers for logic associated with them, creating a natural scope for encapsulating domain logic.