ORM Database Abstraction

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

User avatar
allspiritseve
DevNet Resident
Posts: 1174
Joined: Thu Mar 06, 2008 8:23 am
Location: Ann Arbor, MI (USA)

Re: ORM Database Abstraction

Post by allspiritseve »

VirtuosiMedia wrote:Could you elaborate a little more about your suggestion and what advantages/disadvantages it might offer over this method?
Sure. I see your class doing two things: constructing a query, and wrapping the result. You should always endeavor that your classes have one function and one function only. Here's an example of how you can split the querying code away from the result wrapping code:

Code: Select all

 
class UserGateway {
// methods removed
function getByLastName ($lastName) {
$query = new query ($this->db);
$query->select();
$query->where('lastName', '=', $lastName);
$query->order_by('lastName', 'ASC');
$query->limit(5);
$query->offset(5);
$query->execute();
foreach ($query->result as $row):
$users[] = new User ($row);
endforeach;
return $users;
}
function getById ($id) {
$query = new query ($this->db);
$query->select();
$query->where('user_id', '=', $id);
$query->limit(1);
$query->execute();
return new User ($query->result);
}
}
 
This way, you can use the UserGateway like this:

Code: Select all

$gateway = new UserGateway ($db);
$user = $gateway->getById (1);
-OR-
$user = $gateway->getByLastName ('Smith');
That is what I meant by the gateway constructing your query object. A better term would actually be composition. Because the query building code is contained inside the gateway, the client code doesn't need to know its dealing with a database... could be a flatfile, a service, etc. All it knows is how to ask the gateway for the appropriate results, in the form of User objects.

Gotta run, my birthday dinner's ready :D

Cory
User avatar
VirtuosiMedia
Forum Contributor
Posts: 133
Joined: Thu Jun 12, 2008 6:16 pm

Re: ORM Database Abstraction

Post by VirtuosiMedia »

allspiritseve wrote:
VirtuosiMedia wrote:Could you elaborate a little more about your suggestion and what advantages/disadvantages it might offer over this method?
Sure. I see your class doing two things: constructing a query, and wrapping the result. You should always endeavor that your classes have one function and one function only. Here's an example of how you can split the querying code away from the result wrapping code:

Code: Select all

 
class UserGateway {
// methods removed
function getByLastName ($lastName) {
$query = new query ($this->db);
$query->select();
$query->where('lastName', '=', $lastName);
$query->order_by('lastName', 'ASC');
$query->limit(5);
$query->offset(5);
$query->execute();
foreach ($query->result as $row):
$users[] = new User ($row);
endforeach;
return $users;
}
function getById ($id) {
$query = new query ($this->db);
$query->select();
$query->where('user_id', '=', $id);
$query->limit(1);
$query->execute();
return new User ($query->result);
}
}
 
This way, you can use the UserGateway like this:

Code: Select all

$gateway = new UserGateway ($db);
$user = $gateway->getById (1);
-OR-
$user = $gateway->getByLastName ('Smith');
That is what I meant by the gateway constructing your query object. A better term would actually be composition. Because the query building code is contained inside the gateway, the client code doesn't need to know its dealing with a database... could be a flatfile, a service, etc. All it knows is how to ask the gateway for the appropriate results, in the form of User objects.

Gotta run, my birthday dinner's ready :D

Cory
Happy Birthday! Hope you have a great one.

Ok, I see what you're saying. However, I think it sacrifices way too much flexibility to still meet my goals. Let's say, for instance, I also want to get the userGroup, which is stored as a foreign key in the users table. With your method, I would have to create a whole new function or class or modify existing functions. And what if I only want the userGroup sometimes? Do I create another parameter for each method? And if I want it to display 10 results rather than 5, I again come to the choice of adding another parameter or another function altogether. Also, if I decide later on that I want to add a middleName column to my users table, I would have to create another function for that. And that's just for getting the data, nevermind updating it, deleting it, or creating it. The choice quickly becomes very limited functionality with almost no flexibility or an ever-increasing number of methods to cover all the different uses. Now multiply that by every single table in your database. That's a major reason why I chose not to internalize the query construction.

The way I have it set up, I don't have to create each function by hand, it's handled dynamically. So instead of writing this code:

Code: Select all

 
class UserGateway {
    // methods removed
    function getById ($id) {
        $query = new query ($this->db);
        $query->select();
        $query->where('user_id', '=', $id);
        $query->limit(1);
        $query->execute();
        return new User ($query->result);
    }
    function getByFirstName ($firstName) {
        $query = new query ($this->db);
        $query->select();
        $query->where('firstName', '=', $firstName);
        $query->order_by('firstName', 'ASC');
        $query->limit(5);
        $query->offset(5);
        $query->execute();
        foreach ($query->result as $row):
        $users[] = new User ($row);
        endforeach;
        return $users;
    }
    function getByLastName ($lastName) {
        $query = new query ($this->db);
        $query->select();
        $query->where('lastName', '=', $lastName);
        $query->order_by('lastName', 'ASC');
        $query->limit(5);
        $query->offset(5);
        $query->execute();
        foreach ($query->result as $row):
        $users[] = new User ($row);
        endforeach;
        return $users;
    }
    function getByUserName ($userName) {
        $query = new query ($this->db);
        $query->select();
        $query->where('userName', '=', $userName);
        $query->order_by('userName', 'ASC');
        $query->limit(5);
        $query->offset(5);
        $query->execute();
        foreach ($query->result as $row):
        $users[] = new User ($row);
        endforeach;
        return $users;
    }
    function getByEmail ($email) {
        $query = new query ($this->db);
        $query->select();
        $query->where('email', '=', $email);
        $query->order_by('email', 'ASC');
        $query->limit(5);
        $query->offset(5);
        $query->execute();
        foreach ($query->result as $row):
        $users[] = new User ($row);
        endforeach;
        return $users;
    }   
}
I can write this code instead:

Code: Select all

 
<?php
class users extends DBObject {
    function __construct() {
        global $DB2;
        parent::__construct($DB2, 'users', array('userId', 'username', 'firstName', 'lastName', 'email'), 'none');
    }
 
}
?>
 
Adding new columns is easy, all I would have to do to add age, weight, and middleName is change the above to this:

Code: Select all

 
<?php
class users extends DBObject {
    function __construct() {
        global $DB2;
        parent::__construct($DB2, 'users', array('userId', 'username', 'firstName', 'middleName', 'lastName', 'email', 'age', 'weight'), 'none');
    }
 
}
?>
 

That then will dynamically create the access to the data. Granted, I still need to write the query, but if I need a slightly different type of data returned, I don't have to worry about making new methods so that I don't break other instances that use similar ones, I just adjust the query slightly. Accessing the data won't be a whole lot different from your method, but this ends up being a lot more flexible and uses a lot less code.

Right now, before doing any serious optimizing work on it, my MySQL DBObject that constructs the queries is 600 lines of code and I would expect that the DBObjects for PostgreSQL, Oracle, etc would be in the same ballpark. Each of those will handle most of your CRUD functions and have support for all of the methods I mentioned in my first post. Each class that extends the DBObject is what you see above, about 10 lines. I have no idea how that compares to similar projects in terms of efficiency or speed, but it seems somewhat reasonable to me. Just looking over Zend DB, it looks like the select.php file alone is about 1100 lines of code. They handle things a little differently than this does and have a different set of features, though, so I'm not sure if it's the greatest comparison. Like I said, though, I still have to run benchmarks and tests to see how much overhead everything adds.
User avatar
allspiritseve
DevNet Resident
Posts: 1174
Joined: Thu Mar 06, 2008 8:23 am
Location: Ann Arbor, MI (USA)

Re: ORM Database Abstraction

Post by allspiritseve »

Well, you're actually gaining flexibility in the query class-- how would you do a join with your class? Or select from multiple tables? (i.e. inner join). Or subqueries? If your class only deals with the users table, it loses the flexibility inherent in an SQL query-- you're restricted to getting data from one table only.

If your query is simply a wrapper for the database, and handles the DB abstraction, you have all the flexibility and no constraints. That way, you can "dynamically" pull any data you want, from any number of tables, in any order. The same way you can in raw SQL.

Now, if you still want to deal in users (rather than queries) I would still suggest a table data gateway or data mapper. If you only want the usergroup sometimes, use Lazy Load. If you want 5 results, just grab the first 5 from the array it gives you. Or overload the method with order and limit vars. Either would work... but I agree there are positives and negatives to both raw sql and gateways/mappers.

One more question-- what happens when your domain object, ie users, has a property that isn't in the database? Say, $full_name (a composite of $first_name and $last_name). It could be more complex, maybe the number of posts a user has, or the number of contacts they have in an address book. Those types of things aren't fields in the users table. How does your class handle those?
User avatar
VirtuosiMedia
Forum Contributor
Posts: 133
Joined: Thu Jun 12, 2008 6:16 pm

Re: ORM Database Abstraction

Post by VirtuosiMedia »

allspiritseve wrote:Well, you're actually gaining flexibility in the query class-- how would you do a join with your class? Or select from multiple tables? (i.e. inner join). Or subqueries? If your class only deals with the users table, it loses the flexibility inherent in an SQL query-- you're restricted to getting data from one table only.

If your query is simply a wrapper for the database, and handles the DB abstraction, you have all the flexibility and no constraints. That way, you can "dynamically" pull any data you want, from any number of tables, in any order. The same way you can in raw SQL.

Now, if you still want to deal in users (rather than queries) I would still suggest a table data gateway or data mapper. If you only want the usergroup sometimes, use Lazy Load. If you want 5 results, just grab the first 5 from the array it gives you. Or overload the method with order and limit vars. Either would work... but I agree there are positives and negatives to both raw sql and gateways/mappers.

One more question-- what happens when your domain object, ie users, has a property that isn't in the database? Say, $full_name (a composite of $first_name and $last_name). It could be more complex, maybe the number of posts a user has, or the number of contacts they have in an address book. Those types of things aren't fields in the users table. How does your class handle those?
Good questions. It can handle joins fairly easily, but I had to limit it to the join types that are shared across the major database types (left, right, inner, and full). It handles subqueries as well, but I'm not completely sold on the way they are implemented, but I haven't thought of a better solution yet.

Join syntax would be as follows:

Code: Select all

 
$users = new users();
$users->alias('u'); //Gives the main table an alias of 'u'
$users->add_join('left', 'userGroups', 'ug.userId', '=', 'u.userId', 'ug');
$users->select();
 
The above code outputs the following MySQL query:
SELECT * FROM tst_users AS u LEFT JOIN tst_userGroups AS ug ON ug.userId = u.userId
(Above you can see that the table is tst_users rather than users and tst_userGroups. The class can also handle prefixes for databases, even if they are different across systems. It just needs to be specified in a DB_PREFIX constant in the DB connection script.)

The parameters for add_join are:

Code: Select all

add_join($join_type, $table, $column, $operator, $value, $table_alias=NULL)
I'm not totally satisfied with the subqueries syntax. Currently they're built the same way regular queries are, the only difference is that you insert 'subquery' in the optional mode parameter for the select method, which will cause it to return the entire query without executing it.

Code: Select all

 
$users = new users();
$users->where('lastName', '=', 'Smith');
$subquery = $users->select('subquery');
 
The $subquery variable could then be inserted as a parameter when you build your main query. However, I don't know if this is the most efficient way of doing that. It might be fine for small data sets, but it would probably bog down larger data sets. I still have to benchmark it.

You can specify which fields you want by using the select_list() function. It takes an array of the fields you want returned in your query. You can also use it to include aggregate functions, but to keep it portable, you would have to limit the function to something that is common across databases (which the class doesn't validate, but I have a list). The aggregate function is accessed by its full text, the same way a regular field would be.

Code: Select all

 
$users->select_list(array('AVG(age)'));
$users->select('single');
 
echo $users->{'AVG(age)'};
 
If you want to get a simple count of something, you can build your query the way you normally would with this class, and then enter 'count' in as the mode parameter for select().

Code: Select all

 
$users->select('count');
 
If you need to run one query with the users class and then another different query, you can clear the entire query by using the clear_all() function, or you can specify which methods you'd like to clear by using the clear() function:

Code: Select all

 
$users->clear('select_list') //Sets the select_list() method to null, as if it never happened
$users->clear_all(); //nulls all methods
 
As I get my folder organization together a little more, the classes that correspond to tables will probably be prefixed by DB_ so that they can be autoloaded.

The class was written so developers can write code once and have it portable across database types. However, that doesn't mean that it necessarily has to be used that way. It's definitely not perfect and it doesn't cover all of the intricacies of each flavor of SQL, but it can be extended for database-specific optimization and functionality. It is basically just a wrapper for the DB abstraction, but the parts that look like ActiveRecord are just meant to give you easier access to the data and an easy way of setting values before record updates or creation.


Last bumped by VirtuosiMedia on Tue Jul 01, 2008 2:34 am.
Post Reply