Statistical lazy loading idea
Posted: Sun Nov 30, 2008 5:48 pm
Something I'd like to see in an ORM in the future is statistically influenced loading... imagine you could define "watch points" in your model and have the ORM track use cases to tune performance on...
Say a user belongs to zero or more groups and has many posts (blog/forum?) that are ordered by date, descending, when accessed through the model.
Hmmm... maybe an example (in pseudo code):
    model:
    class User extends SmartORM
      # properties
      property id, Integer, auto-inc
      property username, String
      property name, String
      property description, Text
      property signature, Text

      # associations
      belongs_to groups
      has_many posts, order_by(date, desc)

      # watch points
      watch association groups
      watch property id
      watch count(posts)
      watch Request.controller
    end

Now, traditional ORMs decide which fields to retrieve based on field type (or don't decide at all!)... Most will let us override lazy/eager loading at the model layer, and sometimes when loading the object (explicitly, in the controller).
Sooo... how about an intelligent, predictive decision based on past experience? The ORM could track certain conditions and work out the cost/benefit of grabbing certain columns / associations based on past behaviour. For instance, maybe for a user the "name" property is accessed almost all the time, whereas the "username" property is only accessed 5% of the time (during login). Or better yet - the "description" is accessed only when a user's profile is shown, but the "signature" is accessed every time a post is viewed. Since these are both Text columns, they incur a relatively large cost on the DB that is sometimes, but not often, necessary.
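To make the cost/benefit idea a bit more concrete, here's a minimal sketch in Python. Everything in it is made up for illustration: the `AccessStats` class, the "eager if access rate >= relative cost" rule, the cost numbers, and the 10% / 90% rates for "description" and "signature" (only the 5% for "username" comes from the text above). A real ORM would need a much better cost model than this.

```python
from collections import Counter


class AccessStats:
    """Tracks how often each property is read after a load (hypothetical helper)."""

    def __init__(self):
        self.loads = 0
        self.reads = Counter()

    def record_load(self, accessed):
        # Call once per loaded object with the properties that were actually read.
        self.loads += 1
        self.reads.update(set(accessed))

    def access_rate(self, prop):
        return self.reads[prop] / self.loads if self.loads else 0.0

    def eager_columns(self, costs):
        # Assumed rule: fetch a column up front when its observed access rate
        # is at least its relative cost (0.0 = free, 1.0 = very expensive).
        return sorted(p for p, cost in costs.items() if self.access_rate(p) >= cost)


# Simulate 100 loads of a User.
stats = AccessStats()
for i in range(100):
    accessed = {"name"}                # read on every load in this toy run
    if i < 5:
        accessed.add("username")       # ~5% of loads (login)
    if i < 10:
        accessed.add("description")    # profile views only (10% is made up)
    if i < 90:
        accessed.add("signature")      # most post views (90% is made up)
    stats.record_load(accessed)

# Made-up relative costs: String columns are cheap, Text columns are not.
costs = {"name": 0.1, "username": 0.1, "description": 0.5, "signature": 0.5}
eager = stats.eager_columns(costs)
print(eager)  # ['name', 'signature']
```

With these numbers "name" and "signature" get fetched up front, while "username" and "description" stay lazy until something actually asks for them.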
The ORM could track key usage stats in memory (or a HEAP table, depending on the platform/arch). And we could manually add specific data points to watch and analyze whenever a decision needs to be made. Maybe dump the totals to a long-term store every once in a while for safe-keeping, but this isn't the kind of data where we need to keep every last piece. Losing minutes, hours or even days of data would be increasingly inconsequential as time went on. Stats are cool like that.
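Roughly what I have in mind for the "in memory, dump occasionally" part, as a Python sketch. The `UsageStats` class, the JSON file and the five-minute interval are all assumptions for the sake of the example; the long-term store could just as well be a DB table.

```python
import json
import os
import tempfile
import time
from collections import Counter


class UsageStats:
    """In-memory counters, dumped to a JSON file now and then (hypothetical)."""

    def __init__(self, path, flush_every=300.0):
        self.path = path
        self.flush_every = flush_every        # seconds between dumps
        self.totals = Counter(self._read())   # resume from the last dump, if any
        self.last_flush = time.monotonic()

    def _read(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as fh:
            return json.load(fh)

    def record(self, key):
        self.totals[key] += 1
        if time.monotonic() - self.last_flush >= self.flush_every:
            self.flush()

    def flush(self):
        # Write to a temp file first so a crash mid-write can't corrupt the dump.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as fh:
            json.dump(self.totals, fh)
        os.replace(tmp, self.path)
        self.last_flush = time.monotonic()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "stats.json")
    stats = UsageStats(path)
    stats.record("User.name")
    stats.record("User.name")
    stats.flush()
    stats.record("User.signature")   # never flushed, so it is lost below
    restored = UsageStats(path)      # simulate a restart
    restored_totals = dict(restored.totals)

print(restored_totals)  # {'User.name': 2}
```

The one "User.signature" hit is gone after the restart, and that's fine: once the totals are in the thousands, a few missing counts won't change any decision.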
In the example above, I added four watch points to the default set:
* watch the "groups" association - maybe group membership alters what data is often needed from the DB
* watch the "id" property - decisions might be atomic (made per individual user), which would probably suck, performance-wise
* watch the number of posts a user has - maybe there are many users with zero posts, in which case a signature would almost never need to be loaded
* watch the controller that made the request - e.g. the "Login" controller would need different fields on average than the "Posts" controller - no need to hard-code the rules in the controller!
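The watch points above could work as bucket keys for the stats, something like this Python sketch. `WatchedStats`, the 50% threshold, and boiling "count(posts)" down to a `has_posts` flag are my own simplifications here, not a worked-out design.

```python
class WatchedStats:
    """Access stats bucketed by watch-point values (hypothetical sketch)."""

    def __init__(self, watch_points):
        self.watch_points = watch_points   # names of the context values to bucket by
        self.loads = {}                    # bucket -> number of loads seen
        self.reads = {}                    # bucket -> {property: read count}

    def _bucket(self, context):
        return tuple(context.get(name) for name in self.watch_points)

    def record(self, context, accessed):
        bucket = self._bucket(context)
        self.loads[bucket] = self.loads.get(bucket, 0) + 1
        counts = self.reads.setdefault(bucket, {})
        for prop in accessed:
            counts[prop] = counts.get(prop, 0) + 1

    def columns_for(self, context, threshold=0.5):
        # Returns None when there is no history for this bucket, so the caller
        # can fall back to the model's default lazy/eager settings.
        bucket = self._bucket(context)
        loads = self.loads.get(bucket, 0)
        if loads == 0:
            return None
        counts = self.reads.get(bucket, {})
        return sorted(p for p, n in counts.items() if n / loads >= threshold)


stats = WatchedStats(["controller", "has_posts"])
for _ in range(20):
    stats.record({"controller": "Login", "has_posts": True}, {"username"})
    stats.record({"controller": "Posts", "has_posts": True}, {"name", "signature"})
    stats.record({"controller": "Posts", "has_posts": False}, {"name"})

login_cols = stats.columns_for({"controller": "Login", "has_posts": True})
posts_cols = stats.columns_for({"controller": "Posts", "has_posts": True})
no_posts_cols = stats.columns_for({"controller": "Posts", "has_posts": False})
unseen = stats.columns_for({"controller": "Admin", "has_posts": True})
print(login_cols, posts_cols, no_posts_cols, unseen)
```

Note that bucketing on "id" would give one bucket per user, which is exactly the performance worry from the second bullet.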
Anyway... this post is running a tad long... would love to hear some feedback.
Still
Kieran