Adodb-lite

PHP programming forum. Ask questions or help people concerning PHP code. Don't understand a function? Need help implementing a class? Don't understand a class? Here is where to ask. Remember to do your homework!

Moderator: General Moderators

Post Reply
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Adodb-lite

Post by Roja »

I've run into several issues with implementing adodb-lite.

There is no session-clob support, which is needed for large object session variables on Postgresql. Since the adodb session table on postgresql uses a clob by default, thats a critical need.

It appears that CacheExecute is not implemented, is there an alternative?

The ADODB_Session::dataFieldName method is not available, is there an alternative?

ADODB_Session::encryptionKey method is not available, is there an alternative?

There is no perf mon class, is there an alternative?
User avatar
AKA Panama Jack
Forum Regular
Posts: 878
Joined: Mon Nov 14, 2005 4:21 pm

Re: Adodb-lite

Post by AKA Panama Jack »

Roja wrote:I've run into several issues with implementing adodb-lite.

There is no session-clob support, which is needed for large object session variables on Postgresql. Since the adodb session table on postgresql uses a clob by default, thats a critical need.
Postgresql doesn't even have a CLOB data type. I spent the last hour and a half reading up on Postgresql and Clob/Blob support. You will find that If you check the datadictionary for ADOdb you will find they use the TEXT datatype for large character fields. And with Postgresql 7 and higher the VARCHAR datatype can use very large values like VARCHAR(4000) for a fixed 4,000 character field. Basically if you enable CLOBs on a Postgresql session you will greatly slow down the session handler as it executes a bunch of totally useless code to store virtually the same data into a TEXT field. Yes, the ADOdb session handler uses a TEXT datatype when Postgresql is selected.

Out of all of databases that ADOdb supports only Oracle actually has a CLOB datatype. Enabling CLOBs on any database other than Oracle doesn't do anything but increase the execution time of the session handler.

If you are using the UpdateClob function in ADOdb for Postgresql all you are doing is this...

Code: Select all

$this->Execute("UPDATE $table SET $column=? WHERE $where",array($val))
You can do that without using the UpdateClob function and save yourself a step.

So no, I will not add CLOB support for the session handler until I add Oracle support.
Roja wrote:It appears that CacheExecute is not implemented, is there an alternative?
Now this is a sticking point. I will eventually create a Caching Module for ADOdb Lite but I cannot see any real improvement in doing so as caching has a couple of drawbacks.

But there is a misnomer about caching and adodb. Caching only helps if you are accessing huge tables, with poor indexing, that have queries scanning the entire table before sending the data to the client.

In otherwords caching will only help if you have tables with hundreds of thousands of entries with a large number of fields and no limit queries on those fields. No limit queries will cause a scan of the entire table for the data on most databases.

Also, the overhead in ADOdb for reading and unserializing in the serialized data from the file it was stored in can take as much overhead as the original query if the initial query returned a large resultset.

I have performed a number of tests on ADOdb using both cached and uncached queries and most of the time the uncached queries were faster. This is because mysql can usually parse the table faster than PHP can perform the load and unserialize of the data. Now if the stored resultset is incredably SMALL then you can have a speed advantage. Caching in ADOdb CAN be faster than the database queries if the database system being used is slow by default but databases like MySql do not have this problem from my testing.

There are two advantages to using caching queries.

1. The CPU load is transfered to the client server from the database server.

2. The cached query CAN be faster if the database server is storing the data on very slow hard drives or the database itself is very slow.

But in many cases caching will not speed up the queries but can slow them down.

Another disadvantage to the caching scheme used in ADOdb is it cannot take into account changes to the database. If you use the default 60 MINUTE cache time in ADOdb then no changes to the database will be visable until 60 minutes after the initial cached query is executed. You could have 1,000 changes to the database table during that 60 minute period. None of them will show up in the cached query until 60 minutes has timed out.

In otherwords the caching system in ADOdb is very rudamentary and cannot detect when there have been changes made to the database tables. So old outdated data is used by the cached query. If ADOdb was able to detect when a table was updated then it could clear the cached data and execute a noncached query to rebuild the cache data. But sadly that is not possible and many databases do not offer the ability to detect that a table has been updated. In this case caching should only be used on tables that do not update their data frequently.

If you are using MySql you would be better off using the built in caching features it has instead of caching in ADOdb. If caching is enabled in Mysql then all queries are automatically cached and if a table is updated it knows to clear the cache for that query and rebuild it for you. The
built in caching in Mysql is also MANY times faster than anything ADOdb or any PHP based caching program could come up with.

MySql also has the ability to check when a table has been last modified. When I eventually create the caching module for MySql it will not use the 60 minute cache delay but will check when the table was last updated. This will atleast make the MySql versions of the cache module intelligent enough to rebuild the query data when it has been previously updated.

I plan on adding caching to ADOdb Lite but it is a very low priority at the moment due to caching not being that much better than straight queries.
Roja wrote:The ADODB_Session::dataFieldName method is not available, is there an alternative?
No, I did not support that function but it was easy to add. :) For the next release the dataFieldName function is supported.
Roja wrote:ADODB_Session::encryptionKey method is not available, is there an alternative?
It is there... Scroll to the bottom of the class and you will see the function.
Roja wrote:There is no perf mon class, is there an alternative?
No, the performance monitor is not support at this time. When I have the time I will be porting it as it will need to be rewritten so it can interface with ADOdb Lite. Plus, I will need to get the meta_module completed for all databases. The meta module will contain all of the seldom used meta functions.
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Re: Adodb-lite

Post by Roja »

AKA Panama Jack wrote:So no, I will not add CLOB support for the session handler until I add Oracle support.
Okay, not supported.
AKA Panama Jack wrote:I plan on adding caching to ADOdb Lite but it is a very low priority at the moment due to caching not being that much better than straight queries.
Not supported.
AKA Panama Jack wrote:No, I did not support that function but it was easy to add. :) For the next release the dataFieldName function is supported.
Awesome.
AKA Panama Jack wrote:
Roja wrote:ADODB_Session::encryptionKey method is not available, is there an alternative?
It is there... Scroll to the bottom of the class and you will see the function.
Odd. It was already coded to use it in BNT, and it threw the error. But you are right, I do see it in the file. Not sure on this one. I'll tinker a bit.
AKA Panama Jack wrote:No, the performance monitor is not support at this time. When I have the time I will be porting it as it will need to be rewritten so it can interface with ADOdb Lite. Plus, I will need to get the meta_module completed for all databases. The meta module will contain all of the seldom used meta functions.
Yeah, thats a killer feature for me. Can't possibly leave ADOdb without it.

Like I said in the other thread, there are just a number of features that I use, that probably aren't priorities for you. I won't drag this out with the gory details about the clob support, or arguing the point about caching. Both are already in use, actively, and contributing to my codebase for a reason.

Until adodb-lite provides feature parity, its not a competitor for me. I definitely appreciate the memory and overhead reductions, but for what *I* need to do, its simply not in the running yet.
User avatar
AKA Panama Jack
Forum Regular
Posts: 878
Joined: Mon Nov 14, 2005 4:21 pm

Re: Adodb-lite

Post by AKA Panama Jack »

Roja wrote:Like I said in the other thread, there are just a number of features that I use, that probably aren't priorities for you. I won't drag this out with the gory details about the clob support, or arguing the point about caching. Both are already in use, actively, and contributing to my codebase for a reason.
Actually, if you use the ADOdb Session code the way it comes and how they indicate it SHOULD be setup it will not create an a bytea field for the data but it will create a TEXT field. You basically have to force it which breaks the ADOdb meta fields.

The session_schema.xml from ADOdb uses this for the data field.

Code: Select all

<field name="DATA" type="XL">
      <descr></descr>
      <NOTNULL/>
    </field>
The XL data type. Now check the ActualType function for the metafields in the postgres datadictionary. It's text. :) Sure you can modify it to anything you want but with postgres there isn't any need to use clobs as the varchar and text datatypes work exactly as they do with a database like mysql. And if you look at the code for ADOdb sessions the only real clob support is for Oracle.

Now there really isn't any reason to use a binary field for sessions unless you are using mcrypt or secret. MD5 uses the base64_encode function to convert the binary to text. I bet you are using MD5 since not every server will have mcrypt installed but MD5 can be used about anywhere. So the data isn't stored binary.

The SHA1 encryption I included also uses base64_encode/decode functions to store the data as text.

And thanks for point that out about the binary as I did forget all about that for mcrypt. The secret compression uses mcrypt if installed and then falls through to the HCEMD5 module in PEAR. So Secret returns a binary format. I will be changing the mycrypt and Secret encryptions to use base64_encoding so binary fields are nolonger needed. That will also allow me to remove the binary checks for mysql and make the sessions even faster. To be honest they need to make these changes to the mcrypt and secret encryption modules in ADOdb as well as binary data should never be stored in tables.

Once those changes are made there will be absolutely no reason to need clob support except for Oracle which doesn't have a variable length TEXT datatype. :) Makes things alot easier to maintain. Actually if you want to use a CLOB field with the updated session code while using Postgresql you can. It will not matter to the session handler it will store the data properly to the CLOB field since it isn't binary data and escapable characters are not used you do not need to do any preprocessing of the data. :)
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Re: Adodb-lite

Post by Roja »

AKA Panama Jack wrote:Actually, if you use the ADOdb Session code the way it comes and how they indicate it SHOULD be setup it will not create an a bytea field for the data but it will create a TEXT field. You basically have to force it which breaks the ADOdb meta fields.
I can't use the session code the way it comes, and John Lim even specifically tells you you shouldnt - on particular database types.
John Lim wrote: You may want to rename the 'data' field to 'session_data' as
'data' appears to be a reserved word for one or more of the following:
ANSI SQL
IBM DB2
MS SQL Server
Postgres
SAP

If you do, then execute:

ADODB_Session::dataFieldName('session_data');
So I do just that and then manually create the table. If I don't manually create it, it uses the "data" table name, which causes an error.

Or put another way, I have to force it, per the documentation, or else it breaks the table creation on postgresql.

But thats just an arguing point, which doesn't deal with the main issue of binary v. text..
AKA Panama Jack wrote:The session_schema.xml from ADOdb uses this for the data field.

The XL data type. Now check the ActualType function for the metafields in the postgres datadictionary. It's text. :) Sure you can modify it to anything you want but with postgres there isn't any need to use clobs as the varchar and text datatypes work exactly as they do with a database like mysql. And if you look at the code for ADOdb sessions the only real clob support is for Oracle.
On the contrary, the "B" type works perfectly well in adodb on postgresql. Perhaps you should try it out before implying that it isn't "really supported". I did the same for your product. ;)
AKA Panama Jack wrote:Now there really isn't any reason to use a binary field for sessions unless you are using mcrypt or secret. MD5 uses the base64_encode function to convert the binary to text. I bet you are using MD5 since not every server will have mcrypt installed but MD5 can be used about anywhere. So the data isn't stored binary.
You left out all of the compression options: gzip and bzip2. Both store the data in binary format. The idea around using base64_encoding is interesting though. I'll mention it to John for his crypt and compression functions, but it doesn't remove the need for binary data support. (It just removes the need from the session table)
AKA Panama Jack wrote:To be honest they need to make these changes to the mcrypt and secret encryption modules in ADOdb as well as binary data should never be stored in tables.
Thats another personal choice v. functionality choice. You feel it should "never" be stored in tables.

The reality is that there are often situations where people using databases will in fact need to store binary data. Whether its storing pictures, gzipped templates, compressed session data, or something else, there are plenty of use cases where it can be useful.

All of which ignores the key point: I already use it in my game, and for a good reason, and as I said originally, adodb-lite doesn't support it, while adodb does.

Its not some abstract "Gee, I wish it did wiz-bang thing" comment. Its a real-world, "I use this and adodb-lite doesn't support it" finding. Whether you agree or disagree with the purpose doesn't change the lack of support.

Not to mention, binary data in sql tables has been a feature in the SQL standard for over a decade. Clearly, it has some uses. Besides, the binary field was one out of three major features that adodb-lite didn't support that adodb does.

When perfmon, binary support, and caching hit adodb-lite, then I'll take another look. Until then, its just.. "lite".
User avatar
AKA Panama Jack
Forum Regular
Posts: 878
Joined: Mon Nov 14, 2005 4:21 pm

Re: Adodb-lite

Post by AKA Panama Jack »

Roja wrote:
AKA Panama Jack wrote:Actually, if you use the ADOdb Session code the way it comes and how they indicate it SHOULD be setup it will not create an a bytea field for the data but it will create a TEXT field. You basically have to force it which breaks the ADOdb meta fields.
I can't use the session code the way it comes, and John Lim even specifically tells you you shouldnt - on particular database types.
John Lim wrote: You may want to rename the 'data' field to 'session_data' as
'data' appears to be a reserved word for one or more of the following:
ANSI SQL
IBM DB2
MS SQL Server
Postgres
SAP

If you do, then execute:

ADODB_Session::dataFieldName('session_data');
So I do just that and then manually create the table. If I don't manually create it, it uses the "data" table name, which causes an error.

Or put another way, I have to force it, per the documentation, or else it breaks the table creation on postgresql.
What he is warning about has absolutely nothing to do with binary data or a clob being used by postgreql. The warning deals with the fact that some databases have the word data classified as a reserved word. I am surprised that he would leave the field name set to data since it is a reserved word with so many database packages. The session library I included with ADOdb Lite uses session_data instead of the reserved word data as the default name.
Roja wrote:
AKA Panama Jack wrote:The session_schema.xml from ADOdb uses this for the data field.

The XL data type. Now check the ActualType function for the metafields in the postgres datadictionary. It's text. :) Sure you can modify it to anything you want but with postgres there isn't any need to use clobs as the varchar and text datatypes work exactly as they do with a database like mysql. And if you look at the code for ADOdb sessions the only real clob support is for Oracle.
On the contrary, the "B" type works perfectly well in adodb on postgresql. Perhaps you should try it out before implying that it isn't "really supported". I did the same for your product. ;)
Yes, it does work but it is one of the most INEFFICIENT ways of storing binary data in a table field. I never said it doesn't work but that the default settings for postgreql were a text field and not clob.

If you are using the bytea field type in postgresql then the data being stored can be 4-5 times LARGER than the binary data. This is because over half of the byte values (0 to 31 and 127 to 255) have to be escape quoted to represent \\xxx before the data is sent to the database. You can convert the binary data to hex using bin2hex and only increase the size of the data being stored by 2 times. This is atleast half the size of the data being stored using the bytea field type. If you use the base64_encode function in PHP the data string being stored is even smaller than using bin2hex.

Also, the BLOB functions in ADOdb do not convert binary data to bytea format properly if you are using a version of PHP less than 4.2.0. If you have selected CLOB in postgres it only executes qstr on the clob data you want to store. This will not encode binary data properly to store in the bytea field. Only the BLOB functions in ADOdb will properly convert the data if you are using PHP 4.2.0 or higher. Properly encoded bytea binary data will almost always be from 2-4 times larger than the same binary data encoded with base64_encode or bin2hex and stored in a text field.

Also, if you are using CLOB with postgresql in ADOdb's session handler the binary data will be stored in the table improperly encoded.

The following line processes the CLOB data...

Code: Select all

$this->Execute("UPDATE $table SET $column=" . $this->qstr($val) . " WHERE $where");
BTW, you need to remember this CLOB fields are for alpha-numeric data only. If you want to store binary data you have to use BLOB fields. You are not supposed to use CLOBs for binary data.

This is why it calls the qstr function to encode the CLOB data. This encodes text or numeric data properly but BINARY data (0 to 31 and 127 to 255) are not escape encoded as they are required to be for storage in a bytea field. If you are using MD5 encoding then the data is stored properly but if you use compression or mcrypt encoding it will fail. I bet you have been using uncompressed MD5 encoded sessions so far. MD5 and unencoded session data is the only data that will be properly stored in a bytea field when you have selected CLOB for the session handler.

There are alot of problems with ADOdb's Session Handler when it comes to binary data. If you really examine the code the only databases that really support binary data storage by the session handler is Mysql.

With ADOdb Lite I have corrected these problems for the next release. All of the supported databases will work with any binary data without having to use binary flags or fields.

Even Oracle will NOT store binary data properly in CLOBs. CLOBs are for CHARACTER objects. If you wish to store binary data in Oracle you must use BLOBs for binary objects.
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Re: Adodb-lite

Post by Roja »

AKA Panama Jack wrote:What he is warning about has absolutely nothing to do with binary data or a clob being used by postgreql.
I was answering your comment that I should be using the default adodb table. I cannot, as he uses a reserved word.

So, I use a xmlschema. The xmlschema I use has the session data field as a binary - because I use compression, which stores as binary data.
AKA Panama Jack wrote:Yes, it does work but it is one of the most INEFFICIENT ways of storing binary data in a table field.
Not an issue. Its compressed with bzip, and in many cases is over 10-20 times smaller than the original data. So even with the increase in size from binary storage, its still a net win.
AKA Panama Jack wrote:Also, the BLOB functions in ADOdb do not convert binary data to bytea format properly if you are using a version of PHP less than 4.2.0.
Not an issue. We already bumped our requirements above that due to security issues, and several functions we use.
AKA Panama Jack wrote:If you are using MD5 encoding then the data is stored properly but if you use compression or mcrypt encoding it will fail.
My testing shows bzip2 and gzip encoding working fine on it. Not sure what you mean there. I've thrown an enourmous amount of data of various types through the session handler, and it seems to store them all fine.

Do you have a testcase you can share that would show this failure?
AKA Panama Jack wrote:There are alot of problems with ADOdb's Session Handler when it comes to binary data. If you really examine the code the only databases that really support binary data storage by the session handler is Mysql.
My experience shows otherwise. The postgresql support is working great for me.
AKA Panama Jack wrote:With ADOdb Lite I have corrected these problems for the next release. All of the supported databases will work with any binary data without having to use binary flags or fields.
I suggested your idea of using base64_encoding at the adodb forums. John seemed interested, and asked for a patch to implement the same. I'm working on one for him now. Glad to see you are following up on it in adodb-lite.

That still doesn't remove the need for binary support in general, but for my needs, it removes a blocker.

That just leaves perfmon and caching for feature parity (at least of the features *I* need).
User avatar
AKA Panama Jack
Forum Regular
Posts: 878
Joined: Mon Nov 14, 2005 4:21 pm

Post by AKA Panama Jack »

I will bet you dollars to donuts that if you are using any of the REAL binary options (mcrypt, bzip2, gzip or secret) for ADOdb's Session support under postgres with CLOBs enabled that your session data is getting corrupted and you just haven't realised it yet. :)

If you actually examine the code for CLOB support you will find that it will NOT store properly escape encoded binary data in a bytea field.

Code: Select all

// from adodb-session.php

// $clob = "clob" if clob support enabled.

$rs2 =& $conn->UpdateBlob($table, $data, $val, " sesskey=$qkey", strtoupper($clob));
The above line from the session handler calls the UpdateBlob function from the portgres64 driver.

Code: Select all

// from adodb-postgres64.inc.php (all other postgres drivers extend this class)  The session handler will set $blobtype to CLOB if clob has been enabled.

// assumes bytea for blob, and varchar for clob
function UpdateBlob($table,$column,$val,$where,$blobtype='BLOB')
{

	if ($blobtype == 'CLOB') {
		return $this->Execute("UPDATE $table SET $column=" . $this->qstr($val) . " WHERE $where");
	}
	// do not use bind params which uses qstr(), as blobencode() already quotes data
	return $this->Execute("UPDATE $table SET $column='".$this->BlobEncode($val)."'::bytea WHERE $where");
}
The code above including the second comment is directly from the ADOdb Postgresql driver. As you can see CLOB's under Postgres in ADOdb are assumed to be using VARCHAR fields and not bytea files so they only escape encode the data as an alpha-numeric string and not a binary string. The binary data in these ranges (0 to 31 and 127 to 255), which are directly from the postgresql manual, will not be binary escape encoded and not stored properly.

Trust me, you can have ANY PHP programmer who knows the postgres database look at the code and they will tell you exactly what I have said.
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Post by Roja »

AKA Panama Jack wrote:I will bet you dollars to donuts that if you are using any of the REAL binary options (mcrypt, bzip2, gzip or secret) for ADOdb's Session support under postgres with CLOBs enabled that your session data is getting corrupted and you just haven't realised it yet. :)
As I said, I'm doing so, and I'm sending a ton of data through it, and I'm not seeing any corruption.

If you have a test case, I'm happy to try it out.
AKA Panama Jack wrote: Trust me, you can have ANY PHP programmer who knows the postgres database look at the code and they will tell you exactly what I have said.
I'm a PHP programmer, I know postgresql, and my results are the opposite.

I trust test results. Show me a testcase, and I'll be happy to test your opinion. Until then, my real-world experience is showing the opposite so far.
User avatar
AKA Panama Jack
Forum Regular
Posts: 878
Joined: Mon Nov 14, 2005 4:21 pm

Post by AKA Panama Jack »

Actually I have tested it and it fails. It is pretty easy to test by echoing what is being sent encoded to the database and what the decoded results are when it is being retrieved. In the adodb-session.php (version 4.67) after line 603 ($v = rawurldecode($v);) in the read function insert...

echo $v;

or

$conn->outp($v);

and place BEFORE line 657 ($val = rawurlencode($val);)

echo $v;

or

$conn->outp($v);

The displays of both should always be identical. On my server using CLOB with bytea fields the data is not the same when using mycrypt/gzip or any other binary method under ADOdb. Now if I set it to a BLOB using...

ADODB_Session::clob('BLOB');

Then the data is escape coded properly and the outputs match. But if you use ADODB_Session::clob('CLOB'); the outputs never match.

Actually you could enter anything other than 'CLOB' or '' in the ADODB_Session::clob(); function and it will use BLOB storage. ADODB_Session::clob('JustJunk'); will also make the session handler use BLOB storage.

So maybe you haven't entered...

ADODB_Session::clob('CLOB');

but something else and it is using BLOB storage and you just never realised it. On my server using a bytea field with ADODB_Session::clob('CLOB'); the data being pulled from the session table doesn't match the data that was stored.
Post Reply