Page 1 of 4

EAV attribtutes

Posted: Wed Sep 17, 2008 3:31 pm
by josh
stemming from a discussion on EAV in the general forum, would an EAV attribute table be bad practice in applications requiring custom fields on different models? I guess the ideal approach would be to create an actual table in a separate database and physically add the fields to it, for better performance? What are the pros and cons?

Re: EAV attribtutes

Posted: Wed Sep 17, 2008 10:56 pm
by allspiritseve
In my personal opinion, something like an EAV doesn't have as many drawbacks if its only used for a specific purpose and not for every single piece of code in the database. I'm thinking about something like facebook, if they offered custom data for user info, it wouldn't really be feasible to add a new column every time someone wanted a custom field. Either you'd end up with an inordinate amount of columns, or an inordinate amount of tables. In that situation, custom data would tend to be only one of a few types (a text field, and a longtext field, maybe) so you wouldn't have to manage a variety of different types of data (which the RDBMS does better than you can anyways).

Re: EAV attribtutes

Posted: Wed Sep 17, 2008 11:38 pm
by Christopher
I agree. It seem like EAV excels where you have lots of potential columns of data, but in reality you only tend to have a few actual columns at any time. Custom fields is a great example of this. Also tag based system where users define the tags.

Re: EAV attribtutes

Posted: Thu Sep 18, 2008 4:38 am
by josh
Does this change if, unlike facebook where the ratio of fields changing to reading info is very low, it were to be used in a system were that ratio is really high..

Webmasters could update their own site, maybe once a week.. and the website may query that info millions of times, and in complex ways before it changes again

http://mu.wordpress.org/

WordPress says they had problems with using a "god" table, that's why I'm worried.

Re: EAV attribtutes

Posted: Thu Sep 18, 2008 10:22 am
by Christopher
I don't think it is the number of reads, but the variability of the potential columns that makes people want to go to something like EAV.

For your example, this is where people often denormalize the database. If you have many reads, but infrequent writes, then it makes sense to do a lot more work when you do the write to get the data in a form that is optimized for read performance.

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 8:14 am
by josh
Hmm, maybe theres a need for some kind of EAV wrapper that physically runs create and alter SQL, in traditional systems where users can define custom storage structures theres often shortcomings like performance hits on searching the data in advanced ways, and the need for an admin to use a "recount" record button. It would be nice if there was something out there that let you run arbitrary SQL on an arbitrary relational structure ( like ORM that worked on relational user defined fields )

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 1:20 pm
by Christopher
jshpro2 wrote:Hmm, maybe theres a need for some kind of EAV wrapper that physically runs create and alter SQL, in traditional systems where users can define custom storage structures theres often shortcomings like performance hits on searching the data in advanced ways, and the need for an admin to use a "recount" record button. It would be nice if there was something out there that let you run arbitrary SQL on an arbitrary relational structure ( like ORM that worked on relational user defined fields )
I would be interested in building something like that. I have thought of implementing a simple EAV solution in Skeleton (I also thought a simple modified preorder tree traversal solution would be nice too). It would be stand-alone, so you could use with any DB library you wanted -- mainly a query builder and something that understand the metadata. It could be as simple as serialize/unserialize (powerofq would add eval() :) perhaps he can give us some tips on the design). There are probably more administrative code than runtime. It could almost generate a class from provided metadata. If you are interested let me know.

PS - for the modified preorder tree traversal code it would be a similar situation -- the read is simple, the write/admin code is more complex. I know you posited some code, but it does not allow you to tell the class: This is my table name; These are my column names, etc.

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 1:33 pm
by josh
lol at the powerofq comment and yeah I wrote a whole a tree wrapper library and then canned it the next day lol.. I was gonna start adding meta data features but I realized I'd hit other walls

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 1:46 pm
by Christopher
jshpro2 wrote:lol at the powerofq comment and yeah I wrote a whole a tree wrapper library and then canned it the next day lol.. I was gonna start adding meta data features but I realized I'd hit other walls
Well, maybe powerofq won't help! ;) What were the walls you hit on the tree code? And what do you think about a simple, generic EAV library?

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 1:51 pm
by josh
download joomla + virtue mart and look in administrator/components/com_virtuemart/classes/ps_product_type_parameter.php

They already have a good start for us, I don't remember specifically the issue with the tree script but it was some administrative function I had in my requirements.. maybe it was moving a node and all of its sub nodes to a new parent node or something like that? Maybe there wasn't a limitation and maybe just a misunderstanding on my part of what was possible.. I'm sure anythings possible but if its insanely complex then there has to be a better way

Code: Select all

 
/**
     * updates Parameter information
     * @author Zdenek Dvorak
     * @param array $d
     * @return boolean
     */
    function update_parameter( &$d ) {
        $db = new ps_DB( ) ;
        
        if( $this->validate_update_parameter( $d ) ) {
            if( $d["parameter_old_type"] == "B" ) {
                // delete record and call add_parameter()
                $q = "DELETE FROM #__{vm}_product_type_parameter WHERE product_type_id='" . $d["product_type_id"] . "' " ;
                $q .= "AND parameter_name='" . $db->getEscaped(vmGet($d,'parameter_name')) . "'" ;
                $db->setQuery( $q ) ;
                $db->query() ;
                return $this->add_parameter( $d ) ;
            }
            // added for custom parameter modification
            // strips the trailing semi-colon from an values
            if( ';' == substr( $d["parameter_values"], strlen( $d["parameter_values"] ) - 1, 1 ) ) {
                $d["parameter_values"] = substr( $d["parameter_values"], 0, strlen( $d["parameter_values"] ) - 1 );
            }
            if( empty( $d["parameter_multiselect"] ) ) {
                $d["parameter_multiselect"] = "N" ;
            }
            // delete "\n" from field parameter_description
            $d["parameter_description"] = str_replace( "\r\n", "", $d["parameter_description"] );
            $d["parameter_description"] = str_replace( "\n", "", $d["parameter_description"] );
            
            $fields = array( 
                                        'parameter_name' => vmGet($d, 'parameter_name'),
                                        'parameter_label' => vmGet($d, 'parameter_label'),
                                        'parameter_description' => vmGet($d, 'parameter_description'),
                                        'parameter_list_order' => vmRequest::getInt('list_order'),
                                        'parameter_type' => vmGet($d, 'parameter_type'),
                                        'parameter_values' => vmGet($d, 'parameter_values'),
                                        'parameter_multiselect' => vmGet($d, 'parameter_multiselect'),
                                        'parameter_default' => vmGet($d, 'parameter_default'),
                                        'parameter_unit' => vmGet($d, 'parameter_unit')
                            );
            $db->buildQuery('UPDATE', '#__{vm}_product_type_parameter', $fields, "WHERE `product_type_id`='" . $d["product_type_id"] . "' AND `parameter_name`='" . $db->getEscaped(vmGet($d,'parameter_old_name')) . "'" );
            $db->query() ;
            
            /* Re-Order the Parameter table IF the list_order has been changed */
            if( intval( $d['list_order'] ) != intval( $d['currentpos'] ) ) {
                $dbu = new ps_DB( ) ;
                
                /* Moved UP in the list order */
                if( intval( $d['list_order'] ) < intval( $d['currentpos'] ) ) {
                    
                    $q = "SELECT product_type_id,parameter_name FROM #__{vm}_product_type_parameter WHERE " ;
                    $q .= "product_type_id=' " . $d["product_type_id"] ;
                    $q .= "' AND parameter_name <> '" . $db->getEscaped(vmGet($d,'parameter_name')) ;
                    $q .= "' AND parameter_list_order >= '" . intval( $d["list_order"] ) . "'" ;
                    $db->query( $q ) ;
                    
                    while( $db->next_record() ) {
                        $dbu->query( "UPDATE #__{vm}_product_type_parameter SET parameter_list_order=parameter_list_order+1 WHERE product_type_id='" . $db->f( "product_type_id" ) . "' AND parameter_name='" . $db->f( "parameter_name" ) . "'" ) ;
                    }
                } 
        /* Moved DOWN in the list order */
        else {
                    
                    $q = "SELECT product_type_id,parameter_name FROM #__{vm}_product_type_parameter WHERE " ;
                    $q .= "product_type_id='" . $d["product_type_id"] ;
                    $q .= "' AND parameter_name <> '" . $db->getEscaped(vmGet($d,'parameter_name')) ;
                    $q .= "' AND parameter_list_order > '" . intval( $d["currentpos"] ) ;
                    $q .= "' AND parameter_list_order <= '" . intval( $d["list_order"] ) . "'" ;
                    $db->query( $q ) ;
                    
                    while( $db->next_record() ) {
                        $dbu->query( "UPDATE #__{vm}_product_type_parameter SET parameter_list_order=parameter_list_order-1 WHERE product_type_id='" . $db->f( "product_type_id" ) . "' AND parameter_name='" . $db->f( "parameter_name" ) . "'" ) ;
                    }
                
                }
            } /* END Re-Ordering */
            
            if( $d["parameter_type"] != "B" ) { // != Break Line
                // Delete old index
                $q = "ALTER TABLE `#__{vm}_product_type_" ;
                $q .= $d["product_type_id"] . "` DROP INDEX `idx_product_type_" . $d["product_type_id"] . "_" ;
                $q .= $db->getEscaped(vmGet($d,'parameter_old_name')) . "`;" ;
                $db->setQuery( $q ) ;
                $db->query() ;
                
                // Update column in table product_type_<id>
                $q = "ALTER TABLE `#__{vm}_product_type_" ;
                $q .= $d["product_type_id"] . "` CHANGE `" ;
                $q .= $db->getEscaped(vmGet($d,'parameter_old_name')) . "` `" ;
                $q .= $db->getEscaped(vmGet($d,'parameter_name')) . "` " ;
                switch( $d["parameter_type"]) {
                    case "I" :
                        $q .= "int(11) " ;
                    break ; // Integer
                    case "T" :
                        $q .= "text " ;
                    break ; // Text
                    case "S" :
                        $q .= "varchar(255) " ;
                    break ; // Short Text
                    case "F" :
                        $q .= "float " ;
                    break ; // Float
                    case "C" :
                        $q .= "char(1) " ;
                    break ; // Char
                    case "D" :
                        $q .= "datetime " ;
                    break ; // Date & Time
                    case "A" :
                        $q .= "date " ;
                    break ; // Date
                    case "V" :
                        $q .= "varchar(255) " ;
                    break ; // Multiple Value
                    case "M" :
                        $q .= "time " ;
                    break ; // Time
                    default :
                        $q .= "varchar(255) " ; // Default type Short Text
                }
                if( $d["parameter_default"] != "" && $d["parameter_type"] != "T" ) {
                    $q .= "DEFAULT '" . $db->getEscaped(vmGet($d,'parameter_default')) . "' NOT NULL;" ;
                }
                $db->setQuery( $q ) ;
                $db->query() ;
                
                // Make index for this column
                if( $d["parameter_type"] == "T" ) {
                    $q = "ALTER TABLE `#__{vm}_product_type_" ;
                    $q .= $d["product_type_id"] . "` ADD FULLTEXT `idx_product_type_" . $d["product_type_id"] . "_" ;
                    $q .= $d["parameter_name"] . "` (`" . $db->getEscaped(vmGet($d,'parameter_name')) . "`);" ;
                    $db->setQuery( $q ) ;
                    $db->query() ;
                } else {
                    $q = "ALTER TABLE `#__{vm}_product_type_" ;
                    $q .= $d["product_type_id"] . "` ADD KEY `idx_product_type_" . $d["product_type_id"] . "_" ;
                    $q .= $db->getEscaped(vmGet($d,'parameter_name')) . "` (`" . $db->getEscaped(vmGet($d,'parameter_name')) . "`);" ;
                    $db->setQuery( $q ) ;
                    $db->query() ;
                }
            }
            return True ;
        } else {
            return False ;
        }
    }
    
    /**
     * Controller for Deleting Records.
     */
    function delete_parameter( &$d ) {
        
        if( ! $this->validate_delete_parameter( $d ) ) {
            return False ;
        }
        $record_id = $d["parameter_name"] ;
        
        if( is_array( $record_id ) ) {
            foreach( $record_id as $record ) {
                if( ! $this->delete_record( $record, $d ) )
                    return false ;
            }
            return true ;
        } else {
            return $this->delete_record( $record_id, $d ) ;
        }
    }
    /**
     * Should delete a Parameter form Product Type 
     * and drop column from table product_type_<id>
     */
    function delete_record( $record_id, &$d ) {
        $db = new ps_DB( ) ;
        
        /** Find parameter_type of deleted parameter */
        $q = "SELECT parameter_type FROM #__{vm}_product_type_parameter" ;
        $q2 = " WHERE product_type_id='" . $d["product_type_id"] . "' AND parameter_name='".$db->getEscaped($record_id)."'" ;
        $db->query( $q . $q2 ) ;
        if( $db->next_record() )
            $parameter_type = $db->f( "parameter_type" ) ; else
            $parameter_type = "B" ; // Error - dont delete (maybe nonexisted) column from #__{vm}_product_type_XX
        
 
        $q = "DELETE FROM #__{vm}_product_type_parameter" ;
        $db->setQuery( $q . $q2 ) ;
        $db->query() ;
        
        // Delete index - deleted automaticaly
        /*      $q  = "ALTER TABLE `#__{vm}_product_type_";
        $q .= $d["product_type_id"]."` DROP INDEX `idx_product_type_".$d["product_type_id"]."_";
        $q .= $d["parameter_name"]."`;";
        $db->setQuery($q);   $db->query();*/
        
        if( $parameter_type != "B" ) { // != Break Line
            // Delete column
            $q = "ALTER TABLE #__{vm}_product_type_" . $d["product_type_id"] . " DROP `".$db->getEscaped($record_id)."`" ;
            $db->setQuery( $q ) ;
            $db->query() ;
        }
        
        return True ;
    }
   

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 2:15 pm
by allspiritseve
jshpro2 wrote:some kind of EAV wrapper that physically runs create and alter SQL
I guess I'm confused... wouldn't an EAV take the place of create/alter SQL? Are you trying to use some interface to an EAV system to actually create/alter tables?

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 2:37 pm
by josh
Yeah, entities attributes and values would be passed to an interface, which would take care of the data storage, determining if the entity is defined, determining if its overwriting an existing attribute, etc..

the data storage layer underneath could store it as EAV in one monolithic table, but we would write an adapter to run create and alter commands... that way down the road if a better solution to the "custom fields" pattern came up, it can be swapped out just like you would with the data layer in a traditional ORM paradigm

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 3:26 pm
by allspiritseve
jshpro2 wrote:Yeah, entities attributes and values would be passed to an interface, which would take care of the data storage, determining if the entity is defined, determining if its overwriting an existing attribute, etc..

the data storage layer underneath could store it as EAV in one monolithic table, but we would write an adapter to run create and alter commands... that way down the road if a better solution to the "custom fields" pattern came up, it can be swapped out just like you would with the data layer in a traditional ORM paradigm
Interesting. That was sort of my plan for my EAV as well. I was going to create a table data gateway that would essentially have the same interface as ones that worked with physical tables, but took care of all the EAV inner workings so the client wouldn't know the difference.

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 3:29 pm
by josh
They'll notice the difference in the new ways they will be able to query their data, if they don't notice anything theres no reason to justify writing code. IMO.

Re: EAV attribtutes

Posted: Fri Sep 19, 2008 3:37 pm
by allspiritseve
jshpro2 wrote:They'll notice the difference in the new ways they will be able to query their data, if they don't notice anything theres no reason to justify writing code. IMO.
Well, keep in mind this was for a full-blown EAV system, not just for custom fields.