Refactoring Inheritance to Composition

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

georgeoc
Forum Contributor
Posts: 166
Joined: Wed Aug 09, 2006 4:21 pm
Location: London, UK

Refactoring Inheritance to Composition

Post by georgeoc »

Hi all,

Before my copy of Martin Fowler's Refactoring: Improving the Design of Existing Code arrives next week, I would love to get some advice for the application I'm working on.

I have been getting more and more despondent with the development of my large-scale application, as I realise that my approach to OOP leaves quite a lot to be desired. It is my first real experience in both OOP and TDD, and although I feel I'm doing quite well (I have a smooth-running alpha release at least), I am now starting to see many areas which need improvement.

The major thing I've realised is that I've gone overboard with Inheritance, and made very little use of the power of Composition. I have read many similar stories, and don't particularly want to debate the relative merits in this thread. I can identify a number of cases where Composition seems more logical. I'm sure the Fowler book will help me get started, but I'd like some advice on how to choose between the two options, and how to rescue my application design (whose inheritance tree is extensive and highly coupled at the moment).

For your information, I'm currently developing for PHP 4 & 5 in one package, although I'm considering dropping PHP 4 support which will make my refactoring much easier and logical for all the OOP improvements of PHP 5.

First, some symptoms I've noticed:
  • I'm finding classes are becomming harder and harder to write tests for
  • Important functionality often happens in private methods of parent objects, so is not testable (I currently try to suggest pseudo-private methods and properties in PHP 4 using an underscore prefix)
  • Child objects are becomming bloated with unnecessary properties and functionality - even a simple Message object with subject, body and author field is very large because of all the unnecessary features it inherits.
  • Although I hate the idea, I've come close to having a super-object to make ORM easier - every object that may need to be saved in the database descends from a class designed to be manipulated into a table row. This was an early mistake, and I plan to solve it with a simple Mapper which analyses an object and creates a database query.
Although I've identified general areas where I can do with less Inheritance, I need some help with how to start the process.

Let me give an example. The application is a modular message centre. We use a proprietary message format internally (known as a Generic Message, or GM); the plugin modules are responsible for converting messages to and from that format. For example, it's possible to set up a chain between a POP3 import and a phpBB 3 export. The chain instructs the POP3 channel to import messages (from a mailbox) and convert them to GM format, then passes the GM objects to the phpBB channel which converts them into phpBB post format and exports them to a phpBB board.

The POP3 channel has this class name: m2f_channel_import_fetch_email_pop. This follows my current naming convention, and tells us that it inherits from:
  • m2f_channel_import_fetch_email (a generic class to deal with imported email from any source - POP, mbox, IMAP, stdin),
  • m2f_channel_import_fetch (an import channel which fetches messages on command, rather than listening for trigger messages),
  • m2f_channel_import (a version of the channel class designed to import messages)
  • m2f_channel (base channel class)
But, horror of horrors, the m2f_channel class also descends from:
  • m2f_element (an object designed to be part of a chain - other possibilities include filters and routers to transform messages)
  • ... which descends from m2f_module (an object designed to be modular - drop files into the modules directory and they can be used instantly)
  • ... which descends from m2f_object (the aforementioned object designed to be manipulated for entry into a database)
Now you see why I'm keen to improve on the structure of the code?!!!

So if we take the example of the m2f_channel_import_fetch_email_pop class. It does two distinct things - it must log in to a POP3 mailbox and retrieve a number of messages (fairly obvious). But then it needs to convert those messages into GM format. The POP3 functionality obviously belongs only in that class, but converting emails to GM needs to be at a lower level. Currently, when a POP object has retrieved some emails, it runs:

Code: Select all

$this->_transform($messages)
which runs the private method of the parent class, m2f_channel_import_fetch_email. But this makes the _transform() method difficult to test. Much better would be to have a separate object that is responsible for converting emails into GM, and have that object instantiated by the email channel, or passed in as a constructor parameter.

Having decided that, however, I'm now not sure how much inheritance I need at all. Do channels need to be arranged in a tree of functionality like they are currently (another example is m2f_channel_export_forum_phpbb3), or instead would it be better to have a simple 'channel' object with functionality like 'POP3' passed in to it via composition?

Sorry for the very long-winded post. It may make things easier if you scan over the files in my SVN repository: http://m2f.svn.sourceforge.net/viewvc/m2f/m2f/trunk/ - have a look at the /modules/channel directory to see what I've been talking about above.

I'm basically looking for general advice on how to spot unnecessary and wasteful Inheritance, and begin the job of using Composition to achieve the same ends. And more specific advice on whether I need a family-tree of classes for my channel modules, or whether a different approach may be better.

Thank you for any advice you can give!

(P.S. - I'm patiently waiting for the Search to be reindexed on the new phpBB 3 forum so I can do some more of my own research!)
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Refactoring Inheritance to Composition

Post by Christopher »

That is a lot of information an a broad question. Two questions come to mind:

1. In your deep hierarchy of class inheritance, which classes actually are separate for true DRY reasons and which are separate for only conceptual reasons?

2. What are the dependencies of this system? That is probably the most important thing and you left out that bit.
(#10850)
User avatar
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada
Contact:

Re: Refactoring Inheritance to Composition

Post by Kieran Huggins »

Wow - that was quite the post!

Ok - so here's what I think: You're inheriting waaaaay too much.

I would create a base message class with all the common stuff and database storage-ness, then extend it per message type. Parsing and generating that format would be class methods. Things like POP3 mail would be either an object variable or a separate class type, depending on how much the code changes.
georgeoc
Forum Contributor
Posts: 166
Joined: Wed Aug 09, 2006 4:21 pm
Location: London, UK

Re: Refactoring Inheritance to Composition

Post by georgeoc »

Thanks for your reply.
arborint wrote:1. In your deep hierarchy of class inheritance, which classes actually are separate for true DRY reasons and which are separate for only conceptual reasons?
That's made me think, and I'm finding it hard to answer. However, I think your implication is correct - some classes are separated primarily so I get a nice and tidy filesystem, easily understood class names, etc. Some parent classes give very little added functionality to their children. And the entire functionality of some of the classes could probably be performed by other composite objects.
arborint wrote:2. What are the dependencies of this system? That is probably the most important thing and you left out that bit.
Because I'm lacking in experience (in discussing my code, rather than just in my coding itself), I find it very hard to take a step back and make observations like this. Could you explain what you mean by dependencies in this case?
georgeoc
Forum Contributor
Posts: 166
Joined: Wed Aug 09, 2006 4:21 pm
Location: London, UK

Re: Refactoring Inheritance to Composition

Post by georgeoc »

Kieran Huggins wrote:I would create a base message class with all the common stuff and database storage-ness, then extend it per message type. Parsing and generating that format would be class methods. Things like POP3 mail would be either an object variable or a separate class type, depending on how much the code changes.
Thanks for the post.

I need to provide a bit more context to answer your suggestion. We only have one message type - the Generic Message. Each message is an object. Import channels produce GM objects, export channels use GM objects to make their own messages and send them.

Another point is that the code is designed to be very modular. At the moment I have a few channels - phpBB 2 and phpBB 3 import and export, POP and mbox import, SMTP and Sendmail export. In the future we expect many more channels to cater for different forum types (vBulletin, SMF, etc), RSS, stdin, NNTP - you get the idea. Each channel will be based on a public API, and may use external dependencies like PEAR (as I do already with the POP3 class) or Swiftmailer (for SMTP and Sendmail).

It is the channels that are stored in the database (POP login details, etc) so that they can be run again in the future. The messages don't need to be stored yet (they will in the future for things like moderation).

Having said all that, can you elaborate your suggestion at all? It's just that I don't think the Message class needs extending at all as far as I can see. Correct me if I'm wrong.
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Refactoring Inheritance to Composition

Post by Christopher »

georgeoc wrote: That's made me think, and I'm finding it hard to answer. However, I think your implication is correct - some classes are separated primarily so I get a nice and tidy filesystem, easily understood class names, etc. Some parent classes give very little added functionality to their children. And the entire functionality of some of the classes could probably be performed by other composite objects.

Because I'm lacking in experience (in discussing my code, rather than just in my coding itself), I find it very hard to take a step back and make observations like this. Could you explain what you mean by dependencies in this case?
The actual dependencies with probably clear up most of the confusion for you. A dependency simply means that one class needs another class to work. Perhaps you could list for us all the classes you show above, and below each list every class method defined outside the class that it uses internally. That should give us a feel for what is going on.
(#10850)
User avatar
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada
Contact:

Re: Refactoring Inheritance to Composition

Post by Kieran Huggins »

ok, so extend the GM class for each "translation type" and use that to import / export:

Code: Select all

class GenMessage{
   var $messageInfo; // an array of message info
 
   public static function new($message_obj=null){
      // ... read in the message info of $message_obj 
      return self;
   }
}
 
class PHPBB3_message extends GenMessage{
   public static function read($url){
      // ... parse a PHPBB3 post
      // store in the format defined by GenMessage
      return self;
   }
 
   public function write($url){
      // read the format defined by GenMessage
      // ... write a PHPBB3 post
      return self;
   }   
}
 
// then you can do something like:
 
$in_uri = 'http://forums.devnetwork.net/viewtopic.php?p=436527';
$out_uri = 'http://forums.devnetwork.net/viewtopic.php?f=30&t=77491';
 
PHPBB3_message::new(PHPBB3_message::read($in_uri))->write($out_uri);
Of course, you could theoretically use any class to read or write, as long as it inherited from GenMessage. I was just to lazy to make another example class.

Does that make any sense?
georgeoc
Forum Contributor
Posts: 166
Joined: Wed Aug 09, 2006 4:21 pm
Location: London, UK

Re: Refactoring Inheritance to Composition

Post by georgeoc »

arborint wrote:The actual dependencies with probably clear up most of the confusion for you. A dependency simply means that one class needs another class to work. Perhaps you could list for us all the classes you show above, and below each list every class method defined outside the class that it uses internally. That should give us a feel for what is going on.
OK, here goes.

The channel classes have an import() and/or an export() public method. import() is expected to return an array of Generic Message objects (could be empty), and export() accepts a similar array as its only parameter.

In terms of dependencies, there aren't many. Simple classes have none. I wrote two channels for testing purposes only - m2f_channel_import_fetch_randomtextgen and m2f_channel_export_filewriter - and neither need external objects to work with. However, they use some features of parent classes, such as getters and setters, and functionality in the m2f_channel class to create or detect a message ID.

More complex classes create their own helper objects to do the bulk of the work. So m2f_channel_export_forum_phpbb2 creates a specilist helper class and passes a GM to it - this class gets the necessary info from the message (subject, body, author, etc) and does lots of convoluted work to load the phpBB files into memory and perform a phpBB post submission. This, I guess, is good use of Composition. However in terms of inheritance, the phpBB export class makes no more use of its parent classes than the simpler channels.

My copy of 'Refactoring' [Fowler] arrived today, and I have spent most of the day reading it. What an amazing book. For someone such as myself who has no experience of Computer Science outside the world of HTML, CSS, Javascript and PHP, it's amazing that such a book can be both so informative and readable!

My gut instinct at the moment is that I need to loose a lot of the inheritance in my system, but not all of it. The channels are by nature related and share more and more code as you go down to the tree root. If I start with an abstract m2f_channel class and build up a similar hierarchy to the one I currently have, it makes sense doesn't it?

Is there a better way to organise these partly-related channels:

Code: Select all

phpBB 2
phpBB 3
POP
SMTP
Sendmail
File Writer
Random Text Generator
than this:

Code: Select all

 
channel
-> File Writer
-> Random Text Generator
-> forum
   -> phpBB
      -> phpBB 2
      -> phpBB 3
-> email
   -> POP
   -> SMTP
   -> Sendmail
 
or perhaps this:

Code: Select all

 
channel
-> import
   -> Random Text Generator
   -> email
      -> POP
      -> IMAP
   -> forum
      -> phpBB
         -> phpBB 2
         -> phpBB 3
-> export
   -> File Writer
   -> email
      -> SMTP
      -> Sendmail
   -> forum
      -> phpBB
         -> phpBB 2
         -> phpBB 3
 
?

One thing's for certain - I need to refactor out everything above the m2f_channel parent in the current tree. Rather than have every channel object descend from a helper class that perform database persistence operations, I just need to pass channels to a stand-alone mapper object which can do the necessary analysis and interface with the db layer.
georgeoc
Forum Contributor
Posts: 166
Joined: Wed Aug 09, 2006 4:21 pm
Location: London, UK

Re: Refactoring Inheritance to Composition

Post by georgeoc »

Kieran Huggins wrote:ok, so extend the GM class for each "translation type" and use that to import / export:

<snip>

Of course, you could theoretically use any class to read or write, as long as it inherited from GenMessage. I was just to lazy to make another example class.
I understand the example, but can't get my head round whether it's useful to me or not. I've always thought on a fundamental level that an import channel will produce an array of GMs, and an export channel will accept an array as a param. So, POP emails are converted into messages, and messages are converted into phpBB posts.

On a higher level, I have a class called m2f_chain. Chains have an array of 'element' objects (channels, filters and/or routers). When the chain is run (by a cron job or trigger such as a phpBB post being made), each of its elements are run in turn. Import channels at the beginning of the list will deliver the GMs they create to the chain, which passes its store of messages down through each element. So filters later in the chain will modify each GM they receive based on defined rules, routers will set other chains in motion, and export channels towards the end will take the GMs and send them out into the big wide world.

So to look at your example code, I can't understand how a class GenMessage can be expected to create or export messages itself. I always imagined a GM as basically a simple Data Object (perhaps with a containing GenMessageCollection for iteration purposes, etc), which was generated by a Channel and passed around the system by a Chain.
georgeoc
Forum Contributor
Posts: 166
Joined: Wed Aug 09, 2006 4:21 pm
Location: London, UK

Re: Refactoring Inheritance to Composition

Post by georgeoc »

Can I just point out that I find it very hard to be concise when I talk about code. You may have noticed. It's because I have only ever worked alone in my PHP development. Oh, how I long for a like-minded colleague to work with on this project - the concept of Pair Programming seems like a dream to me!

I have so much respect for the members of this forum. Although I'm pretty quiet, I visit devnet every single day that I use my computer (that is, when my work as a freelance violinist dries up - January is a particularly bad month!) and have learnt a huge amount from the careful, considerate and intelligent contributions you guys make.

I guess I can help improve my understanding and descriptions of code design if I contribute more to discussions here. I will try harder in future!
alex.barylski
DevNet Evangelist
Posts: 6267
Joined: Tue Dec 21, 2004 5:00 pm
Location: Winnipeg

Re: Refactoring Inheritance to Composition

Post by alex.barylski »

georgeoc wrote:the concept of Pair Programming seems like a dream to me!
Thats about the only practice in XP that I don't really see being effective...but obviously it works for some people. Perhaps during the first few months of bringing someone onto a project it would be a good way to introduce the developer to the source code, architecture, etc...

While I agree two heads are better than one...I also see it as a waste of time. Bugs slip by even when millions of people look at source code, so I"m sure that two developers would not be more effective in catching errors. Unless you had one lesser experience developer paired with a experienced developer who assisted in architectural decisions. In which case, this brings me back to my original thought, that pair programming would be handy in training scenarios. Having two equally talented developers solve one problem though, seems like a good waste of time as they get carried in theoretical disscussion and accomplish nothing. :P

Personally, I think I'm smart enough to know when something doesn't feel right...at which point I sit back and think about the problem and usually ask around on several forums I frequent...this I find most productive as I then get a non-biased third party opinion that isn't tainted by internal forces like budget, etc...
I guess I can help improve my understanding and descriptions of code design if I contribute more to discussions here. I will try harder in future!
A forum is pretty handy that way...although not as 'immediate' as I would like sometimes...it is an excellent way to learn new methods, practices, etc...
User avatar
Weirdan
Moderator
Posts: 5978
Joined: Mon Nov 03, 2003 6:13 pm
Location: Odessa, Ukraine

Re: Refactoring Inheritance to Composition

Post by Weirdan »

Hockey wrote:Having two equally talented developers solve one problem though, seems like a good waste of time as they get carried in theoretical disscussion and accomplish nothing. :P
To me pair it seems programming helps to battle laziness and temptation to do something unrelated to work (or semi-related, like reading email / forums). Theoretically it should help to avoid mind blocks as well.
georgeoc
Forum Contributor
Posts: 166
Joined: Wed Aug 09, 2006 4:21 pm
Location: London, UK

Re: Refactoring Inheritance to Composition

Post by georgeoc »

The thread's gone a little off topic, but are there any more comments? Kieran or arborint?
User avatar
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Re: Refactoring Inheritance to Composition

Post by Christopher »

It seems like the importers should inherit the interface and common methods from an Import class, and the exporters should inherit the interface and common methods from an Export class. The converter then has a dependency on one of each to do it's job. For the common things is sounds pretty easy. It is how to express the unique capabilities of an importer/exporter that will be challenging.
(#10850)
User avatar
Kieran Huggins
DevNet Master
Posts: 3635
Joined: Wed Dec 06, 2006 4:14 pm
Location: Toronto, Canada
Contact:

Re: Refactoring Inheritance to Composition

Post by Kieran Huggins »

I think you're thinking about the procedure too much. In OOP it's all about objects, their properties & their methods.

It sounds to me like you just have a message object that imports and exports. Each flavour has a slightly different procedure for these methods, but they share a common property set. Is there more to it than that? I find it helps to refactor it down to the basics.
Post Reply