Need an OOP slap to the head...

Not for 'how-to' coding questions but PHP theory instead, this forum is here for those of us who wish to learn about design aspects of programming with PHP.

Moderator: General Moderators

Post Reply
User avatar
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Need an OOP slap to the head...

Post by nielsene »

Long time posters will probably remember that I was a rather vocal OOP advocate a few years back (when I was last active here). Well as my recent unit-testing post indicates, I haven't been doing a good job "walking the walk". I need some help. I have a huge system that I need to re-architect and continually extend. I'm te only active developer and the users expect user-visible improvements -- one of the pushes to rapid/rush code.

Let me describe the current structure of the code and the current new feature I need to add, perhaps someone can see a good way to approach the new code in a good OOP manner with a little "trickle down" into the existing code without requiring a complete rewrite.

My application has many seperate "areas" (I won't call them modules because they aren't that distinct). Each module tends to have two "god classes" -- one descended from my base DB class and one descended from my base UIDisplay class. The base DB class is my lightwieght DB abstraction layer. The UIDisplay class handles the page templating, and has many helper functions for assembling forms.

Subclasses of DB add in many prepackaged DB queries->associative array generators. One of the goals is to migrate all SQL out of any script/class not in the DB hierarchy. I've also been meaning to write a PhraseBook that would be an atribute of the DB classes, thus pulling the SQL even further out.

Subclasses of the UIDisplay often take the result of a DB-subclass call and format it into the appropriate manner for display, along with wrapping it in needed interface elements. Nearly all of UIDisplay's methods return strings -- they don't build up an internal page.

The one other main object hierarchy is "Storeable Objects." This is my ORM layer. The base "StoreableObject" provides methods for registering tables/column/attribute triplets. It knows how to join tables on designated values. It can handle 1:m, 1:1, etc. It provides post,retreive,search functionality. (retrieve operates on primary key, search on non-key data). Post includes dirty field checking, etc.

The subclasses of this such as "Person", "Organization", etc simply register their table/attribute lists in their constructor and often provide their get/setters. The "Person" class also has an experiement in adding some business logic. So far it seems to be working well.

Most user visible web pages are generated in the following manner:
script creates the appropraite subclass of DB and UIDisplay. Calls the needed DB classes to generate the data and then builds up the result page by appending the results of calls to the UIDisplay classes.

Now I need to add several "export" features to allow a selected set of people to generate textual reports from the database. Typically they will select what data and which format they want and the report gets mailed to their email. Same data sets would be "List of Registered Individuals, sorted by name", "List of Registered Individuals, sorted by affiliation", "List of events with registered individuals". Data formats would be "tab seperated text","XML","LaTeX".

I can tell I'm about to simply add a new DB-subclass that would produce the appropriate data, and then new-UIDisplay subclasses for the different formatting options. Part of me things that in this case its actually correcct and not poor design as it waas other places... Another part of me is definately worried about the proliferation these "single-use" associative arrays -- ie the outputs of a given DB-subclass function call is normally only consumable by a single UIDisplay subclass function call, and requires the knowledge of the chosen keys --- bad, bad, bad.

But wrapping it into a class, yields meremly a glorified struct, stil not OOP.

Please, help get me out of my rut.
McGruff
DevNet Master
Posts: 2893
Joined: Thu Jan 30, 2003 8:26 pm
Location: Glasgow, Scotland

Re: Need an OOP slap to the head...

Post by McGruff »

I think I'd probably want to move away from inheritance hierarchies. For example: http://www.phppatterns.com/index.php/ar ... ew/25/1/1/.

Aggregation is much more flexible. Inheritance hierarchies sometimes conceal a range of responsibilites which don't really have an isA relationship. Applying the scalpel like this ought make the tests easier to write and generally bring everything into sharper focus. I'm just guessing though - I haven't seen any actual code.

I'm not up to speed on complex ORM solutions. I've never had to implement one. The area is well-researched though and Patterns of Enterprise Architecture has a lot of material on the subject.

I'm a bit sceptical about the UI classes - again without having seen them. As a rule, I'll keep a clear distinction between the responsibility of defining sets of data for display and applying formatting to the data.

For the former I'd have some kind of "page data" object (one for each view) which is a little bit more than just a container - basically a facade for the domain. It knows which domain objects it should ask for whatever data it needs - or may make calls direct to the data access layer if there's no domain logic to apply. The data-gathering is the same for all output formats, so all that's left is to apply an appropriate template where data is echo'd out in LaTeX markup etc, and any presentational code is called.

Tab-separated rows hardly need templates but since LaTeX and xml will, everything else has to follow suit.
User avatar
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

Yes I know the current design is poor.

I'm hoping to get back on track via some of the TDD work I'm doing (and pestering this board with questions about in the Unit Testing forum).

I'm working on a "simple" list formatter. At this point there are five kinds of lists and four formats for each. The five kinds are "Competitor, by Number", "Competitor, by School, by Number", "Event, by Number", "Event by Name","School List". Well the first two pairs sound similar they pull different data and arrange it differently as well. The four formats are "tabbed seperated", "simple text, preformmated","LaTeX preformmated",and "XML". The TDD work has grown
a PrintedList class could handled all the "kinds" for "tabbed seperated", providing the data is magically added to the class. (I plan to tackle the database pull a little later.)

I've added the test case for "Competitor, by Number" and "simple text, preformatted" and modified to code to pass. I then refactored somewhat to remove duplication. I think the code is still "smelly" and I think its screaming for a pattern. (Code will be shown below). Currently competiting thoughts are
1) Making an inheritence hierarchy, rename the helper_competitorTextFormatter to helper_textFormatter, make it "virtual/abstract" in the base class, each "kind"of list will subclass the abstract base, and provide its own formatter (eventually 3 formatters, text,latex,xml)
2) something more Strategy like
3) something else?

Code: Select all

<?php
class PrintedList {
  var $contents;
  var $title;
  function PrintedList($t="") {
    $this->contents=array();
    $this->title=$t;
  }

  function add($item) {
    $this->contents[]=$item;
  }

  function size() {
    return count($this->contents);
  }

  function toTextString() {
    return $this->helper_toString($this->title,"toTextString",
				  "helper_competitorTextFormatter");
  }

  function toString() {
    return $this->helper_toString(strtoupper($this->title),
				  "toString",
				  "helper_tabbedTextFormatter");
  }


  function helper_toString($title,$objectRule,$arrayRule) {
    $str=$title;
    $lastSeen="";
    foreach($this->contents as $entry) {
      if ($str!="") $str.="\n";
      if (is_object($entry)) {
        $str.=$entry->$objectRule();
        $lastSeen="";
      }
      else if (is_array($entry)) {
        $str.=$this->$arrayRule($entry,$lastSeen);
      }
      else {
        $str.=$entry;
	     $lastSeen=$entry;
      }
    }
    $str.="\n";
    return $str;
  }

  function helper_competitorTextFormatter($entry,&$lastSeen) {
    $prime = "$entry[0]\t$entry[1] $entry[2] ($entry[3]) ";
    $partner = "& $entry[4] $entry[5] ($entry[6])";
    if ($lastSeen==$entry[0]) {
     $full= "\t\t$partner";
    }
    else {
     $full=$prime.$partner;
    }
    $lastSeen=$entry[0];
    return $full;
  }

  function helper_tabbedTextFormatter($entry,&$lastSeen) {
    $line="";
    foreach($entry as $subEntry) {
      if ($line!="") $line.="\t";
      $line.=$subEntry;
    }
    return $line;
  }
}
?>

I think using inheritence is the right option here, but I also know I tend to over use inheritence....

I need to pick one before I weite the next test case (which will be a second conrecte "kind" of printed list). The solution shouldn't be too influenced by the road ahead. however, just what's right for the code as-is....
McGruff
DevNet Master
Posts: 2893
Joined: Thu Jan 30, 2003 8:26 pm
Location: Glasgow, Scotland

Post by McGruff »

The big problem I see with the above is that data can arrive as string, array or object. The class has a lot of work to do in trying to disentangle all that. Different formatting is applied depending on data type - or object types are asked to apply the formatting themselves. I found it quite hard to figure out the logic.

The Strategy pattern could be a good way to deal with different formatting options - with a different base class / derived strategies for each list type rather than than one for all.

I've a strong feeling a RecordSet is going to be useful. Fowler mentions that this can become a form of UnitOfWork (I think you rejected UoW in the page navigation topic but it's the best solution I can think of so far). The big thing is to standardise the data format somehow. A RecordSet could be passed straight to one of the output options, or it could be passed through the editing wizard and commited when the user is finished.
User avatar
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

The code, when I left it last night, was definitely evolvely towards a Strategy pattern when the test cases expanded the needs. It hasn't reached a full Strategy implementation as the Context and Strategy haven't been seperated yet (ie a full-fledged Strategy pattern would push the entire formatting bit to a composition-ly included class for the list.

However, that worries me a little; I've always had some misgivings about Strategy (and RecordSet). Strategy ends up being a hierarchy of pure "operation" classes, no data. RecordSets almost alwasy end up being pure data classes with get/setters. Often they have the DB pull stuff and might get rid of the explicit setters; they might have next/prev row as well. However a RecordSet seems to be just a "struct" (so its dual of a Strategy.

From a UML look, both tend to be well-below average in size and thus would normally be a prime candidate for "absorbtion" by another class as a refactoring.
McGruff
DevNet Master
Posts: 2893
Joined: Thu Jan 30, 2003 8:26 pm
Location: Glasgow, Scotland

Post by McGruff »

A RecordSet may not do much on its own but then it's not supposed to. It's only there to assist in whatever needs doing with tabular data. It would maybe be indicated if you have lots of operations to perform on tabular data. It's a little bit more than just a hash. You might have some methods to manage edits like:

Code: Select all

function deleteRowByName($first_name, $last_name)
{
}
Fowler (from PoEAA):

"Get a RecordSet from the database; pass it to a TableModule to calculate derived information; pass it to a UI for display and editing; pass it back to a TableModule for validation. Then commit the updates to a database."

That sounds very like a starting point for a solution to the "circular wizard" problem - also adding some kind of persistence, edit history, row comparison, and application controller logic into the mix.

Another option is a plain iterator. The two are pretty much the same thing. They both represent tables and hide the data source. The difference is that an iterator for a db result resource is linked to the resource but the RecordSet makes a copy of the table (and might present an iterator interface itself). An iterator would be OK for the straight-to-output scripts but it isn't possible to add/edit/delete rows as you need to do in the circular wizard.

For the output formatting, really my first choice would be no Strategy or in fact any kind of class at all. Just pass a record set - or iterator - to a template where the table is looped through and echo'd out.
Last edited by McGruff on Sat Aug 06, 2005 1:45 am, edited 1 time in total.
User avatar
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

McGruff wrote:Another option is a plain iterator. The two are pretty much the same thing. They both represent tables and hide the data source. The difference is that an iterator for a db result resource is linked to the resource but the RecordSet makes a copy of the table (and might present an iterator interface itself). An iterator would be OK for the straight-to-output scripts but it isn't possible to add/edit/delete rows as you need to do in the circular wizard.

For the output formatting, really my first choice would be no Strategy or in fact any kind of class at all. Just pass a record set - or iterator - to a template where the table is looped through and echo'd out.
The iterator/template approach is sounding good to me. I definitly do not want a RecordSet that allows edit/add. These lists are purely for display at present. The data gets added/editted through very carefully crafted channels at present and I wouldn't want a seperate channel that would avoid the business logic.

I do have a QueryResult class that is nearly an RecordSet/Iterator hybrid -- it slurps up all the data from a query into internal storage and exposes a getRowAt method. It would be good to add an iterator interface to it.

I'm not sure exactly how the template idea fits though. I'll need to capture the output to a string before echo-ing as it might be mailed intead/in addition to echoing. Perhaps packaging up the various template methods into a Factory?
McGruff
DevNet Master
Posts: 2893
Joined: Thu Jan 30, 2003 8:26 pm
Location: Glasgow, Scotland

Post by McGruff »

I tend to think in terms of "Cartesian boundaries". There's an internal "application mind" which doesn't know or care about the outside world - ie what kind of presentational layer has been slapped on top. Boundaries are places where inputs from external systems are translated into forms the application can understand (such as a Request object encapsulating input) or where data created by the app is formatted for some kind of output.

For output, the key, I think, is to have some kind of well-defined jumping-off point where all the dynamic data has been gathered together. This step is common to all. From here it should be easy to send the data off to email, browser, CLI, write to file or whatever.

Browser output can be as simple as including a template in the same scope as the data. Templates would have some simple presentational code for loops or conditionals. The same, with buffering, could capture the output to a string in order to write to file (I imagine that's what you'll be doing with LaTeX). I'd probably be using an email class to send email; method calls could just be passed the raw data. Templates & buffering / heredoc might or might not be useful to create email parts.
User avatar
nielsene
DevNet Resident
Posts: 1834
Joined: Fri Aug 16, 2002 8:57 am
Location: Watertown, MA

Post by nielsene »

I'm still not seeing it...

My TDD led design would be described as:

Code: Select all

PrintedList
-----------
load (abstract)
add
toTextString (abstract)
toString
toLaTeXString (abstract)
toXMLString (abstract)

CompetitorList extends Printed List
----------------------------------
implements the abstract methods

repeat for the four other list types.
Obviously this is bad... its what grew out of the successive tests, so either I was picking the wrong things to test or....

I would expect a better design (but I shouldn't be thinking design) would be something like:

Code: Select all

ListPhraseBook
-------------
packages up the static queries that generate the data for the lists

ListFormatterInterface
---------------------
formatAsXML
formatAsLaTeX
formatAsText
formatAsTabbed

subclass with the needed formatting logic
Whereas the earlier one would have been used as

Code: Select all

$alist= new CompetitorList();
$alist->load($db);
$output=$alist->toTextString();
the "better" version would be

Code: Select all

$formatter= new CompetitorListFormatter();
$output = $formatter->formatAsText($db->query($pb->getQuery("Competitor List")));
db->query returns a QueryResult (basically a RecordSet).
Post Reply