
I need to refactor my unit tests...

Posted: Sun May 20, 2007 8:38 pm
by Ambush Commander
Quite honestly, I think they've developed a life of their own. Tests are splattered everywhere with some semblance of organization, but not really. Errors in one component result in cascading error rushes in others (tightly coupled components coming back to bite me in the can). Single test cases can run up to 100 lines long. Running the test suite takes the better part of twenty seconds. Maintaining the tests takes longer than writing actual code.

This will not do. I must refactor my unit tests.

Unfortunately, I have no clue where to start. >.> Any help?

Posted: Mon May 21, 2007 1:02 am
by Chris Corbyn
I'm watching this thread. My tests have also gradually become unmanageable. I keep working on them in bits and pieces, creating abstract classes to provide common functionality between multiple tests, but I'm still feeling a bit lost. At least for me (and I suspect the same for you) the sheer amount of regexps which need to be reworked is becoming tedious. Also, expectations I've made in Mock objects are changing as features are added, which is causing cascades of errors when I deliberately change something to satisfy a new requirement.

I know this probably reads as if I've tested implementation details rather than interface but there's not really a better way to test interactions with external sources without explicitly depending upon that source -- which of course may not be working correctly itself and thus give unpredictable test results.

It's actually becoming really frustrating, to the point that unit testing feels like it's more hassle than it's worth. I'm sure as the code base matures the tests will sit there happily unchanged, but right now it feels as though I've written my tests very badly.

Posted: Mon May 21, 2007 3:44 am
by Maugrim_The_Reaper
It sounds like the tight coupling is just complicating the test cases. The way I see it there are a few possibilities (you can guess they're vague in the absence of code to review ;)):

1. Your test cases spend a lot of time compensating for classes not being test-friendly
2. Your test cases are poorly designed and are actually testing >1 class (almost the opposite of 1)
3. You have tons of test data being created in test cases

I suppose the first question is identifying the problem. Are the tests screwed up, or do they just look screwed up? Sometimes tests are simply compensating for badly designed classes by doing lots of extra work. If you can narrow down which pieces of the code are essential (the assertions!) and which are dead-weight (compensations) you can at least separate them - a bit. I'd just stick comment blocks before and after each section of code we *shouldn't* need.

But the real problem may be the class being tested. The tests are just a decoy - to fix the tests, you need to fix the classes. If the tests are obviously broken (not testing behaviour of ONE class, or would require too much editing) then you might as well consider them all suspect and write clean tests from scratch. TDD your way out of the box using the older tests to find the exact behaviour a properly decoupled and refactored class should have. Of course if this is scarce comfort you might want to pull back and assess the entire class system first - i.e. cover the coupled system with an Integration Test. Now as you TDD the refactoring, you can measure success against the Integration Tests (which don't give two squats about coupling since they exercise a whole set of related classes regardless of their coupling woes) and make sure everything still works as it should from a high level.

For pt. 3? Rarely applies - if it does then automate. Figure out a standard set of tasks for adding test data and fall back on using configuration files to drive that data, and automation to generate any other additional material needed (e.g. in testing a webservice input I would have the data in plain text, with a Factory class generating valid XML, JSON, serialised PHP). This reduces a whole ton of cut'n'paste duplication by centralising test data. Slows the tests a little, but it saves a lot of annoying editing across a dozen files...
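That Factory idea might look something like this minimal sketch: one canonical set of plain-PHP test data, with the factory rendering it into whichever format a given test needs. All class and method names here are invented for illustration.

```php
<?php
// Hypothetical factory: a single source of plain test data, emitted on
// demand as JSON, XML, or serialised PHP. Names invented for illustration.
class TestDataFactory
{
    private $cases;

    public function __construct(array $cases)
    {
        // e.g. array('simple' => array('id' => 1, 'name' => 'foo'))
        $this->cases = $cases;
    }

    public function asJson($key)
    {
        return json_encode($this->cases[$key]);
    }

    public function asSerializedPhp($key)
    {
        return serialize($this->cases[$key]);
    }

    public function asXml($key)
    {
        $xml = "<case>";
        foreach ($this->cases[$key] as $field => $value) {
            $xml .= "<$field>" . htmlspecialchars((string) $value) . "</$field>";
        }
        return $xml . "</case>";
    }
}
```

The point is that editing one data array updates every representation the tests consume, instead of hand-editing near-identical fixtures across a dozen files.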

Posted: Mon May 21, 2007 6:36 pm
by Ambush Commander
Trust me, you do not want to look at the code. :-D (BTW, I'm talking about HTML Purifier, and d11wtq's probably talking about Swift Mailer).

d11wtq, I feel your pain on the mock objects that are quite tedious to setup/reconfigure and the piece-meal growth of the unit tests, although I don't use regexps in my tests so that's not really a problem for me.
Your test cases spend a lot of time compensating for classes not being test-friendly
I get this feeling quite a bit. I often end up writing test harnesses and custom assertions to simplify the test cases themselves, and the development of them is quite ad hoc. The difficulty of testing the classes, however, stems from two unusual conditions:

1. In the spirit of loosely coupled classes, most objects require a Configuration object and a Context object to be passed to them via parameters. It is a pain to instantiate all of these manually, so tests help me out. Which means lots of abstract test code.

2. Writing the input and output arrays of objects takes a dastardly long amount of time. The core data type the library operates on is not strings but arrays of "Token" objects. It is extremely verbose to instantiate each one of these one by one, so I end up using a more compact syntax (i.e. HTML) which needs to be parsed into that form. That parsing = test code!
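That compact-syntax parsing might be sketched as follows; the Token class and tokenize() helper here are invented stand-ins for illustration, not HTML Purifier's actual API.

```php
<?php
// Invented stand-ins for illustration -- not HTML Purifier's real classes.
class Token
{
    public $type; // 'start', 'end', or 'text'
    public $data;

    public function __construct($type, $data)
    {
        $this->type = $type;
        $this->data = $data;
    }
}

// Turns a compact HTML-ish string into the verbose Token array the
// library actually operates on, so tests can write '<b>hi</b>' instead
// of hand-instantiating three objects.
function tokenize($html)
{
    $tokens = array();
    preg_match_all('/<\/?([a-z]+)>|([^<]+)/', $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        if ($m[1] !== '') {
            $type = ($m[0][1] === '/') ? 'end' : 'start';
            $tokens[] = new Token($type, $m[1]);
        } else {
            $tokens[] = new Token('text', $m[2]);
        }
    }
    return $tokens;
}
```

The parser itself then becomes test infrastructure that has to be trusted (or tested) in its own right, which is exactly the trade-off being described.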
Your test cases are poorly designed and are actually testing >1 class
Integration tests, you mean? :-) I think part of my trouble is that I've always treated integration tests as regular unit tests (there is no way of distinguishing them in my code: it's all organized by what interface is being immediately tested). Integration tests are useful, so I don't want to ax them, but they probably should be reorganized and cordoned off into their own area. Or maybe not... thoughts on this?
If you can narrow down which pieces of the code are essential (the assertions!) and which are dead-weight (compensations) you can at least separate them - a bit. I'd just stick comment blocks before and after each section of code we *shouldn't* need.
Sounds like a good idea. Does the setup of mock objects count as dead-weight? (I really don't think so, but they do take up a lot of space).

Finally, on decoupling: I am increasingly tempted to tighten, not loosen, coupling, for performance reasons. For example, I have four separate classes that do separate jobs but could be parallelized, at the cost of readability, for a not insignificant performance gain. It's tough...

Posted: Tue May 22, 2007 3:27 am
by Maugrim_The_Reaper
Integration tests, you mean? :-) I think part of my trouble is that I've always treated integration tests as regular unit tests (there is no way of distinguishing them in my code: it's all organized by what interface is being immediately tested). Integration tests are useful, so I don't want to ax them, but they probably should be reorganized and cordoned off into their own area. Or maybe not... thoughts on this?
Not necessarily, I've seen lots of tests which end up testing >1 class but which aren't really meant as dedicated Acceptance Tests. Usually it's from a lack of Mock objects, so I'm referring strictly to "Unit Testing". Integration Testing and Unit Testing are related, but not the same thing. If someone has a Unit Test testing >1 class, then it's actually an Integration Test - which means they have no valid Unit Test for that class. Which by itself should be worrying.

As for distinguishing the two? I try to. I like to keep Unit, Integration and Acceptance Tests distinct from each other so I can run each separately. In all fairness though, so long as an Integration Test is clearly named as such (some naming convention) there's little reason to split them out separately from Unit Tests. Running both at the same time is likely more convenient anyway.
Sounds like a good idea. Does the setup of mock objects count as dead-weight? (I really don't think so, but they do take up a lot of space).
Depends. If it's repetitive, duplicative, etc. then it's a candidate for refactoring. Since tests aren't hierarchical I might sometimes use an external (and yes, definitely tested or it'll screw up a lot of stuff!) class for creating Mocks with a simplified interface. Maybe a Factory or just something with a lighter API to cut down the amount of typing. Doesn't always work out as worthwhile if there aren't lots of near identical Mocks floating around, so sometimes it's just easier to soldier on.
Finally, on decoupling: I am increasingly tempted to tighten, not loosen, coupling, for performance reasons. For example, I have four separate classes that do separate jobs but could be parallelized at the cost of readability but for a not insignificant performance gain. It's tough...
I'd avoid tightening coupling until you know exactly which methods, on which objects, are the problem. Then extract a slice of that into a new class (I assume an all-in-one class is your parallel?) if that helps. Tighter coupling isn't ideal, but if it's definitely the only way towards a marked performance increase then it may well be worth it - just be sure you have that functional unit under a set of Integration Tests in case the Unit Tests end up being less than ideal.

Posted: Tue May 22, 2007 4:04 am
by Christopher
Ambush Commander wrote: 1. In the spirit of loosely coupled classes, most objects require a Configuration object and a Context object to be passed to them via parameters. It is a pain to instantiate all of these manually, so tests help me out. Which means lots of abstract test code.

2. Writing the input and output arrays of objects takes a dastardly long amount of time. The core data type the library operates on is not strings but arrays of "Token" objects. It is extremely verbose to instantiate each one of these one by one, so I end up using a more compact syntax (i.e. HTML) which needs to be parsed into that form. That parsing = test code!
Just glanced at this thread for the first time and for some reason these two stood out to me. Please ignore me if you are actually moving toward a solution. I think my initial thought was that we either need to simplify our tests or automate somehow.

The thought that occurred to me when reading the above two points and from my own experience is how damn identical and boring it is stuffing parameters into methods and then checking if the canary died. It's late here so I may be rambling, but it seems like a better way would be to use text/XML files to contain all this information and run it through a testing framework that reads the data and runs the tests.

OR (crazy idea alert) better still, how about a declarative language to run the tests. I am thinking about SQL here:

Code: Select all

TEST OBJECT MyClass(1, 2, 3) WHERE find(42) EXPECT return>0 AND return<100;
Or how about dreaming a little:

Code: Select all

CREATE MOCK OBJECT Db (
     connect EXPECT (array) RETURN TRUE,
     query EXPECT (string) RETURN DbResult,
);
Either way you would need a testing framework that sat on top of SimpleTest (or PHPUnit ... or both!) that chewed whatever input into some internal representation and then ran the tests. And the framework would hopefully infer setup and teardown from the data and deal with those too...
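Stripped of the SQL-ish syntax, the data-driven core of that idea is small. Here is a rough sketch with the case table inline rather than in an external file; the runner, the Calculator class, and all names are invented for illustration.

```php
<?php
// Hypothetical data-driven runner: each case names a method, its
// arguments, and the expected result; the runner loops over the table
// and reports failures per case. Names invented for illustration.
function runCases($object, array $cases)
{
    $failures = array();
    foreach ($cases as $label => $case) {
        $actual = call_user_func_array(array($object, $case['method']), $case['args']);
        if ($actual !== $case['expect']) {
            $failures[] = "$label: expected " . var_export($case['expect'], true)
                        . ", got " . var_export($actual, true);
        }
    }
    return $failures;
}

// A trivial class under test, for demonstration only.
class Calculator
{
    public function add($a, $b) { return $a + $b; }
}

$cases = array(
    'adds positives' => array('method' => 'add', 'args' => array(2, 3),  'expect' => 5),
    'adds negatives' => array('method' => 'add', 'args' => array(-2, -3), 'expect' => -5),
);
```

Reading $cases from an XML or text file instead of an inline array is then a parsing problem, not a testing problem.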

Back to you guys.

Posted: Tue May 22, 2007 7:35 am
by Maugrim_The_Reaper
You can automate Mock generation easily enough - the API is stable and predictable in SimpleTest, there are few variations in method usage to worry about, and if a lot of extraneous code in tests is all about generating mocks then I'd say give it a shot. The only thing about automation is where to put all the input for creating Mocks. Preferably alongside the test tree where it's within easy reach (esp. since unit tests are documentation :)) for reading.

If the Mocks are really duplicative (e.g. using Mocks to test a class's boundary conditions or edge cases by using various reams of data) then automation is likely worth a trial.

Code: Select all

TEST OBJECT MyClass(1, 2, 3) WHERE find(42) EXPECT return>0 AND return<100;
To be honest I'd still prefer XML to configure a Mock. ;) Now if you come back with a parser for your declarative language (STL: Simple Test Language?) I might reconsider.
The thought that occurred to me when reading the above two points and from my own experience is how damn identical and boring it is stuffing parameters into methods and then checking if the canary died. It's late here so I may be rambling, but it seems like a better way would be to use text/XML files to contain all this information and run it through a testing framework that reads the data and runs the tests.
I wouldn't automate the entire test run (outside the usual SimpleTest/PHPUnit level). I'd still want hard-coded tests with method names for each boundary case and such - still need something to read. Maybe the argument could be made that the config files are themselves documenting the boundaries - maybe.

I'd just have any continually duplicated Mocks auto-generated from config files (and even that Factory call would be inside the test case) to cut down on the typing. In my view the more typing a test case requires the more work it takes for the curious reader to decipher what's going on. In a sense a long drawn out test case is as bad as long drawn out documentation - it gets boring too quickly :).
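As a sketch of that config-driven idea: the factory below builds plain canned-response stubs rather than real SimpleTest mocks (which record and verify expectations), and every name in it is invented for illustration.

```php
<?php
// Hypothetical stub factory: a config array maps method names to canned
// return values, and __call serves them up. This is a sketch of the
// idea only, not SimpleTest's actual Mock API.
class StubFactory
{
    public static function create(array $config)
    {
        return new CannedStub($config);
    }
}

class CannedStub
{
    private $returns;

    public function __construct(array $returns)
    {
        $this->returns = $returns;
    }

    public function __call($method, $args)
    {
        if (!array_key_exists($method, $this->returns)) {
            throw new BadMethodCallException("No canned return for $method");
        }
        return $this->returns[$method];
    }
}
```

A test case then needs one factory call instead of a screenful of mock setup, and the config array can live alongside the test tree where it still reads as documentation.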

Posted: Tue May 22, 2007 9:47 am
by Jenk
Sounds like you guys would like FIT (Framework for Integrated Tests) which can be found here: http://fit.c2.com/ (php port here: http://developer.berlios.de/projects/phpfit/)

Still alpha though.

As a general rule of thumb, if your tests are complicating things, you're doing something wrong with your objects :)

Are you using TDD or retrospective Unit Testing? (for both new classes/objects and refactoring?) If you are wishing to employ true TDD, your first port of call when changing is *always* the test case: you change this to meet the required (expected at this point) behaviour, then 'fix' your objects to pass it. You also do not need to run a test suite every time you wish to test. Make sure your current object works by testing that, and that alone; then once you get the green light, run the whole suite, then run an integration test (the latter necessary to of course ensure your object interfaces coincide.. having a load of single units working is fine, but if they don't understand each other yada yada.)

Posted: Tue May 22, 2007 10:15 am
by Maugrim_The_Reaper
FIT sounds a lot like Acceptance Testing...;).

It's not quite the same thing though. I can see where it's an integrated approach but the customer input is more similar to how acceptance tests are defined. In our case it's more a Unit Test problem - assuming even a high rate of coupling any Integrated Tests are less likely to need change since they're at a much higher level where coupling has a minimal impact (i.e. who cares which way the pieces are bolted together so long as the whole works).

I would still say Mocks are the biggest trouble. If generating them is automated as much as possible it can help alleviate all sorts of problems. Going back to d11wtq's woes - not sure if much can help there. Once you add in an external dependency it has a similar effect to tight coupling - complex tests - because of that reliance.

Would centralising regex strings help? If they're all very specific and unique I doubt there's a lot you can do, except perhaps push them into a single definition file so they're more easily found and edited (or reused if that's a problem).
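That "single definition file" could be as simple as a class of constants; the class and pattern names below are invented examples, not Swift Mailer's actual regexes.

```php
<?php
// Sketch: centralise the regexes the tests share so an edit happens in
// one place instead of across many test files. Names invented.
class Patterns
{
    const EMAIL_LOCAL = '/^[A-Za-z0-9._%+-]+$/';
    const HEADER_LINE = '/^[A-Za-z-]+: .+$/';
}
```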

Posted: Tue May 22, 2007 10:17 am
by Jenk
It's integration testing, as it clearly states in the name...

Posted: Tue May 22, 2007 11:36 am
by Christopher
Maugrim_The_Reaper wrote:Now if you come back with a parser for your declarative language (STL: Simple Test Language?) I might reconsider.
Anyone know of a SQL parser done in PHP? I did a quick Google and didn't find anything useful.

Posted: Tue May 22, 2007 11:42 am
by Maugrim_The_Reaper
Perl's SQL parser is likely as close as you'll get. Shouldn't be too difficult to port it to PHP5 if pushed.

Posted: Tue May 22, 2007 12:32 pm
by Christopher
Maugrim_The_Reaper wrote:Shouldn't be too difficult to port it to PHP5 if pushed.
I was leaning more towards hoping I could just stumble over one. ;)

Posted: Tue May 22, 2007 5:41 pm
by Ambush Commander
Chewed on your comments for a little bit. Here we go...
dedicated Acceptance Tests
Alright... what are acceptance tests? (fails to see the distinction between them and integration tests).
no valid Unit Test for that class. Which by itself should be worrying
I think this drives the point home.
just be sure you have that functional unit under a set of Integration Tests in case the Unit Tests end up being less than ideal.
I guess I should take this to mean at a certain point, one should scrap their unit tests if performance reasons don't allow for the efficient testing of the unit?
it seems like a better way would be to use text/XML files to contain all this information and run it through a testing framework that reads the data and runs the tests.
I sometimes get this feeling, although I find it difficult to imagine a format flexible enough to cover all of our testing needs (or even most of them). Not so savvy about SQL: most of the time, we try to abstract it away so we end up not writing it (heh heh).
In a sense a long drawn out test case is as bad as long drawn out documentation - it gets boring too quickly
I try to make my unit tests interesting. Case in point:

Code: Select all

function testRedefinition_define() {
    CS::defineNamespace('Cat', 'Belongs to Schrodinger.');

    CS::define('Cat', 'Dead', false, 'bool', $d1 = 'Well, is it?'); $l1 = __LINE__;
    CS::define('Cat', 'Dead', false, 'bool', $d2 = 'It is difficult to say.'); $l2 = __LINE__;

    $this->assertIdentical($this->our_copy->defaults['Cat']['Dead'], false);
    $this->assertIdentical($this->our_copy->info['Cat']['Dead'],
        new HTMLPurifier_ConfigDef_Directive('bool',
            array($this->file => array($l1 => $d1, $l2 => $d2))
        )
    );

    $this->expectError('Inconsistent default or type, cannot redefine');
    CS::define('Cat', 'Dead', true, 'bool', 'Quantum mechanics does not know.');

    $this->expectError('Inconsistent default or type, cannot redefine');
    CS::define('Cat', 'Dead', 'maybe', 'string', 'Perhaps if we look we will know.');
}
Are you using TDD or retrospective Unit Testing?
It's interesting that you should mention that. Most of the time I do TDD. Sometimes, however, I get a really good idea and jump straight to implementing it, because making a unit test for the idea would be too difficult to do.

In fact, I can see that the classes I did not TDD are the ones having the most problems right now! :-)
In our case it's more a Unit Test problem - assuming even a high rate of coupling any Integrated Tests are less likely to need change since they're at a much higher level where coupling has a minimal impact (i.e. who cares which way the pieces are bolted together so long as the whole works).
But if the behavior of the whole changes, you'll end up changing integration tests and unit tests, quite possibly redundantly. Is redundancy bad in unit testing? On one hand, you don't want cascading errors. On the other hand, you shouldn't remove a test case without good reason.

Moving along...

I think we've established some important symptoms of naughty unit tests and naughty code. In a perfect world, code would be simple and tests would be simple, but this is not the case. Pushing redesigning the API of your actual code out of the picture for the moment, I would like to pose the question of how to refactor duplicated/redundant code in unit tests.

Case-study: my original method of dealing with duplicated code was to... put it in a loop. I would create arrays of "input" and "output", and then a special loop that would do all the initialization necessary and perform the test. The first practical problem: all the failed assertions would come out to the same line, so there was no way of finding out which test failed! I "fixed this" by sending fail messages to the assertions that included the necessary machinery.
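That loop-and-label approach might have looked something like this sketch, with the per-case message carrying the identifying "machinery"; the function and names are invented for illustration, not the actual HTML Purifier test code.

```php
<?php
// Sketch of labelling data-driven assertions so every failure names its
// case, instead of all failures pointing at one shared assertion line.
// Names invented for illustration.
function checkAll(array $inputs, array $outputs, $fn)
{
    $messages = array();
    foreach ($inputs as $i => $input) {
        $actual = $fn($input);
        if ($actual !== $outputs[$i]) {
            // The message is the "machinery": which case, which data.
            $messages[] = sprintf(
                'case %d failed: input %s gave %s, expected %s',
                $i,
                var_export($input, true),
                var_export($actual, true),
                var_export($outputs[$i], true)
            );
        }
    }
    return $messages;
}
```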

Some of this code is still hanging around, but in retrospect, it seems like it was a bad idea. I've since migrated to a new way of handling duplicated initialization: custom assertions. You'll find my unit tests littered with things like:

Code: Select all

function test() {
    // valid ID names
    $this->assertDef('alpha');
    $this->assertDef('al_ha');
    $this->assertDef('a0-:.');
    $this->assertDef('a');

    // invalid ID names
    $this->assertDef('<asa', false);
    $this->assertDef('0123', false);
    $this->assertDef('.asa', false);

    // test duplicate detection
    $this->assertDef('once');
    $this->assertDef('once', false);

    // valid once whitespace stripped, but needs to be amended
    $this->assertDef(' whee ', 'whee');
}
Totally unhelpful to someone who wants to figure out how to invoke the object being tested, but quite useful in determining the behavior of the class. (Tidbit: the assert function's implementation is *10* lines long. Only 1 function call is being tested. The rest is assertions or setup. I have another harness that's 60 lines long, once again, only one function call is being tested.)
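For readers following along, the custom-assertion pattern above might be sketched like this; the IdValidator class and the standalone assertDef() helper are invented stand-ins for illustration, not HTML Purifier's actual AttrDef machinery or SimpleTest's assertion API.

```php
<?php
// Invented stand-in for the class under test: a validator that returns
// the cleaned value, or false on rejection.
class IdValidator
{
    public function validate($id)
    {
        $id = trim($id);
        return preg_match('/^[A-Za-z][A-Za-z0-9_:.-]*$/', $id) ? $id : false;
    }
}

// The custom assertion: one call hides setup, invocation, and comparison,
// so each test line reads as pure behaviour. The second argument is the
// expected result: true means "passes unchanged (modulo whitespace)",
// false means "rejected", a string means "amended to this".
function assertDef(IdValidator $def, $input, $expect = true)
{
    $result = $def->validate($input);
    if ($expect === true) {
        $expect = trim($input);
    }
    return $result === $expect;
}
```

The trade-off is exactly as described: the assertion is dense and behaviour-focused, but tells a newcomer nothing about how to actually invoke the object under test.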

Any wisdom in this area?

Posted: Tue May 22, 2007 6:59 pm
by Ollie Saunders
It's interesting that you should mention that. Most of the time I do TDD. Sometimes, however, I get a really good idea and jump straight to implementing it, because making a unit test for the idea would be too difficult to do.

In fact, I can see that the classes I did not TDD are the ones having the most problems right now!
I'm the same. Exactly the same actually. I think I agree with you on which classes are more problematic. I'm going to try and change my behaviour.
But if the behavior of the whole changes, you'll end up changing integration tests and unit tests, quite possibly redundantly. Is redundancy bad in unit testing? On one hand, you don't want cascading errors. On the other hand, you shouldn't remove a test case without good reason.
I've never really needed integration tests and where I felt I did I was able to change the design so that unit tests were sufficient. Redundancy in test cases is a bad thing. I learnt a while back to treat test cases with almost the same care and attention as the code they test.
Any wisdom in this area?
Should you not be writing your own expectations?