I need to refactor my unit tests...

Discussion of testing theory and practice, including methodologies (such as TDD, BDD, DDD, Agile, XP) and software - anything to do with testing goes here. (Formerly "The Testing Side of Development")

Moderator: General Moderators

User avatar
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Post by Jenk »

Your test cases can often require more refactoring than your model objects do. This includes removal all together for a design aspect that you come to realise is redundant.

The alleged beauty of that (and I concur with) is that refactoring your tests is less work than refactoring your object. Once you have a test that meets your requirements, you have a set goal for your objects. From this point on, the only goal you have is to make Object A pass TestCase A. No worrying about what would be more beneficial to other objects, and the relationships, no pondering over "Should I couple these directly" as that will prevent you achieving your goal.

The Factory pattern is a TDD developers best friend. Easy to mock :P
User avatar
Maugrim_The_Reaper
DevNet Master
Posts: 2704
Joined: Tue Nov 02, 2004 5:43 am
Location: Ireland

Post by Maugrim_The_Reaper »

I think I used the term badly at the time. Basically there are three broad levels of testing (depending on your development approach). Unit Testing tests a single class, Integration Testing tests a collection of related classes for a process (i.e. ensures all those classes play nice together). Acceptance Testing tests a release/iteration to ensure it meets customer expectations/requirements (i.e. usually web testing like SimpleTest's web tests or it's new (check CVS) Selenium Test Case type which proxies to the Selenium RC to operate a full browser rather than a HTTP Client).

If you follow Extreme Programming, the development process is largely driven by gathering requirements in the form of User Stories. A single story is a customer describing a process (e.g. When I input the wrong password or an invalid username into the login form I want to see an error message). An acceptance test is the final judge on whether that story has been implemented (e.g. use a Selenium test case to submit a bad password/username to the login form and verify an expected error message exists in the response). User stories/Acceptance Tests don't give a crap about the underlying programming language or it's design - just the end result of a functional application, the HTML rendered by a browser.

Both SimpleTest and PHPUnit has Selenium integration at this stage. SimpleTest's is still in CVS until the next major release.

There's a recent write up of all this over on the DevNetStore forums where we're kickstarting an XP approach. Check the Misc. forum.
It's interesting that you should mention that. Most of the time I do TDD. Sometimes, however, I get a really good idea and jump straight to implementing it, because making a unit test for the idea would be too difficult to do.
The trick is knowing when your inspired awesome code is purely temporary. If it's not testable then it's going to be hard to refactor and tack on regression tests for any bugs you encounter. I'm not saying the original code is useless - but it should be regarded as throw away code. Then run off an TDD a final testable version using the original as a reference.
But if the behavior of the whole changes, you'll end up changing integration tests and unit tests, quite possibly redundantly.
Such is the nature of the beast :). They shouldn't be redundant though. If tests have redundancy then your tests are defying DRY which is an evil practice they burn people at the stake for.
ole wrote:I've never really needed integration tests and where I felt I did I was able to change the design so that unit tests were sufficient. Redundancy in test cases is a bad thing. I learnt a while back to treat test cases with almost the same care and attention as the code they test.
The problem is that Unit Tests focus on one class in isolation using Mocks to replace other dependent classes. Sometimes a lot of nefarious bugs only pop up when you string together 15 classes to perform a function. Maybe Class A is passing an Integer to Class B, where Class B has assumed it will receive a String. In most cases this is perfectly valid. Unfortunately there are functions which really dislike Integers, e.g. ctype_digits() which will kick up a fuss. That kind of interface issue only gets caught using an Integration Tests - the individual Unit Tests are less likely to catch it.
User avatar
Jenk
DevNet Master
Posts: 3587
Joined: Mon Sep 19, 2005 6:24 am
Location: London

Post by Jenk »

Don't forget exploratory testing, but that can't be feasibly completed by a machine :P
lastcraft
Forum Commoner
Posts: 80
Joined: Sat Jul 12, 2003 10:31 pm
Location: London

Re: I need to refactor my unit tests...

Post by lastcraft »

Hi...
Ambush Commander wrote:Errors in one component result in cascading error rushes in others (tightly coupled components coming back to bite me in the can).
This is probably a code problem. What is the language of your domain?
Ambush Commander wrote: Single test cases can run up to a 100 lines long.
Do you have an example? I get worried at more than ten lines.
Ambush Commander wrote: Running the test suite takes the better part of twenty seconds.
Tests for a single component in our application often take several minutes. A full test run of the entire app. takes half an hour. This is all the tests though. The unit tests take just under a minute.
Ambush Commander wrote: Maintaining the tests takes longer than writing actual code.
Usually about the same for me. Possibly more in special cases, such as in rules for sequences of actions where you must test several paths. Accountancy systems generate a lot of tests for this reason.
Ambush Commander wrote: This will not do. I must refactor my unit tests.
Do you have an example? A place where we could start?

It's also possible that there is too much mocking happening. They are very powerful, which leads down the tempting path of highly specified tests. "Mock-tastic" as it's sometimes known. Overspecified tests are not only more work to write, but they tend to be fragile.

Another possibility is that parts of your problem are rightly solved with tightly coupled classes in small clusters. Thus your "unit" should be the cluster, not the individual classes. A parser and it's lexer are a good example. They work together to solve a problem and the information flow between them is dense and technical. They have a fat interface between them in effect. Testing works best with small interfaces, so it's probably best to test lexers and parsers as pairs.

yours, Marcus
User avatar
Ambush Commander
DevNet Master
Posts: 3698
Joined: Mon Oct 25, 2004 9:29 pm
Location: New Jersey, US

Post by Ambush Commander »

Hi Marcus, delighted to see you drop by!
This is probably a code problem. What is the language of your domain?
HTML. The specific application is HTML Purifier.
Do you have an example? I get worried at more than ten lines.
They come in several different types.

- Long test-case because creating individual functions would be extremely verbose
- Long test-case due to initialization code
- Long test-case due to acceptance-style testing
Tests for a single component in our application often take several minutes. A full test run of the entire app. takes half an hour. This is all the tests though. The unit tests take just under a minute.
Well, I guess I can't complain then. I like immediate feedback, so I'm literally running test-cases after a one or two line change.
Do you have an example? A place where we could start?
Out of all of them, this one is probably the most flagrant: URITest.php. I'm not sure I completely understand what's going on there myself, and I've often had to relax the test cases due to PHP4 quirkiness. The entire setup is convoluted because I am attempting to mock the internal subsystems and localize the tests, which is supposed to be a good thing.
It's also possible that there is too much mocking happening. They are very powerful, which leads down the tempting path of highly specified tests. "Mock-tastic" as it's sometimes known. Overspecified tests are not only more work to write, but they tend to be fragile.
Possibly. I've noticed when I use mocks things tend to get fragile and break a lot, but they also keep me honest, because when they break, generally something else needs to be changed too.

How does one know when a test is overspecified? On one hand, you want to make sure all the functionality is tested, on the other hand, you don't want the test to be fragile.
Another possibility is that parts of your problem are rightly solved with tightly coupled classes in small clusters. Thus your "unit" should be the cluster, not the individual classes. A parser and it's lexer are a good example. They work together to solve a problem and the information flow between them is dense and technical. They have a fat interface between them in effect. Testing works best with small interfaces, so it's probably best to test lexers and parsers as pairs.
In effect, stop worrying about the cascading error rushes! The Strategy unit communicates with the HTMLDefinition unit in order to receive information on what it needs to do. Generally, the interchange is extremely complicated, so I have been treating the pair as a single unit and shyed away from mocking. This means that if there is a bug, tests fail all over the place, but at least I know that there is something wrong.
lastcraft
Forum Commoner
Posts: 80
Joined: Sat Jul 12, 2003 10:31 pm
Location: London

Post by lastcraft »

Hi...
Ambush Commander wrote:Hi Marcus, delighted to see you drop by!
No problem!
Except that you have a load of comments within the test case. If the comments became the test method names, then it would remain about the same size wouldn't it? I like the refactoring of assertDef(), but what does "Def" mean? Otherwise it all looks cool to me.
This is almost stylistic preference, but I'd write it like this...

Code: Select all

require_once 'HTMLPurifier/DefinitionCache.php';
require_once 'HTMLPurifier/ShortNames.php';   // 

class DefinitionCacheTest extends UnitTestCase {
    private $old_schema;

    function setUp() {
        $this->old_schema = ConfigSchema::instance();
        ConfigSchema::instance(new ConfigSchema());
        ConfigSchema::defineNamespace('Test', 'Test namespace');
        ConfigSchema::define('Test', 'DefinitionRev', 1, 'int', 'Definition revision.');
    }

    function tearDown() {
        ConfigSchema::instance($this->old_schema);
    }

    function config() {
        generate_mock_once('Config');
        $config = new ConfigMock();
        $config->version = '1.0.0';                        // hopefully no conflicts
        $config->setReturnValue('get', 10, array('Test', 'DefinitionRev'));
        $config->setReturnValue('getBatchSerial', 'hash', array('Test'));
        return $config
    }

    function cache() {
        return new HTMLPurifier_DefinitionCache('Test');
    }

    function test???() {
        $this->assertFalse($this->cache()->isOld('1.0.0-hash-10', $this->config()));
        $this->assertTrue($this->cache()->isOld('1.5.0-hash-1', $this->config()));
    }

    function test???() {        
        $this->assertTrue($this->cache()->isOld('0.9.0-hash-1', $this->config()));
    }

    function test???() {        
        $this->assertTrue($this->cache()->isOld('1.0.0-hash-1', $this->config()));
    }

    function test???() {        
        $this->assertTrue($this->cache()->isOld('1.0.0beta-hash-11', $this->config()));
    }

    function test???() {        
        $this->assertTrue($this->cache()->isOld('0.9.0-hash2-1', $this->config()));
        $this->assertFalse($this->cache()->isOld('1.0.0-hash2-1', $this->config()));
        $this->assertTrue($this->cache()->isOld('1.0.0beta-hash2-11', $this->config()));
    }    
}
I would have pulled the assertTrue() and assertFalse() lines into a more meaningful assertion name, but I don't know what they are doing. How would you describe them? What would you call the "???" here? I feel I am missing something vital. That said, i don't know a lot about HTML purification :).
Perhaps pull out chunks of set up into methods. Also there are often two or three tests rolled into one.

It's only fair that I put one of mine in the firing line...

Code: Select all

require_once(dirname(__FILE__) . '/accounts_test_case.php');

class TestManualAccountSetUp extends AccountsTestCase {
    ...
    function testCanListPaymentsAgainstAnAccount() {
        $this->nowJuly(1, 2007);
        $account = $this->createFred();
        $this->nowJuly(2, 2007);
        $purchase1 = $this->accounts()->purchaseManually(
                $account, 'One year subscription', 200, 'GBP');
        $this->accounts()->payByCheque($purchase1, 200, 'GBP');
        $this->nowJuly(3, 2007);
        $purchase2 = $this->accounts()->purchaseManually(
                $account, 'Top 10,000 keywords report', 50, 'USD');
        $this->accounts()->payByCheque($purchase2, 50, 'USD');
        $payments = $this->accounts()->findPaymentsAgainstAccount($account);
        $this->assertEqual(
                $this->justFields($payments, array('method', 'amount', 'currency')),
                array(array('method' => 'Cheque', 'amount' => 50, 'currency' => 'USD'),
                      array('method' => 'Cheque', 'amount' => 200, 'currency' => 'GBP')));
    }
    ...
}
...
There are about five test cases in the file, each with about a half dozen to a dozen integration tests like the above. The accounts model has about a dozen such test scripts.

The lines starting with...

Code: Select all

$this->accounts()->...
...create the AccountsModel class with the correct DB driver. This and the justFields() method exist in the AccountsTestCase.

This is mostly a DB driven app. (the AccountsModel methods map to stored procs), and so testing parts of result sets is common. the justFields() methods extracts a slice for testing to avoid making the test dependent on the rest of the data. This is what I mean by avoiding over testing.

The entire DB schema is replaced on every test. This action lives in the AccountsTestCase too. Setting the dat is necessary because the product line can change in the future and so invalidate current tests.

This was the longest test method I could find in the entire Accounts module.
Well, I guess I can't complain then. I like immediate feedback, so I'm literally running test-cases after a one or two line change.
Me too. I'll run just the current test file for each edit (10 secs), the suite for the module often (1-3 mins) and all_tests.php (20 mins+) before a check in.
How does one know when a test is overspecified? On one hand, you want to make sure all the functionality is tested, on the other hand, you don't want the test to be fragile.
Right. You have to only test for the feature you are checking and nothing else.

For example, if you want to assert that the persons name appears in the title bar of a web page, then this is sufficient...

Code: Select all

$this->assertTitle(new PatternExpectation('/fred/i'));
...rather than...

Code: Select all

$this->assertTitle('My application - welcome Fred!');
In effect, stop worrying about the cascading error rushes!
Oh no, definitely worry about that :). Just don't make it the fault of the test suite. I feel that testing each feature in isolation trumps testing each class in isolation for this reason.

Say you are testing a HTML tag stripper...

Code: Select all

function assertStrippedAs($html, $text) {
    $stripper = new HtmlSaxParser(new TextOnlyBuilder());
    $this->assertEqual($parser->parse($html), $text);
}

function asserPreserved($text) {
    $this->assertStrippedAs($text, $text);
}

function testEmptyDocumentPreserved() {
    $this->assertPreserved('');
}

function testWhenNoTagsThenDocumentPreserved() {
    $this->assertPreserved('Just text');
}

function testSingleLineCommentStripped() {
    $this->assertStrippedAs('<!-- junk -->', '');
    $this->assertStrippedAs('Hi<!-- junk -->', 'Hi');
    $this->assertStrippedAs('<!-- junk -->There', 'There');
}

function testSingleLineCommentStrippedAndSpaceAddedIfNeeded() {
    $this->assertStrippedAs('Hi<!-- junk -->There', 'Hi There');
    $this->assertStrippedAs('Hi <!-- junk -->There', 'Hi There');
    $this->assertStrippedAs('Hi<!-- junk --> There', 'Hi There');
    $this->assertStrippedAs('Hi<!-- junk --> There', 'Hi  There');
}
...
If something fundamental fails, then everything will fail. That's not something you would check in though, and it's obvious you just broke it. Subtler errors will be caused by people trying to change your code. They may still get quite a few failures, but fewer if they were just modifying an existing feature.

Whilst changing the mechanics, it's probably best to have a small unit test that tests just one tag and run only that. That small test can have the ithe internal mocking and the like. That way the mechanics test are factored out from the semantic ones.

yours, Marcus
Post Reply