Unit testing HTML without oodles of regexps

Posted: Mon Dec 26, 2005 9:16 pm
by Ambush Commander
So I have an HTMLPrinter that I'd like to unit test, but the problem is that the output it gives is tremendously complex, and any character-for-character equality test would be an extremely fragile test case. Of course, you can try regexps, but that doesn't seem... well... precise enough. Is there any alternative, or am I stuck with, to paraphrase, "oodles of regexps"?

Posted: Tue Dec 27, 2005 9:35 am
by dbevfat
What exactly does your HTMLPrinter do and what are you trying to test?

Posted: Tue Dec 27, 2005 6:00 pm
by Ambush Commander
It, as the name suggests, takes a story object and returns an HTML string that would be put on the main page. There's quite a bit of presentation logic going on in it.

Posted: Wed Feb 01, 2006 1:04 am
by Christopher
Did you ever find an alternative to "oodles of regexps"? Could you give an example so I can see more clearly how you are trying to test the output of your HTMLPrinter?

Posted: Wed Feb 01, 2006 7:51 pm
by Ambush Commander
Nope, never thought of a solution.

However, it occurs to me that the biggest problem is disregarding non-meaningful whitespace. You could run a few "trimming" functions over both the expected and the actual output to make the two "essentially" the same, and then compare them with a normal equality check. In a related manner, you could do some very minor processing to turn the expected output into a regexp as painlessly as possible. I don't see a way out beyond that, though.
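A rough sketch of that "trimming" idea, in Python purely for illustration (the thread itself is about PHP, and the function name `normalize_html_whitespace` is made up for this example). It assumes whitespace between adjacent tags never matters for the output under test, which is not true for inline markup in general:

```python
import re

def normalize_html_whitespace(html):
    """Hypothetical helper: reduce HTML to a whitespace-insensitive form."""
    # Collapse every run of spaces, tabs and newlines into one space.
    html = re.sub(r"\s+", " ", html)
    # Drop whitespace that sits directly between two tags.
    # Assumption: such whitespace is not meaningful for this output.
    html = re.sub(r">\s+<", "><", html)
    return html.strip()

expected = """
<div>
  <h1>Title</h1>
  <p>Some   text</p>
</div>
"""
actual = "<div><h1>Title</h1><p>Some text</p></div>"

# Both sides go through the same normalization before comparison.
assert normalize_html_whitespace(expected) == normalize_html_whitespace(actual)
```

The expected string can then be written with readable indentation in the test, while the comparison stays an ordinary equality check.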

Posted: Wed Feb 01, 2006 11:15 pm
by Christopher
I don't think unit testers are the best way to validate the HTML. Perhaps you could run the HTML output of the Printer class through an HTML validator and then use the unit tester to examine the output of the validator.

Posted: Wed Feb 01, 2006 11:18 pm
by Ambush Commander
Rare are Printer outputs valid HTML documents. ;-) Did you mean HTML parser?
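For what it's worth, the parser route could look something like the sketch below. It is in Python with the standard library's `html.parser` only as an illustration (not the PHP setup discussed here), and `StructureCollector` and `html_structure` are hypothetical names. The idea is to compare the sequence of tags, attributes and text instead of the raw string, so attribute order and insignificant whitespace stop mattering:

```python
from html.parser import HTMLParser

class StructureCollector(HTMLParser):
    """Hypothetical helper: records tags, attributes and text as events."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Sort attributes by name so their order in the source is ignored.
        ordered = tuple(sorted(attrs, key=lambda pair: pair[0]))
        self.events.append(("start", tag, ordered))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Collapse whitespace; skip text nodes that are whitespace only.
        text = " ".join(data.split())
        if text:
            self.events.append(("text", text))

def html_structure(fragment):
    """Return the event list for an HTML fragment (need not be a full document)."""
    collector = StructureCollector()
    collector.feed(fragment)
    collector.close()
    return collector.events

a = '<div class="story" id="s1"><h2>Title</h2>\n  <p>Body   text</p></div>'
b = "<div id='s1' class='story'>\n<h2>Title</h2><p>Body text</p>\n</div>"
assert html_structure(a) == html_structure(b)
```

This works on fragments, so the Printer output does not have to be a valid HTML document on its own.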

Posted: Wed Feb 01, 2006 11:37 pm
by JPlush76
Personally, I don't see the need to actually validate the HTML you're outputting with unit tests because, as you say, it's very fragile. The goal of a unit test is to make sure the code functions properly with both proper and improper input.

So with that said, what I would do is make sure that my HTML printer object validates any parameters passed to it and responds properly to bad parameters, and also mock it out to test that the methods are being called the appropriate number of times. I would save the actual data validation for a web acceptance test using SimpleTest's web tester functionality.

Having not really seen your code: if you have things like function addTitle() or function addFooter(), those things that are constant should be able to be validated.
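As a loose illustration of testing one small piece at a time, here is a Python sketch. The `StoryPrinter` class and its `add_title` method are entirely hypothetical stand-ins, since the real HTMLPrinter code was never posted:

```python
import html

class StoryPrinter:
    """Hypothetical stand-in for the HTMLPrinter discussed in this thread."""

    def add_title(self, title):
        # Reject bad parameters up front instead of emitting broken markup.
        if not isinstance(title, str) or not title.strip():
            raise ValueError("title must be a non-empty string")
        return "<h1>" + html.escape(title.strip()) + "</h1>"

def check_add_title():
    printer = StoryPrinter()
    # Proper input: a small, stable piece of output is safe to compare exactly.
    assert printer.add_title("Hello") == "<h1>Hello</h1>"
    # Special characters should come out escaped.
    assert printer.add_title("a < b") == "<h1>a &lt; b</h1>"
    # Improper input: the method should refuse it.
    try:
        printer.add_title("")
    except ValueError:
        pass
    else:
        raise AssertionError("empty title was accepted")

check_add_title()
```

Exact string comparison is much less fragile at this size than it is for the whole page.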

good luck

Posted: Thu Feb 02, 2006 1:08 am
by Christopher
Ambush Commander wrote: Rare are Printer outputs valid HTML documents. ;-) Did you mean HTML parser?
I don't know if I said it well, but my point was that there may be other programs better suited to analyzing your complex output. If the HTML lexer for the web tester you are using is not up to the job, then use a more sophisticated tool and check its output. Like JPlush76, I would need to see what the code and output look like.