Documentation generator - project proposal
Posted: Wed Oct 21, 2009 11:55 am
I have this thing about keeping methods really terse: rarely over 25 SLOC, and on average they probably fall around 10-15 lines. I find this keeps code self-describing, despite the increased indirection. For this reason, I absolutely despise writing docs inline using phpDocumentor; comments for this purpose, I find, convolute source code which for the most part is already self-describing.
Just following method calls will quickly give you an understanding of the interactions of a sub-system.
I do comment, however I keep the comments very brief -- no more than a line.
1. Comment caveats
2. Comment side effects
The latter I try to reduce/minimize by keeping methods/objects as stateless as possible -- a practice I picked up from reading up on functional programming.
There are still situations where a few comments are needed though, just to help clarify a section of 3-4 lines which do something complex but do not warrant being refactored into a method.
In the snippet below, I would find the two lines confusing without the comments; although nothing in them is likely to change, the high-level commentary quickly tells me what the lines are doing. Entirely subjective, I know, but documenting using comments is an art.
Code:
// NOTE:
// 1. Prefix with request scheme and replace domain placeholder with request domain and trim trailing slash (if any)
// 2. Extract all named markers/placeholders
$format = trim(sprintf('%s://%s', REQUEST_SCHEME, str_replace('{*}', REQUEST_DOMAIN, $format)), '/');
preg_match_all('/\{([a-z]+)\}/', $format, $markers);
Those are literally the only comments you will see in my code; no phpDocumentor, etc.
If I need a listing of classes, methods, etc I run Doxygen which extracts that info from the source code.
---
Here is what I am thinking (and proposing for a new project).
Ideally I would love to keep documentation external to the source (despite decades of others preaching that this is bad practice):
1. I find that coming back to code at a later time and documenting it produces much more elaborate and accurate docs. Writing comments while trying to solve a problem tends to result in quick descriptions which make sense at the time of writing but don't translate well when read by a newcomer to the codebase. Whereas documenting code you have to step through to understand forces you to comment from the perspective of a naive developer.
What I am thinking is having the documentation written in a markup such as Markdown, which can be easily converted into HTML, maybe PDF, etc.
Each class would map to its own folder and each method to its own Markdown file, using the file system structure to resemble something of a table of contents, so full PDF documentation, or perhaps HTML output, can be easily built.
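To make the folder layout concrete, here is a rough sketch of how a class and method name could be turned into a doc file path. The function name docPathFor(), the "docs" root folder and the ".md" extension are all my assumptions for illustration; nothing is settled.

```php
<?php
// Hypothetical helper: maps a class and a method name to a Markdown file path.
// The default "docs" root and the ".md" extension are assumptions.
function docPathFor(string $className, string $methodName, string $docRoot = 'docs'): string
{
    // Namespaced classes (Foo\Bar) become nested folders (Foo/Bar).
    $classDir = str_replace('\\', '/', ltrim($className, '\\'));

    return sprintf('%s/%s/%s.md', rtrim($docRoot, '/'), $classDir, $methodName);
}

echo docPathFor('Router', 'buildUrl'), PHP_EOL;
```

With a layout like this, walking the docs folder in sorted order would give the table of contents more or less for free.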
Having external documentation raises one interesting problem: synchronization between docs and interfaces. (Implementation docs I avoid like the plague. My reasoning is, if you're tinkering with implementation, you will need to step through the code line by line and figure it out manually anyways, so a gentle sprinkling of line comments should suffice with self-describing code.)
What I am thinking is, implement a parser to scan the source tree of a project, particularly the methods. If a method's signature or implementation changes, the parser would raise a red flag so the next person to visit the doc manager would be notified and could synchronize the docs to the new implementation. This solution raises a couple of interesting problems:
1. Whitespace. Changes made to whitespace should not raise flags
2. Variable naming should not raise red flags
Only interface changes, structural changes, refactorings, etc. should notify the documentation writer that the docs may not accurately reflect the new implementation.
One might refactor a method (factoring out a section of code) and introduce another private method, but not change the purpose of the method at all, in which case it would be up to the doc writer to determine whether the docs need updating.
The important thing is devising a system so that changes made to any method (other than whitespace and variable name changes) raise a red flag, perhaps by comparing the method's interface/implementation to a previous MD5 (computed with whitespace removed and all variables renamed to a single name).
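A minimal sketch of that MD5 idea, assuming PHP's built-in tokenizer (token_get_all) is good enough for the job. The function name methodFingerprint() and the '$v' placeholder are made up for illustration; I also chose to ignore comments, which is an extra assumption on top of the whitespace and variable name rules.

```php
<?php
// Sketch: fingerprint a chunk of PHP source so that whitespace, comments and
// variable names do not affect the hash. methodFingerprint() is a hypothetical name.
function methodFingerprint(string $methodSource): string
{
    // token_get_all() needs an opening tag to tokenize the text as PHP code.
    $tokens = token_get_all('<?php ' . $methodSource);
    $normalized = [];

    foreach ($tokens as $token) {
        if (!is_array($token)) {
            // Single-character tokens such as "{", ";" or "=" arrive as plain strings.
            $normalized[] = $token;
            continue;
        }

        [$id, $text] = $token;

        if ($id === T_OPEN_TAG || $id === T_WHITESPACE || $id === T_COMMENT || $id === T_DOC_COMMENT) {
            continue; // Layout and comments should never raise a flag.
        }

        // Every variable collapses to one placeholder name.
        $normalized[] = ($id === T_VARIABLE) ? '$v' : $text;
    }

    // A separator keeps adjacent tokens from merging into a different token.
    return md5(implode(' ', $normalized));
}

$before  = 'function add($a, $b) { return $a + $b; }';
$renamed = "function add(\$x,\n    \$y)\n{\n    // sum\n    return \$x + \$y;\n}";
$changed = 'function add($a, $b) { return $a - $b; }';

var_dump(methodFingerprint($before) === methodFingerprint($renamed)); // same structure
var_dump(methodFingerprint($before) === methodFingerprint($changed)); // operator changed
```

One trade-off worth noting: because every variable becomes the same placeholder, swapping two variables (say $a - $b into $b - $a) would not change the hash. Numbering variables by order of first appearance might catch that, at the cost of a slightly more involved normalizer.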
I find it so time consuming writing docs inside the phpDoc blocks, not having any WYSIWYG or similar editor, not to mention the convoluted feeling I get when sifting through dozens and dozens of lines of comments -- which many times are equally unsynchronized with the intent of the source code.
I'm thinking build the system as a series of CLI scripts (no framework, no MVC, just input-process-output) and later build a web-based interface, or possibly invoke a master CLI script through Eclipse to generate HTML or PDF documentation.
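As a rough idea of what one of those input-process-output scripts might look like, here is a sketch that compares previously stored fingerprints against freshly computed ones and prints the red flags. The function name staleDocs(), the "Class::method" key format, the JSON manifest and the three status labels are all assumptions on my part.

```php
<?php
// Sketch of the "process" step: compare stored fingerprints with fresh ones.
// Both arrays map "Class::method" to a hash string; that key format is an assumption.
function staleDocs(array $stored, array $current): array
{
    $flags = [];

    foreach ($current as $method => $hash) {
        if (!array_key_exists($method, $stored)) {
            $flags[$method] = 'undocumented'; // New method, no docs yet.
        } elseif ($stored[$method] !== $hash) {
            $flags[$method] = 'changed';      // Docs may no longer match the code.
        }
    }

    foreach (array_diff_key($stored, $current) as $method => $hash) {
        $flags[$method] = 'removed';          // Docs exist for a method that is gone.
    }

    ksort($flags);

    return $flags;
}

// Input: in a real script these would come from a saved manifest and a fresh scan.
$stored  = json_decode('{"Router::match":"aaa","Router::buildUrl":"bbb"}', true);
$current = ['Router::match' => 'aaa', 'Router::buildUrl' => 'ccc', 'Router::reset' => 'ddd'];

// Output: one line per flagged method.
foreach (staleDocs($stored, $current) as $method => $status) {
    echo $status, "\t", $method, PHP_EOL;
}
```

Keeping the comparison as a plain function on arrays means the same logic could later sit behind a web interface or an Eclipse-invoked master script without changes.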
What do you think of this idea? Would you be interested in possibly collaborating a few hours a week to begin implementing something prototypical, refactoring as we go until we have something concrete?
Cheers,
Alex