[cfe-dev] [PATCH] LibTooling docs

Sat Apr 21 19:48:18 PDT 2012

On Sat, 21 Apr 2012 10:42:22 -0700
David Blaikie <dblaikie at gmail.com> wrote:

> On Sat, Apr 21, 2012 at 9:40 AM, James K. Lowden
> <jklowden at schemamania.org> wrote:
> 
> Hi James,

Hi David, 

I appreciate the time you took to address what I was saying.  

I may have misunderstood your post.  I thought you were suggesting a
general purpose tool and a general policy.  I'm not sure I want to
comment on one particular document.  I have my reservations about it,
but I don't want (my version of) the perfect to be the enemy of the
good.  

I also don't want to discourage the writing of documentation. As a
newcomer, I'm suffering from the dearth of documentation!  

If my ideas appeal to you, maybe they can save you some work.  If not,
well, the guy doing the work always gets priority over the guy
kibitzing.  

> Judging this particular piece of documentation - it includes a
> character for character copy of code checked in & building in the
> Clang source tree. What scenario would exist where there would be any
> desire for this code to diverge, let alone in a way where the
> divergence led to the documentation version becoming invalid?

I wouldn't necessarily copy the code, and I wouldn't try to make every
code snippet compilable in situ.  Why not?  To let the documentation
omit lines unimportant to the point being made.  

For example, the place to document which header files are needed is in
the reference manual.  In a document of any size, #include directives
and error-handling code become a distraction from the main event.  

Similarly, the comments in the code can't serve two masters.  In
particular, tutorial comments are distracting and tiresome.  In-line
comments are best kept minimal; the code should speak for itself.
Comments should be used to draw attention to some important aspect,
perhaps an assumption about state or preconditions.  

(I realize not everyone agrees with these ideas.  There's nothing that
*can* be said on the subject that would meet with universal agreement.
But I can point you to programmers we both admire who say the same
thing.)  

How, then, to bring the code into the document, if not by copying it?
And how to verify it's right?  Scripts, naturally.  sed to extract the
lines from the code, and make(1) to merge them into the document
template, merge them into a compilable template, and compile the
result.  
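
A minimal sketch of that pipeline, with Python standing in for sed and
make.  The "// doc-begin NAME" and "// doc-end NAME" markers and the
"@snippet NAME@" placeholder are conventions invented for this
illustration; nothing like them is assumed to exist in the Clang tree.

```python
import re

# Hypothetical convention: quotable regions in the source are bracketed by
# "// doc-begin NAME" and "// doc-end NAME" comment lines.
def extract_snippet(source, name):
    """Return the source lines between the begin/end markers for `name`."""
    begin = "// doc-begin " + name
    end = "// doc-end " + name
    collected = []
    inside = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped == begin:
            inside = True
        elif stripped == end and inside:
            return "\n".join(collected)
        elif inside:
            collected.append(line)
    raise KeyError("no complete snippet named " + name)

# Hypothetical placeholder syntax: "@snippet NAME@" in a template.
def merge(template, source):
    """Replace every @snippet NAME@ placeholder with the extracted lines."""
    return re.sub(
        r"@snippet (\w+)@",
        lambda match: extract_snippet(source, match.group(1)),
        template,
    )

# Example inputs: a tiny C file and a one-paragraph document template.
SOURCE = """#include <stdio.h>
int main(void) {
    // doc-begin greet
    puts("hello");
    // doc-end greet
    return 0;
}
"""

DOC_TEMPLATE = "To greet the user:\n\n@snippet greet@\n"
```

Here merge(DOC_TEMPLATE, SOURCE) yields the document text with only
the puts() line quoted, the #include and return left out.  The same
merge() call against a compilable template would produce the file to
hand to the compiler.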

The scripts aren't free, of course.  But neither is copying and pasting
the code whenever it's updated.  

> In both cases the surrounding documentation may still
> become incorrect & need to be fixed at some point.

Yes.  I think English prose surrounding the quoted code becoming
outdated, and thereby misleading, is a far bigger problem than the
quoted code failing to compile.  

> Though, like doxygen, I think having the documentation closer to the
> sample wouldn't be a bad thing - it at least increases the
> chances/opportunity of updating the documentation

So they say.  Having done both, and having met up with a lot of awful
doxygen output (as I bet you have, too), I'd say the evidence is
against. Unix has a long and successful history of putting the man
pages in the same directory as the source code.  

I don't quite understand the theory that documentation doesn't get
written unless you rub the programmer's nose in it.  People who don't
like to do it don't, no matter what form it takes.  People who do
will.  

> > _With a bit of sed(1) the quoted
> > lines can be compared to *any* other revision in the repository, at
> > any time.

(Something in the ML software seems to be replacing "extra"
spaces with underscores.  Would you happen to know if that's considered
a feature?)  

> why would comparing
> the (checked in) documentation to arbitrary revisions be helpful?

To see if it's correct now, fsvo "now".  To see if the documentation
from the last release still holds for the RC we just branched, for
example.  

Documentation is correct as of version N.  Code continues to version 
N+1, N+2, ... N+n.  At some point, document and code diverge.  You
are saying that it must be fixed right then and there.  I'm saying that
to work efficiently often means that the documentation and code are out
of synch for a little while.  
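
To make the N versus N+n comparison concrete, here is a rough sketch
in Python.  It assumes the text of each revision has already been
fetched from the repository by whatever means; the list of (label,
text) pairs and both function names are hypothetical.

```python
def still_present(quoted, source):
    """True if the quoted lines appear as one contiguous run in source.

    Leading and trailing whitespace and blank lines are ignored, so a
    re-indented snippet still counts as a match.
    """
    want = [line.strip() for line in quoted.splitlines() if line.strip()]
    have = [line.strip() for line in source.splitlines() if line.strip()]
    width = len(want)
    if width == 0:
        return False
    return any(
        have[start:start + width] == want
        for start in range(len(have) - width + 1)
    )

def first_divergence(quoted, revisions):
    """Return the label of the first revision that no longer contains
    the quoted lines, or None if every revision still does.

    `revisions` is an ordered list of (label, source_text) pairs, an
    in-memory stand-in for checking out each revision.
    """
    for label, source in revisions:
        if not still_present(quoted, source):
            return label
    return None
```

Nothing here forces a fix at the moment of divergence; it only says
which revision the documentation stopped matching.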

> Clang documentation (even the live docs on the website for all of
> LLVM) are generated in commit hooks - they're generally intended to be
> kept totally in sync with the development work that is occurring.

Clang just released 3.1.  Perhaps today all its documentation is in
tip-top shape with every commit.  That's great!  In general, releases
involve a whole set of related changes, some deprecated ideas and some
new ones. With time, experience shows what's confusing and what didn't
work well.  As the number of documents grows, so too does their
potential to contradict each other.  

For these reasons I suggest documentation review is indeed
part of preparing a release, and that editing (and maintaining
something of a single voice) is harder than making sure the
code samples the documents contain still compile.  

But I accept some of this is cultural.  I watched the release happen,
and I didn't notice anyone say, "Doc review!"  

> Perhaps you're thinking of a more formal product release cycle

I've never been paid to manage a release cycle or to write
official documentation.  I've just found in my own work that
pre-release tends to be the time to scrutinize the docs.  

And, yes, I agree documentation is normally written in arrears.  That's
probably just as well.  As Moltke is usually paraphrased, "No plan
survives first contact with the enemy."  

> There's no reason the source code can't have test cases. 
...
> Testing doesn't always fall to humans

Sure, granted.  I was thinking more about how failures get rectified.  

> > Good documentation can't be forced or enforced. _It's the
> > product of conscientious work, nothing more and nothing less.
> 
> I don't think David's suggestion magically enforces good
> documentation, but it helps humans focus on the areas that machines
> can't (currently) help with. 

That depends on where you think the humans need help focussing.  For
me, coding and documentation are two loosely coupled processes
requiring different parts of the brain.  At time of commit -- and
definitely for any compilation -- I'm most concerned with things
technical.  Just because I've added a new parameter to a function
doesn't mean I have to touch up the documentation right now, or that I
can, or that I can do it well because it's 2:00 AM and I'm relieved
just to have checked in the code.  

Seeing them as loosely coupled naturally leads in the direction of
"trust but verify": a system that both tolerates out-of-date
documentation and can track what needs to be updated.  
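
One way such a "trust but verify" tracker might look, sketched in
Python.  The idea of recording a fingerprint of the code at the time a
document was last reviewed is an assumption made for illustration, as
are the names fingerprint() and stale_docs().

```python
import hashlib

def fingerprint(code):
    """Hash the code, ignoring trailing whitespace and outer blank lines,
    so purely cosmetic edits do not flag a document as stale."""
    normalized = "\n".join(
        line.rstrip() for line in code.strip().splitlines()
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def stale_docs(reviewed, current):
    """List the documents whose underlying code changed since review.

    reviewed: dict of document name -> fingerprint recorded at last review
    current:  dict of document name -> the code as it stands now
    A document whose code is missing from `current` is reported as stale.
    """
    return sorted(
        name
        for name, recorded in reviewed.items()
        if fingerprint(current.get(name, "")) != recorded
    )
```

The report is advisory: a commit is never rejected, and the list of
stale documents simply waits for whoever does the next review.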

But don't let me stand in your way (not that you were about to).  As I
said at the outset, I offer my perspective only to try to save you
from going down what I see as a blind alley.  I was once a doxygen
fan, and I'm as big a fanatic of automating drudgery as you'll find.
The problem is, there are no shortcuts.  The real "test" of
documentation, we agree, is in its usability, in what it communicates.
Automation doesn't have much to offer to that end beyond the routine
work of lacing it together.  

Regards, 

--jkl