[cfe-dev] scan-build man page

Mon May 7 14:36:33 PDT 2012

On Mon, 07 May 2012 10:21:01 -0700
Ted Kremenek <kremenek at apple.com> wrote:

Hi Ted, 

> The man page looks really great.  

Thanks.  I'm glad to start a conversation with it.  

> My main concern, however, is
> supporting divergent documents.  Ideally we want the documentation
> for scan-build, the man page, etc., to all be in sync.  

Having maintained a 90-page user guide for 10 years, I understand that
concern.  I want to suggest, though, that "keeping it in sync" is not
as big a deal as most people think.  Documentation is not very redundant
because it's labor-intensive, and labor, as you know, is scarce.
Therefore overlap is inherently self-limiting.  

Documentation is also not amenable to automation.  We want it to be; we
want the code to be self-documenting.  And good code (by definition)
is self-documenting.  But what we want from documentation isn't *in* the
code (or shouldn't be).  It has to be written.  Asking for
self-generating documentation is a bit like asking for self-generating
code.  

That said, the most tedious part of maintaining a reference manual is
documenting all the options and ensuring 1) all options are documented
and 2) all documented options are implemented.  Keeping the man page
synchronized with the implementation can, in principle, be automated.
But it doesn't follow that the man page must therefore be generated,
even in part; it might be better simply to have a system that compared
the two and reported on the differences.  For a small number of man
pages, that "system" might be just eyeballs, and some vigilance when
options are added/deleted.  
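
As a rough illustration of that kind of comparison, here is a minimal
Python sketch.  It rests on assumptions of mine, not on anything in
scan-build itself: that each option begins a line of the help text,
and that the man page marks options with the mdoc .Fl macro at the
start of a line (optionally after .It).  The function names are
hypothetical.

```python
import re

# Assumption: an option starts a help-text line, e.g. " -o <dir>  ...".
HELP_OPTION = re.compile(r"^[ \t]*(-{1,2}[A-Za-z][\w-]*)", re.MULTILINE)

# Assumption: the man page tags options as ".Fl name" or ".It Fl name".
# mdoc's .Fl supplies one leading dash, so "-use-cc" means "--use-cc".
MDOC_OPTION = re.compile(r"^\.(?:It[ \t]+)?Fl[ \t]+([\w-]+)", re.MULTILINE)


def options_in_help(help_text):
    """Return the set of option names that begin a line of help text."""
    return set(HELP_OPTION.findall(help_text))


def options_in_mdoc(man_source):
    """Return the set of option names marked with .Fl in mdoc source."""
    return {"-" + name for name in MDOC_OPTION.findall(man_source)}


def compare_options(help_text, man_source):
    """Report options missing from either side.

    Returns (undocumented, unimplemented): options the program lists
    but the man page lacks, and options the man page lists but the
    program does not.
    """
    implemented = options_in_help(help_text)
    documented = options_in_mdoc(man_source)
    return sorted(implemented - documented), sorted(documented - implemented)
```

Fed the program's help output and the man page source, it returns two
lists to review by hand.  It only reports; it generates nothing.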

> My main concern about having a separate man page file is that someone
> is now responsible for keeping it up-to-date.  

True.  Documentation even introduces bugs, because undocumented
functionality never contradicted observed behavior.  ;-)

> It's a bit of engineering, but I'd prefer we go in a direction where
> the man page and the scan-build documentation on the website (or at
> least part of it) were machine generated from some common description

I would like to convince you that's both unnecessary and infeasible.  

First, it's an optimization of labor with labor, right?  And the rule
for optimization is to measure first.  How much do the man page and the
website have in common?  I don't see much, nor need for more.  

Even the obvious overlap -- command-line options -- doesn't warrant
wholesale duplication.  A guide properly presents some of the
options in an order chosen for ease of learning.  Rather than interrupt
the text with an exhaustive list of every option and its synonym, a
guide serves the user better by referring him to the reference manual
for complete details.  As soon as you're selecting *some* options in a
pedagogical order, you might as well just do it by hand.  The time
spent getting that information into a back-end database and building
the integration system will never be repaid.  

Keep in mind Vint Cerf's dictum, too.  If you reject documentation
because it's not in the "right" form, you restrict the number of
contributors.  Not everyone willing and able to document will be
interested in learning a specialized technology to do so.  

> It sounds like you had a fairly mechanical process for generating the
> man page (you took scan-build's output and manually post processed
> it).  Do you think we could automate this with a script, so the man
> page could just be a product of the build?  

Mechanical, yes, but it can't be automated.  The text from scan-build
lacks the very markers I added: the headings, that "model" is an
argument, that [=title] is optional, and so on.  

In principle the text could remain embedded in scan-build and extracted
to generate a man page.  But you'd have to re-invent half of -mdoc in
the process, without any improvement in the outcome.  Perldoc is a
good example of that kind of reinvention.  

> Alternatively, since scan-build generates most of this text, maybe it
> could generate the options part of the man page itself (as an option
> to its output format), and have that output concatenated with some
> common preamble.  What do you think?

I would remove the help text from scan-build.  You don't need it
anymore.  "man scan-build" is easier to use, and everyone knows how.
Maintaining the man page is trivial next to the effort that goes into
the scanner.  

Besides, I'm allergic to so-called "help" that scrolls off my screen
and destroys the 23 lines of context I had before it took over.  Be
warned: for reasons science has been totally unable to explain, that
allergy has been observed to be contagious.  I think I got it from
Subversion.  

Someone will be tempted to suggest that if the documentation is amid
the code, the programmer will be more likely to keep it up to date.
That proposition is contradicted by vast collective experience.  We both
know huge projects with enviable (never perfect) documentation
maintained as man pages.  I don't know of any that owe their
documentation to how easy it is to maintain.  

Again: I do think that reference documentation can and should be
synchronized with the code.  Better tools could make that more
convenient than anything available today by reducing redundancy, by not
requiring, for instance, that function and argument names be restated.
Clang promises to make that possible for C++.  It's one of the reasons
I'm interested.  

Regards, 

--jkl