[PATCH] Documentation parsing: allow some commands to have multiple paragraphs attached

Sat Nov 23 04:35:50 PST 2013

Hi Dmitri,

On 23 Nov 2013, at 4:46 , Dmitri Gribenko <gribozavr at gmail.com> wrote:
> Hello,
> 
> Fariborz and I would like to propose a change to our current comment parsing
> model to allow multi-paragraph parameter and return value descriptions.
> 
> Please take a look and tell us what you think.

Doxygen works differently from what you propose.
It has commands with "top level" scope, such as \param and \returns.
These commands:
- automatically end a brief description
- stop at the end of the paragraph (if the command has paragraph scope), 
  or at the next command with top level scope, whichever comes first.
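
As a rough illustration of the rule above, here is a toy model in C++. It is
only a sketch, not Doxygen's actual implementation: the Token type, the Kind
enum and the attachedText function are names made up for this example, and the
command is assumed to have paragraph scope.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for a toy comment stream (not Doxygen's data model).
enum class Kind { Text, BlankLine, TopLevelCommand };

struct Token {
    Kind kind;
    std::string text;  // line of text, or the command name and argument
};

// Returns the text lines attached to the top-level command at index `start`.
// Collection stops at the first blank line (end of the paragraph) or at the
// next top-level command, whichever comes first.
std::vector<std::string> attachedText(const std::vector<Token>& toks,
                                      std::size_t start) {
    assert(start < toks.size());
    assert(toks[start].kind == Kind::TopLevelCommand);
    std::vector<std::string> out;
    for (std::size_t i = start + 1; i < toks.size(); ++i) {
        if (toks[i].kind != Kind::Text) break;  // blank line or next command
        out.push_back(toks[i].text);
    }
    return out;
}
```

In this model, text that follows a blank line is no longer attached to the
command; it falls back to the enclosing (detailed) description.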

I've recently introduced \parblock .. \endparblock to deal with the
case where a user actually wants to write multiple paragraphs at a place
where a single paragraph is expected.

> 
> Motivation
> ==========
> 
> Case 1
> ------
> 
> /// \param x1 Aaa.
> /// Aaa. Aaa.
> ///
> /// Bbb.
> ///
> /// \param x2 Ccc.
> void doSomething(int x1, int x2);
> 
> In this case, the user most likely intended "Bbb" to be a second paragraph of
> the \param x1.  But our current parsing model only allows a single paragraph to
> be attached to a block command, so we treat "Bbb" as a part of *function
> description*.  Because this is the first paragraph of the function description,
> "Bbb" also becomes the brief description.

Doxygen treats Bbb as part of the detailed description, since the first \param
has already ended the brief description. Bbb is not part of the \param's
documentation, since it is in the next paragraph. To get the behaviour you
describe, a user should write:

/// \param x1
/// \parblock
/// Aaa. Aaa. Aaa.
///
/// Bbb.
/// \endparblock
/// \param x2 Ccc.

> 
> Case 2
> ------
> 
> /// \returns
> /// \li Foo, or
> /// \li EnchancedFoo.
> Foo *makeFoo();
> 
> In this case, \returns and \li are block commands, so \returns has an empty
> paragraph, and \li points are separate from \returns and become a part of the
> function description.  Furthermore, "Foo, or" becomes the brief function
> description.
> 
> Proposed change to parsing model
> ================================
> 
> \param, \tparam and \returns commands consume paragraphs and block commands
> until we hit a command that is only allowed to appear at the top level.  For
> example:
> 
> /// \returns Either:
> /// \li Foo, or
> /// \li EnchancedFoo.
> /// \param isEnchanced Aaa.
> Foo *makeFoo(bool isEnchanced);
> 
> Everything starting from "Either:" until "\param" -- one paragraph and two \li
> commands -- become child nodes of the \returns command.  Because \param is a
> top-level-only command, we stop attaching children to \returns and return to
> top-level at that point.
> 
> Right now I have identified that it makes sense to allow only \li, \arg (alias
> of \li), and \verbatim-like commands to be nested within other commands.  All
> other block commands are top-level-only.

Doxygen's \li command does not have "top level" scope, so it can indeed
be nested inside a \param or \returns.

The command does have paragraph scope, so it ends at the next paragraph.

Example:
/// \returns Either
/// \li First item
/// \li Second item
///
/// This text ends the list but not the returns section
///
/// Top level text continues outside of returns

Note that for automatic lists the indentation of the paragraph 
determines the end of a list item:

Example:
/// A list:
/// - item 1
///   - sub item 1
///   - sub item 2
///
///     text of sub item 2 continues...
///
///   text of item 1 continues...
/// - item 2
///
///   More text for item 2.
///
/// Text after the list
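
The indentation rule in the list example above can be modelled roughly as
follows. This is a simplified sketch in C++, not Doxygen's real algorithm:
owningItem is a hypothetical helper, and columns are counted from the first
character after the "/// " prefix.

```cpp
#include <cassert>
#include <vector>

// `markerIndents` holds the column of the '-' marker of each list item that
// is still open, outermost first (so the values are strictly increasing).
// Returns the index of the innermost open item that a paragraph starting at
// column `paragraphIndent` continues, or -1 if the paragraph ends the list.
int owningItem(const std::vector<int>& markerIndents, int paragraphIndent) {
    for (int i = static_cast<int>(markerIndents.size()) - 1; i >= 0; --i) {
        assert(i == 0 || markerIndents[i - 1] < markerIndents[i]);
        // A paragraph continues an item only if it is indented deeper
        // than that item's marker.
        if (markerIndents[i] < paragraphIndent) return i;
    }
    return -1;
}
```

With item 1 at column 0 and sub item 2 at column 2, a paragraph at column 4
continues sub item 2, a paragraph at column 2 continues item 1, and a
paragraph at column 0 ends the list.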

> 
> What comments will parse differently
> ====================================
> 
> Comments where the user placed the long description after parameter or return
> value description will parse differently.  For example:
> 
> /// \param x1 Aaa.
> ///
> /// This function does...
> void foo(int x1);
> 
> "This function does..." used to be the brief description; now it is the second
> paragraph of the parameter description.
> 
> One can get the previous behavior again by using explicit \brief or \details
> commands, depending on the intent:
> 
> /// \param x1 Aaa.
> ///
> /// \brief This function does...
> 
> How Doxygen handles this
> ========================
> 
> As far as I see, in its output, Doxygen preserves the sequence of paragraphs,
> and it also does not try to assign semantic meaning to paragraphs.  Because of
> this, Doxygen will not hit any issues regardless of whether it uses Clang's
> original parsing model or this proposed model -- it does not make a difference
> for the output that Doxygen produces.

I think I've explained above that it does differ. The proposed change would
make it harder for users to write documentation that works well with both
clang and doxygen. So I hope it is possible to make the implementation more
in line with the way doxygen processes comments. Let me know if I can help.

Regards,
  Dimitri