[PATCH] D41102: Setup clang-doc frontend framework

Mon Feb 19 08:52:51 PST 2018

sammccall added a comment.

Naming conventions tend to stick around for a while - `clang-doc/ClangDocXYZ.h` seems a bit unwieldy compared to `clang-doc/XYZ.h` - might be worth considering.

================
Comment at: clang-doc/ClangDoc.cpp:37
+  Context = Result.Context;
+  if (const auto *M = Result.Nodes.getNodeAs<NamespaceDecl>(BoundName))
+    processMatchedDecl(M);
----------------
if you're just going to call processMatchedDecl anyway, why pass in `BoundName` and allow it to vary rather than using a fixed string?

================
Comment at: clang-doc/ClangDoc.cpp:69
+
+std::string ClangDocCallback::getName(const NamedDecl *D) const {
+  if (const auto *F = dyn_cast<FunctionDecl>(D)) {
----------------
this needs a comment describing what/why it's doing.

In particular, you're usually using qnames, but sometimes mangled names?
Downstream consumers are going to have a very hard time doing something meaningful with that.
Where you need a stable, machine-readable identifier that distinguishes between overloads, I'd suggest USR (see USRGeneration).
Where you need something human readable, you should think carefully about the representation you want, and try to avoid depending on it being unique.

================
Comment at: clang-doc/ClangDoc.h:36
+/// Parses each match and sends it along to the reporter for serialization.
+class ClangDocCallback : public MatchFinder::MatchCallback {
+ public:
----------------
Having `ClangDocMain` responsible for building the MatchFinder, but `ClangDoc` responsible for implementing the callbacks seems like an odd choice for layering:
 - there's a deep implicit contract between the matchers and the callbacks, they are going to end up being tightly coupled so the split doesn't gain much
 - using MatchCallback as the interface exposes a detail you're quite likely to want to change. Some heavy users of ASTMatchers end up moving to explicit AST traversal for efficiency reasons.

It would seem cleaner to have the MatchFinder and collection of callbacks all owned by one class in `ClangDoc.cpp`, and just have `ClangDoc.h` expose a function that creates the `FrontendActionFactory` from it. This gives you a narrower interface with less implicit contracts, where ASTMatchers is an implementation detail of this TU.

================
Comment at: clang-doc/ClangDoc.h:38
+ public:
+  ClangDocCallback(StringRef BoundName, ExecutionContext &ECtx,
+                   ClangDocBinaryWriter &Writer)
----------------
Something seems slightly off here: we register a separate ClangDocCallback for each type of decl, but then each one detects what node it actually got...

There are a few ways to reduce this duplication:
 - (most reduction) use RecursiveASTVisitor, which naturally couples type and handling code (the matchers seem trivial, which makes this feasible)
 - use separate callbacks for each type (a ClangDocCallback<T>?)
 - (least reduction) create one callback and add it a bunch of times, or once with an anyof() matcher

================
Comment at: clang-doc/ClangDocBinary.h:1
+//===--  ClangDocBinary.h - ClangDoc Binary ---------------------*- C++ -*-===//
+//
----------------
(As well as a file comment, this two-word description is pretty confusing - binary used as a noun seems like it would refer to the compiled clang-doc tool itself)

================
Comment at: clang-doc/ClangDocBinary.h:26
+enum BlockId {
+  NAMESPACE_BLOCK_ID = bitc::FIRST_APPLICATION_BLOCKID,
+  NONDEF_BLOCK_ID,
----------------
nit: llvm style is e.g. `BI_NamespaceBlockID` with prefix or `BlockID::NamespaceBlockID` (using enum class)

================
Comment at: clang-doc/ClangDocBinary.h:158
+
+  template <typename T>
+  void writeBitstream(const T &I, BitstreamWriter &Stream,
----------------
again, this template really seems like it's a set of overloads.

================
Comment at: clang-doc/ClangDocMapper.cpp:148
+  }
+  return serialize(I);
+}
----------------
If I'm reading correctly, serialize() returns a SmallString by value, and now you're returning a (dangling) stringref to that temporary.

================
Comment at: clang-doc/ClangDocMapper.h:36
+
+class ClangDocCommentVisitor
+    : public ConstCommentVisitor<ClangDocCommentVisitor> {
----------------
why is this exposed?
(and what does it do?)

================
Comment at: clang-doc/ClangDocMapper.h:61
+
+class ClangDocMapper {
+ public:
----------------
Naming: as things stand, `ClangDoc` looks like the mapper, and this is some sort of serializer helper: ClangDoc consumes the input, decides what to do with it, and writes the output.

================
Comment at: clang-doc/ClangDocMapper.h:65
+
+  template <class C>
+  StringRef emitInfo(const C *D, const FullComment *FC, StringRef Key,
----------------
why is this a template method rather than a set of overloads?
I think if you pass in the wrong type, you'll get (at best) a linker error instead of a useful compile error.

================
Comment at: clang-doc/ClangDocMapper.h:66
+  template <class C>
+  StringRef emitInfo(const C *D, const FullComment *FC, StringRef Key,
+                     int LineNumber, StringRef File);
----------------
when returning a stringref, it might pay to be explicit about who owns the data, so the caller knows the safe lifetime.
(This isn't always spelled out in llvm, but should probably be done more often!)

================
Comment at: tools/clang-doc/ClangDoc.h:1
+//===-- ClangDoc.cpp - ClangDoc ---------------------------------*- C++ -*-===//
+//
----------------
sammccall wrote:
> This needs some high-level documentation: what does the clang-doc library do, what's the main user (clang-doc command-line tool), what are the major moving parts.
> 
> I don't personally have a strong opinion on how this is split between this header / the implementation / a documentation page for the tool itself, but we'll probably need *something* for each of those.
> 
> (I think it's OK to defer the user-facing documentation to another patch, but we should do it before the tool becomes widely publicized or included in an llvm release)
This comment is still relevant.

- `ClangDoc.h` in particular sounds like the API entrypoint, but the only thing that's documented here is an implementation detail. The file comment here should describe at a high level how documentation is extracted, combined, and output.
- most of the files have no file comment describing what the file is responsible for, what it interacts with etc. If I was contributing a patch here, how would I know whether a given header was the right layer for a new function?

Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D41102