[PATCH] D43341: [clang-doc] Implement reducer portion of the frontend framework

Roman Lebedev via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat Mar 24 11:43:16 PDT 2018


lebedev.ri added a comment.

It's good to finally have the initial block firmly landed, congratulations.

Trying to review this...
Some initial thoughts.



================
Comment at: clang-doc/BitcodeReader.cpp:27
+  assert(Record[0] == 20);
+  for (int I = 0, E = Record[0]; I < E; ++I)
+    Field[I] = Record[I + 1];
----------------
Ok, i don't understand what is going on here.
Where is this `E` defined?
This looks like `[0]` is the number of elements to read (always 20, sha1 blob?),
and it copies Record[1]..Record[20] to Field[0]..Field[19] ?
I think this can/should be rewritten clearer..


================
Comment at: clang-doc/BitcodeReader.cpp:19
+
+void ClangDocBitcodeReader::storeData(llvm::SmallString<4> &Field,
+                                      llvm::StringRef Blob) {
----------------
juliehockett wrote:
> lebedev.ri wrote:
> > I think all these `SmallString` can be one `llvm::SmallVectorImpl<char>`?
> No, since there's not an implicit converter from `llvm::SmallVectorImpl<char>` to `StringRef`. I templatized it on the size though, so it's only one function now.
Are you sure?
https://godbolt.org/g/wD1FKD

That isn't pretty, but i think it beats templating in this case..


================
Comment at: clang-doc/BitcodeReader.h:33
+
+  bool readBitstream(std::unique_ptr<InfoSet> &IS);
+
----------------
Similarly, i think this should be
```
bool readBitstream(InfoSet &IS);
```


================
Comment at: clang-doc/BitcodeReader.h:44
+  template <typename TInfo>
+  bool readBlockToInfo(unsigned ID, std::unique_ptr<InfoSet> &IS);
+
----------------
As far as i can see this should be
```
template <typename TInfo>
bool readBlockToInfo(unsigned ID, InfoSet &IS);
```



================
Comment at: clang-doc/BitcodeWriter.h:142
   void emitBlock(const CommentInfo &B);
+  void emitInfoSet(std::unique_ptr<InfoSet> &ISet);
 
----------------
And here too, it does not make much sense to call it with empty `std::unique_ptr<>`, so maybe
```
void emitInfoSet(InfoSet &ISet);
```
?


================
Comment at: clang-doc/Reducer.cpp:18
+std::unique_ptr<InfoSet> mergeInfos(tooling::ToolResults *Results) {
+  std::unique_ptr<InfoSet> UniqueInfos = llvm::make_unique<InfoSet>();
+  bool Err = false;
----------------
I can see that you may `return nullptr;` in case of error here, thus it's `std::unique_ptr<>`
`InfoSet` internally is just a few `std::vector<>`s, so it should `std::move()` efficiently.
I'm wondering if `llvm::Optional<InfoSet>` would work too?


================
Comment at: clang-doc/Representation.cpp:19
+  SymbolID USR;
+  std::string HexString = fromHex(StringUSR);
+  std::copy(HexString.begin(), HexString.end(), USR.begin());
----------------
I though this was changed, and the non-stringified, original binary version of sha1 was emitted into bitcode?


================
Comment at: clang-doc/Representation.cpp:28
+  if (L.Namespace.empty())
+    std::move(R.Namespace);
+  std::move(R.Description.begin(), R.Description.end(),
----------------
???
Where *to* is it moved?


================
Comment at: clang-doc/Representation.cpp:40
+
+static void mergeInfo(NamespaceInfo &L, NamespaceInfo &R) {
+  mergeInfoBase(L, R);
----------------
All these `mergeInfo()`: the second param, `Info &R` should probably be `Info &&R`.
(but not `mergeInfoBase()`/`mergeSymbolInfoBase()`)


================
Comment at: clang-doc/Representation.cpp:98
+    else                                                                       \
+      mergeInfo(X##s[R.first->second], I);                                     \
+  }
----------------
Seeing how many `std::move()`'ing is happening in those `mergeInfo()`, i think you want to move `I`, not pass as reference.
Especially since it is already `&&I` in this `InfoSet::insert()` function.


================
Comment at: clang-doc/Representation.h:30
 
 using SymbolID = std::array<uint8_t, 20>;
+SymbolID StringToSymbol(llvm::StringRef StringUSR);
----------------
Please add a comment explaining that `SymbolID` is sha1, and not hex-string of sha1.


================
Comment at: clang-doc/Representation.h:135
   Info() = default;
-  Info(Info &&Other) : Description(std::move(Other.Description)) {}
-  virtual ~Info() = default;
+  Info(Info &&Other)
+      : USR(Other.USR), Name(Other.Name), Namespace(std::move(Other.Namespace)),
----------------
Why is the move constructor explicitly defined?
Unless i'm missing something, the default one would do exactly the same.
https://godbolt.org/g/XqsRuX


================
Comment at: clang-doc/tool/ClangDocMain.cpp:130
+    llvm::outs() << "Writing intermediate results...\n";
+    SmallString<2048> Buffer;
+    llvm::BitstreamWriter Stream(Buffer);
----------------
That is the same small-size used in `static std::string serialize(T &I)`.
Some statistic is needed, bu i think this one can be bumped somewhat right away.



================
Comment at: clang-doc/tool/ClangDocMain.cpp:141
+      llvm::errs() << "Unable to create documentation directories.\n";
+      return 1;
+    }
----------------
This shares some code with `if(DumpMapperResult)`.
Perhaps you could refactor it slightly, to reduce code duplication?


https://reviews.llvm.org/D43341





More information about the cfe-commits mailing list