[PATCH] D110665: [clang] Don't use the AST to display backend diagnostics

Reid Kleckner via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 29 11:58:18 PDT 2021


rnk added inline comments.


================
Comment at: clang/lib/CodeGen/CodeGenAction.cpp:129
     SmallVector<LinkModule, 4> LinkModules;
+    std::vector<std::pair<llvm::hash_code, FullSourceLoc>>
+        ManglingFullSourceLocs;
----------------
As David says, this could be a plain hash map, maybe hash_code isn't hashable, but it must have some way to convert to a plain integer.

This also deserves a comment about why the key here is a hash code and not the full string. The purpose of this is to avoid storing another copy of all the mangled names, which are large. In the case of a hash collision, it is possible that the wrong location will be returned. The probability is roughly `#functions / size_t_max`, so pretty negligible, especially for a 64-bit platform. In any case, the consequences of a collision are that we use the wrong source location, which is a diagnostic quality issue, which isn't severe.

This would be a good use case for a fast content hash, such as blake3. I believe it would be really useful for LLD to have such a fast content hash for parellelizing symbol resolution and other deduplication actions, so I think it is worth a FIXME or TODO to use a fast content hash when one is available.


================
Comment at: clang/lib/CodeGen/CodeGenAction.cpp:641
+BackendConsumer::getFunctionSourceLocation(const Function &F) const {
+  auto Hash = llvm::hash_value(F.getName());
+  for (auto &Pair : ManglingFullSourceLocs) {
----------------
There are LLVM passes that will rename functions, so this lookup will fail for such functions, but I guess that was already the case in the old code. Nothing to do here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110665/new/

https://reviews.llvm.org/D110665



More information about the cfe-commits mailing list