[PATCH] D13107: Support for function summary index bitcode sections and files. Includes bitcode reader/writer support for these sections, and support for creation of combined index files.

Mehdi AMINI via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 30 22:33:35 PDT 2015


joker.eph added a comment.

Hi Teresa,

Please find some inline comments below. I had some minor naming suggestion, but more important is the ownership of some object instantiation.


================
Comment at: include/llvm/Bitcode/BitcodeWriterPass.h:54
@@ -51,1 +53,3 @@
+      : OS(OS), ShouldPreserveUseListOrder(ShouldPreserveUseListOrder),
+        EmitThinLTOIndex(EmitThinLTOIndex) {}
 
----------------
Rename //ThinLTO// with //FunctionSummary// to be coherent with the rest like //hasFunctionSummary()// for instance?

I think there are multiple places (not to say everywhere) where //ThinLTO// could be replaced with something more generic like //FunctionSummary//.

================
Comment at: include/llvm/Bitcode/BitstreamWriter.h:24
@@ -23,2 +23,3 @@
 #include "llvm/Support/Endian.h"
+#include "llvm/Support/raw_ostream.h"
 #include <vector>
----------------
Why?

================
Comment at: include/llvm/Bitcode/BitstreamWriter.h:528
@@ -523,1 +527,3 @@
+public:
+
   BlockInfo &getOrCreateBlockInfo(unsigned BlockID) {
----------------
Shouldn't be needed?

================
Comment at: include/llvm/Bitcode/ReaderWriter.h:74
@@ +73,3 @@
+      DiagnosticHandlerFunction DiagnosticHandler,
+      bool ParseFuncSummary = true);
+
----------------
Might be more explicit to rename //ParseFuncSummary// into //IsLazy//. 
(just a suggestion, do whatever seems better to you)

================
Comment at: include/llvm/IR/FunctionInfo.h:45
@@ +44,3 @@
+  /// have module identifier appended before placing into the combined
+  /// index, to disambiguiate from other functions with the same name.
+  ///
----------------
s/disambiguiate/disambiguate/ ?


================
Comment at: include/llvm/IR/FunctionInfo.h:99
@@ +98,3 @@
+/// of the corresponding function block. For the combined index,
+/// it, after parsing of the \a ValueSymbolTable, this initially
+/// holds the offset of the corresponding function summary bitcode
----------------
The "it" at the beginning of the line seems strange to me?

================
Comment at: include/llvm/IR/FunctionInfo.h:107
@@ +106,3 @@
+  /// decisions.
+  FunctionSummary *Summary;
+  /// \brief The bitcode offset corresponding to either the associated
----------------
Who owns it? 
(I may found out later)

================
Comment at: include/llvm/IR/FunctionInfo.h:108
@@ +107,3 @@
+  FunctionSummary *Summary;
+  /// \brief The bitcode offset corresponding to either the associated
+  /// function's function body record, or its function summary record,
----------------
(Missing empty line)

================
Comment at: include/llvm/IR/FunctionInfo.h:130
@@ +129,3 @@
+  FunctionInfo(uint64_t FuncOffset,
+               FunctionSummary *FuncSummary) :
+    Summary(FuncSummary),
----------------
If this takes ownership, use a `unique_ptr` for the argument.

================
Comment at: include/llvm/IR/FunctionInfo.h:136
@@ +135,3 @@
+    if (Summary)
+      delete Summary;
+  }
----------------
Any reason not to use `unique_ptr` for the Summary member?

================
Comment at: include/llvm/IR/FunctionInfo.h:143
@@ +142,3 @@
+    // destruction doesn't perform a double-delete.
+    Summary = nullptr;
+    BitcodeIndex = Other.BitcodeIndex;
----------------
This sounds suspicious: a copy ctor that does not really copy... 
I have to see the use case first, I'm skeptical right now :)

================
Comment at: include/llvm/IR/FunctionInfo.h:166
@@ +165,3 @@
+  /// the index type.
+  void bitcodeIndex(uint64_t FuncOffset) {
+    BitcodeIndex = FuncOffset;
----------------
The other setters name are beginning with //set//, why not this one?

================
Comment at: include/llvm/IR/FunctionInfo.h:190
@@ +189,3 @@
+class FunctionInfoIndex {
+ public:
+
----------------
remove

================
Comment at: include/llvm/IR/FunctionInfo.h:213
@@ +212,3 @@
+           FMI->getValue().begin(); FLI != FMI->getValue().end(); ++FLI) {
+        delete *FLI;
+      }
----------------
Why not using `unique_ptr` and remote the destructor? 

================
Comment at: include/llvm/IR/FunctionInfo.h:223
@@ +222,3 @@
+
+  const_funcinfo_iterator funcinfo_begin() const {
+    return FunctionMap.begin();
----------------
If it was called "begin()" you could use for-range loop. An alternative would be to add another method `functions()` that would return an `llvm::iterator_range`.

================
Comment at: include/llvm/IR/FunctionInfo.h:232
@@ +231,3 @@
+  FunctionInfoList getFunctionInfoList(StringRef FuncName) const {
+    return FunctionMap.lookup(FuncName);
+  }
----------------
Do you really intend to return a copy of the vector all the time?

================
Comment at: include/llvm/Object/FunctionIndexObjectFile.h:84
@@ +83,3 @@
+  static bool hasFunctionSummaryInMemBuffer(MemoryBufferRef Object,
+                                          LLVMContext &Context);
+
----------------
indentation

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:410
@@ +409,3 @@
+
+  /// \brief Used to indicate whether we are doing lazy parsing of summary data.
+  ///
----------------
What about a variable name that contains `Lazy`?

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:437
@@ +436,3 @@
+  /// summary record offset).
+  DenseMap<uint64_t, FunctionSummary*> SummaryMap;
+
----------------
The ownership/lifetime of FunctionSummary is not clear here?

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:3110
@@ -3016,2 +3109,3 @@
         }
+        //assert(false); // Temp assert to make sure we don't fall back to old code.
         break;
----------------
remove

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:5215
@@ +5214,3 @@
+        if (Entry.ID == bitc::FUNCTION_SUMMARY_BLOCK_ID)
+          SeenFuncSummary = true;
+        if (Stream.SkipBlock())
----------------
Should you just return now?

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:5241
@@ +5240,3 @@
+        }
+        else if (std::error_code EC = parseEntireSummary())
+          return EC;
----------------
I would expect clang-format to put the else on the same line as the `{`?

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:5313
@@ +5312,3 @@
+  }
+  llvm_unreachable("Exit infinite loop");
+}
----------------
I don't see the expected return for this function?

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:5343
@@ +5342,3 @@
+      break;
+      case bitc::MST_CODE_ENTRY: {
+        // MST_ENTRY: [modid, namechar x N]
----------------
The indentation seems incorrect, is it clang-formatted?

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:5382
@@ +5381,3 @@
+      // We didn't really read a proper Module block.
+      return error("Malformed ThihLTO IR file");
+    }
----------------
Typo with ThihLTO. 
(also I'm still not sure at which level where the "ThinLTO" concept should "leak").

================
Comment at: lib/Bitcode/Reader/BitcodeReader.cpp:5684
@@ +5683,3 @@
+  std::unique_ptr<MemoryBuffer> Buf = MemoryBuffer::getMemBuffer(Buffer, false);
+  FunctionIndexBitcodeReader *R =
+      new FunctionIndexBitcodeReader(Buf.get(), Context, DiagnosticHandler);
----------------
When is this memory released?

================
Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:2304
@@ +2303,3 @@
+         FII->getValue().begin(); FLI != FII->getValue().end(); ++FLI) {
+      FunctionInfo *FI = *FLI;
+
----------------
For-range loops?

================
Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:2323
@@ +2322,3 @@
+           *E = FuncName.data()+FuncName.size(); P != E; ++P)
+        NameVals.push_back((unsigned char)*P);
+
----------------
`for(auto P : FuncName)` ?

================
Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:2384
@@ +2383,3 @@
+
+  FunctionIndex[&F] = FuncInfo;
+}
----------------
Ownership is not clear, what about having a `DenseMap<const Function *, std::unique_ptr<FunctionInfo>>` instead?

================
Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:2693
@@ +2692,3 @@
+    for (const char *P = MPSE.getKey().data(),
+         *E = MPSE.getKey().data()+MPSE.getKey().size(); P != E; ++P)
+      NameVals.push_back((unsigned char)*P);
----------------
`for (auto P : MPSE.getKey())` ?

================
Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:2720
@@ +2719,3 @@
+  for (DenseMap<const Function *, FunctionInfo *>::iterator I =
+       FunctionIndex.begin(); I != FunctionIndex.end(); I++) {
+    // Skip anonymous functions. We will emit a function summary for
----------------
For-range loop?

================
Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:2757
@@ +2756,3 @@
+    NameVals.clear();
+  }
+
----------------
Some refactoring seems possible between these two loops.

================
Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:2780
@@ +2779,3 @@
+    for (std::vector<FunctionInfo*>::const_iterator FLI =
+         FII->getValue().begin(); FLI != FII->getValue().end(); ++FLI) {
+      FunctionInfo *FI = *FLI;
----------------
for-range loop?

================
Comment at: lib/Bitcode/Writer/BitcodeWriter.cpp:2994
@@ +2993,3 @@
+  Stream.Emit(0xE, 4);
+  Stream.Emit(0xD, 4);
+
----------------
(Side note) I read this so many times, I'm surprised there is not a better refactoring for reading/writing bitcode.

================
Comment at: lib/IR/FunctionInfo.cpp:19
@@ +18,3 @@
+
+uint64_t FunctionInfoIndex::NextModuleId = 0;
+
----------------
A mutable global is really to be avoided. I am really not convinced that it is desirable here.
Should it be passed by argument to the `mergeFrom` method? Or a member variable? 

Maybe even `mergeFrom()` should not be a member function, but a separate object that would have a `NextModuleId` and you could add `FunctionInfoIndex` and get the merged result.

================
Comment at: lib/IR/FunctionInfo.cpp:23
@@ +22,3 @@
+// per-module instances.
+void FunctionInfoIndex::mergeFrom(FunctionInfoIndex *Other) {
+
----------------
What about taking a `unique_ptr` here and "stealing" its content instead of doing a bunch of redundant memory allocation and copy?

================
Comment at: lib/IR/FunctionInfo.cpp:29
@@ +28,3 @@
+  StringRef ModPath;
+  for (; OtherFuncInfoLists != Other->funcinfo_end(); OtherFuncInfoLists++) {
+    StringRef FuncName = OtherFuncInfoLists->getKey();
----------------
This loop header is strange with the iterator declared outside. Turn into a for-range loop?

================
Comment at: lib/Object/FunctionIndexObjectFile.cpp:89
@@ +88,3 @@
+ErrorOr<std::unique_ptr<FunctionIndexObjectFile>>
+llvm::object::FunctionIndexObjectFile::create(MemoryBufferRef Object,
+                                              LLVMContext &Context,
----------------
I'm not sure why you would need to provide the fully qualified name for this function?

================
Comment at: test/Bitcode/thinlto-function-summary.ll:17
@@ +16,3 @@
+
+; CHECK: define i32 @foo()
+
----------------
This line is ignored

================
Comment at: test/Bitcode/thinlto-function-summary.ll:25
@@ +24,3 @@
+
+; CHECK: define i32 @bar(i32 %x)
+
----------------
This line is ignored as well

================
Comment at: test/tools/gold/X86/thinlto.ll:8
@@ +7,3 @@
+; RUN: llvm-bcanalyzer -dump %t3.thinlto.bc | FileCheck %s --check-prefix=COMBINED
+; RUN: not test -a %t3
+
----------------
What is this line doing? Isn't the `-a` for the `&&` operator?
Do you check that the file %t3 is not generated by gold and only the combine index is?


http://reviews.llvm.org/D13107





More information about the llvm-commits mailing list