[PATCH] D17028: [ThinLTO] Use MD5 hash in function index.

Wed Feb 10 09:38:55 PST 2016

tejohnson added inline comments.

================
Comment at: lib/IR/Module.cpp:50
@@ -49,3 +49,3 @@
 Module::Module(StringRef MID, LLVMContext &C)
-    : Context(C), Materializer(), ModuleID(MID), DL("") {
+    : Context(C), Materializer(), ModuleID(MID), SourceFileName(MID), DL("") {
   ValSymTab = new ValueSymbolTable();
----------------
joker.eph wrote:
> Not if I was clear because I think what I suggested matches your behavior (unless I missed something).
> 
> I'd have left the `SourceFileName` member empty and change the getter to:
> 
> ```
> const std::string &getSourceFileName() const { 
>   if (SourceFileName.empty()) return getModuleIdentifier();
>   return SourceFileName;
> }
> ```
> 
> But it is not important so feel free to keep as you did.
Oh got it, I misunderstood. It will be more efficient to initialize SourceFileName once and not have to do the empty check every time we call this, so I think I will leave it as-is.

================
Comment at: lib/Transforms/IPO/FunctionImport.cpp:221
@@ -219,1 +220,3 @@
+      // source file name prepended for functions that were originally local
+      // in the source module. Strip any prepended name to recover the original
       // name in the source module.
----------------
joker.eph wrote:
> We can't rename before materializing?
I was worried about doing additional parsing after we have done some renaming. But thinking through this some more it should be ok as all GV references should be through the centralized ValueList which would contain value handles to the renamed GVs.

In any case, there isn't a big need to do the renaming in the SrcModule before the actual importing.

================
Comment at: test/tools/gold/X86/thinlto.ll:27-28
@@ -26,4 +26,4 @@
 ; COMBINED-NEXT: <VALUE_SYMTAB
-; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} record string = '{{f|g}}'
-; COMBINED-NEXT: <COMBINED_FNENTRY {{.*}} record string = '{{f|g}}'
+; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}
+; COMBINED-NEXT: <COMBINED_FNENTRY abbrevid={{[0-9]+}} op0={{[0-9]+}} op1={{-3706093650706652785|-5300342847281564238}}
 ; COMBINED-NEXT: </VALUE_SYMTAB
----------------
joker.eph wrote:
> What is the op1 here? Is it the 64 low-bits of the MD5 as an int?
> Can we have a better pretty-print from llvm-bcanalyzer? (don't hold this patch for this though)
Yes, op0 is the bitcode offset of the summary, and op0 is the MD5 as an int. I will add a comment to these tests to enumerate the format before committing.

Would you rather see the values printed in hex? The bcanalyzer just prints all record values as ints, it doesn't look at the record type to determine the best format. That would require more plumbing in the bcanalyzer than I'm thinking this is worth. Probably the easiest thing would be to add an option to llvm-bcanalyzer to dump all record values in hex.


http://reviews.llvm.org/D17028