[PATCH] D21915: ThinLTO: Add -thinlto-action=promote-internalize.

Mehdi AMINI via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 6 13:48:29 PDT 2016


mehdi_amini added inline comments.

================
Comment at: tools/llvm-lto/llvm-lto.cpp:100
@@ +99,3 @@
+        clEnumValN(
+            THINPROMOTEINTERNALIZE, "promote-internalize",
+            "Perform promotion and internalization (requires -thinlto-index)."),
----------------
pcc wrote:
> tejohnson wrote:
> > pcc wrote:
> > > tejohnson wrote:
> > > > Rather than add a new type here, can the test just do two invocations of llvm-lto - one invoking promotion and then feeding the resulting bitcode into another invoking internalization?
> > > That's what I tried first of all, but I got assertion failures like this:
> > > 
> > > ```llvm-lto: ../lib/Transforms/IPO/FunctionImport.cpp:536: auto llvm::thinLTOInternalizeModule(llvm::Module &, const GVSummaryMapTy &)::(anonymous class)::operator()(const llvm::GlobalValue &) const: Assertion `GS != DefinedGlobals.end()' failed.```
> > > 
> > > I think this is caused by the internalization code looking up symbol names by GUID, which will be different as a result of the module identifier being different if the input bitcode is piped in. (I thought the whole point of the source_filename thing was to try to protect against that sort of problem, but evidently it doesn't help here -- even if I apply a diff like this:
> > > ```
> > > diff --git a/test/ThinLTO/X86/Inputs/alias_import.ll b/test/ThinLTO/X86/Inputs/alias_import.ll
> > > index 36e5ad1..5f1324b 100644
> > > --- a/test/ThinLTO/X86/Inputs/alias_import.ll
> > > +++ b/test/ThinLTO/X86/Inputs/alias_import.ll
> > > @@ -1,6 +1,4 @@
> > > -
> > > -
> > > -
> > > +source_filename = "2.c"
> > >  
> > >  @globalfuncAlias = alias void (...), bitcast (void ()* @globalfunc to void (...)*)
> > >  @globalfuncWeakAlias = weak alias void (...), bitcast (void ()* @globalfunc to void (...)*)
> > > diff --git a/test/ThinLTO/X86/alias_import.ll b/test/ThinLTO/X86/alias_import.ll
> > > index d4cc996..c7ffc12 100644
> > > --- a/test/ThinLTO/X86/alias_import.ll
> > > +++ b/test/ThinLTO/X86/alias_import.ll
> > > @@ -2,7 +2,7 @@
> > >  ; RUN: opt -module-summary %p/Inputs/alias_import.ll -o %t2.bc
> > >  ; RUN: llvm-lto -thinlto-action=thinlink -o %t.index.bc %t1.bc %t2.bc
> > >  ; RUN: llvm-lto -thinlto-action=promote -thinlto-index %t.index.bc %t2.bc -o - | llvm-dis -o - | FileCheck %s --check-prefix=PROMOTE
> > > -; RUN: llvm-lto -thinlto-action=promote-internalize -thinlto-index %t.index.bc %t2.bc -o - | llvm-dis -o - | FileCheck %s --check-prefix=PROMOTE-INTERNALIZE
> > > +; RUN: llvm-lto -thinlto-action=promote -thinlto-index %t.index.bc %t2.bc -o - | llvm-lto -thinlto-action=internalize -thinlto-index %t.index.bc -exported-symbol=dummy - -o - | llvm-dis -o - | FileCheck %s --check-prefix=PROMOTE-INTERNALIZE
> > >  ; RUN: llvm-lto -thinlto-action=import -thinlto-index %t.index.bc %t1.bc -o - | llvm-dis -o - | FileCheck %s --check-prefix=IMPORT
> > >  
> > >  ;
> > > @@ -126,6 +126,8 @@
> > >  ; PROMOTE-INTERNALIZE-DAG: define internal void @linkoncefunc()
> > >  ; PROMOTE-INTERNALIZE-DAG: define internal void @weakfunc()
> > >  
> > > +source_filename = "1.c"
> > > +
> > >  define i32 @main() #0 {
> > >  entry:
> > >    call void @globalfuncAlias()
> > > ```
> > > 
> > >  the combined summary still looks like this here:
> > > 
> > > ```
> > >   <ENTRY [...] /> record string = '[...]/Output/alias_import.ll.tmp1.bc'
> > >   <ENTRY [...] /> record string = '[...]/Output/alias_import.ll.tmp2.bc
> > > ```
> > > )
> > The name in the combined summary is the path to the bitcode (used to locate a function during importing). 
> > 
> > However, setting the source_filename in the LLVM assembly should cause the right GUID naming to occur, I'm not sure offhand why it isn't enabling consistent GUID computation. The code in thinLTOInternalizeModule does attempt to get the original name for something that has already been promoted. Can you look at the intermediate bitcode files and see if they contain a SOURCE_FILENAME record?
> > The name in the combined summary is the path to the bitcode (used to locate a function during importing).
> 
> Ah, yes, of course.
> 
> > However, setting the source_filename in the LLVM assembly should cause the right GUID naming to occur, I'm not sure offhand why it isn't enabling consistent GUID computation.
> 
> Okay, here's what's going on here.
> 
> - The client is using `ModuleSummaryIndex::collectDefinedGVSummariesPerModule` to group global value summaries by summary module path [1]. 
> - We then use the module identifier as a key to look up the global value summary map entry [2].
> - That lookup yields an empty map because the module identifier does not match any summary module path. As a result, none of the lookups in `thinLTOInternalizeModule` succeed.
> 
> I see two possible solutions here:
> - do what I'm doing in this patch
> - provide some way to override the module identifier in the loaded module in order to allow the lookup to succeed
> I'm partial to the former because it's more in line with what I see as the end goal here (combining the internalize and promote phases). I don't think we should extend the module summary format (for example by allowing a source filename to be used as a key) for the same reason. Let me know what you think.
> 
> [1] http://llvm-cs.pcc.me.uk/lib/IR/ModuleSummaryIndex.cpp#88
> [2] http://llvm-cs.pcc.me.uk/lib/LTO/ThinLTOCodeGenerator.cpp#663
Could the option "-thinlto-module-id" to llvm-lto help here?


http://reviews.llvm.org/D21915





More information about the llvm-commits mailing list