[PATCH] D23488: ThinLTO: add early "dead-stripping" on the Index
    Teresa Johnson via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Mon Dec 19 16:40:07 PST 2016
    
    
  
tejohnson added a comment.
In https://reviews.llvm.org/D23488#627205, @pcc wrote:
> In https://reviews.llvm.org/D23488#627164, @tejohnson wrote:
>
> > In https://reviews.llvm.org/D23488#627154, @pcc wrote:
> >
> > > This information will probably need to be encoded in the summary somehow once we're at the point where we no longer need to load the module during symbol resolution.
> >
> >
> > At that point we will have a separately loadable symbol table in the bitcode file, but there again I guess we will not know anything about these llvm.* symbols in the table, right?
>
>
> I was envisioning that the live bits in the global summary wouldn't be applied to the llvm.* symbols themselves but to the globals referenced by them.
Either one works for the index-based dead stripping. But my point was we need something to flag this in the summary because these symbols won't be in the bitcode symbol table (essentially the situation we're running into here).
> 
> 
>>> I think at this point we just need to collect a set of GC roots in add{Regular,Thin}LTO in the same way that we collect the set of used variables with collectUsedGlobalVariables.
>> 
>> Presumably using a new flag added to the summary, right? Or did you have a different approach in mind for now?
> 
> Yes, it could be done in the summary. I was also thinking you could have a function called something like collectGCRoots that does something like this:
> 
>   void collectGCRoots(Module *M, SmallSet<GlobalValue *> &GCRoots) {
>     collectUsedGlobalVariables(M, GCRoots, true);
>     collectUsedGlobalVariables(M, GCRoots, false);
>     // similarly for llvm.global_ctors and llvm.global_dtors
>   }
> 
> 
> then use GCRoots to decide whether to set the bit in GlobalResolution. But now that I think about it, I think I prefer the approach of setting live bits in the summary, for example it would seem to make it much simpler to pull the live lists of type identifiers out of the combined summary for CFI/devirtualization.
We don't want to parse the module and invoke collectUsedGlobalVariables when we do the thin link, which is where we need this info. Oh - I see that we are currently doing this. ISTR that this is only temporary (I hope - we don't want to have to parse the IR in the ThinLink!). So it would have to be done during the compile step (where we could do what you suggest above). And then we put it in the summary. Although like I mentioned above, we can just flag the llvm.* variables as the live roots, that's simpler than walking all of their references and flagging them.
> Also think about how this interacts with regular LTO. You probably also want anything referenced from the regular LTO module to be treated as a GC root.
Good point - I think we would miss those right now with this patch.
> 
> 
>> Right now the patch uses the set of preserved GUIDs computed from the GlobalResolution info as the roots in runThinLTO (see invocation of computeDeadSymbols). So probably we would just use the summary flag in computeDeadSymbols to add to the set of preserved GUIDs passed in. I.e. not in addThinLTO (this is for index-based internalization/dead stripping, so I don't think regular LTO comes into play).
> 
> Yes, or change the computeDeadSymbols function to be implemented only in terms of the live bits. Then adding the GC roots from the linker (i.e. symbols referenced from regular objects or a regular LTO module) would be implemented by setting live bits on the roots.
https://reviews.llvm.org/D23488
    
    
More information about the llvm-commits
mailing list