[LLVMdev] Removing AvailableExternal values in GlobalDCE (was Re: RFC: ThinLTO Impementation Plan)

Teresa Johnson tejohnson at google.com
Tue May 19 13:05:36 PDT 2015


On Tue, May 19, 2015 at 12:41 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
> On Mon, May 18, 2015 at 9:09 PM, Teresa Johnson <tejohnson at google.com>
> wrote:
>>
>> On Fri, May 15, 2015 at 11:20 PM, Teresa Johnson <tejohnson at google.com>
>> wrote:
>> > On Fri, May 15, 2015 at 2:52 PM, David Blaikie <dblaikie at gmail.com>
>> > wrote:
>> >>
>> >>
>> >> On Fri, May 15, 2015 at 1:15 PM, Teresa Johnson <tejohnson at google.com>
>> >> wrote:
>> >>>
>> >>> On Fri, May 15, 2015 at 10:04 AM, David Blaikie <dblaikie at gmail.com>
>> >>> wrote:
>> >>> >
>> >>> >
>> >>> > On Fri, May 15, 2015 at 9:53 AM, Teresa Johnson
>> >>> > <tejohnson at google.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> Looking back through my GlobalDCE changes, it looks like one of the
>> >>> >> places I had changed (where we mark defined globals in runOnModule)
>> >>> >> already has a guard for !hasAvailableExternallyLinkage and
>> >>> >> !isDiscardableIfUnused, so my additional guard against marking
>> >>> >> imported functions is unnecessary. But the other place I had to
>> >>> >> change
>> >>> >> was in GlobalIsNeeded where it walks through the function and
>> >>> >> recursively marks any referenced global as needed. Here there was
>> >>> >> no
>> >>> >> guard against marking a global that is available externally as
>> >>> >> needed
>> >>> >> if it is referenced. I had added a check here to not mark imported
>> >>> >> functions as needed on reference unless they were discardable (i.e.
>> >>> >> link once). Is this a bug - should this have a guard against
>> >>> >> marking
>> >>> >> available externally function refs as needed?
>> >>> >
>> >>> >
>> >>> > Duncan's probably got a better idea of what the right answer is
>> >>> > here. I
>> >>> > suspect "yes".
>> >>> >
>> >>> > The trick with available_externally is to ensure we keep these
>> >>> > around
>> >>> > for
>> >>> > long enough that their definitions are useful (for inlining,
>> >>> > constant
>> >>> > prop,
>> >>> > all that good stuff) but remove them before we actually do too much
>> >>> > work
>> >>> > on
>> >>> > them, or at least before we emit them into the final object file.
>> >>>
>> >>> Yep, and that is exactly how long I want my imported functions to
>> >>> stick around. Currently I have the ThinLTO import pass added at the
>> >>> start of addLTOOptimizationPasses, just after the AA passes. So the
>> >>> imported functions stick around through global opt, constant merge,
>> >>> combining, and inlining. The call to GlobalDCE is just after inlining
>> >>> and a second globalopt.
>> >>>
>> >>> >
>> >>> > I imagine if GlobalDCE isn't removing available_externally functions
>> >>> > it's
>> >>> > because they're still useful in the optimization pipeline and
>> >>> > something
>> >>> > further down the pipe removes them (because they do, ultimately, get
>> >>> > removed).
>> >>>
>> >>> That would be good to know, since if they are useful later on in the
>> >>> pipe then presumably my imported functions could be as well. But I
>> >>> wonder who is removing them since GlobalDCE isn't (if there are refs)?
>> >>
>> >>
>> >> Yep - seems no one removes them (well, probably if there are no calls
>> >> to the
>> >> function at all they get removed as usual (eg: if all the calls are
>> >> inlined
>> >> we probably drop the function as we would for a linkonce_odr
>> >> function)). If
>> >> they're still around come codegen time, we just skip them:
>> >>
>> >> lib/CodeGen/MachineFunctionPass.cpp,
>> >> MachineFunctionPass::runOnFunction:
>> >>
>> >> 33| bool MachineFunctionPass::runOnFunction(Function &F) {
>> >> 34|   // Do not codegen any 'available_externally' functions at all,
>> >> they
>> >> have
>> >> 35|   // definitions outside the translation unit.
>> >> 36|   if (F.hasAvailableExternallyLinkage())
>> >> 37|     return false;
>> >> 38|
>> >> 39|   MachineFunction &MF =
>> >> getAnalysis<MachineFunctionAnalysis>().getMF();
>> >> 40|   return runOnMachineFunction(MF);
>> >> 41| }
>> >
>> > Ok, thanks for digging this up. I guess that would work for the
>> > imported functions as well, but it seems like a waste of compile time
>> > if they are not useful after inlining/global opt. Is there any reason
>> > to keep them past GlobalDCE or should I try changing GlobalDCE so that
>> > it removes them?
>> >
>>
>> I found allowing GlobalDCE to remove referenced
>> AvailableExternallyLinkage values earlier passes all the clang/llvm
>> tests (see patch below).
>
>
> The LLVM regression tests might not catch the sort of regression this change
> could cause. Usually we test each pass in isolation and tend towards white
> box testing (so we test the cases that are "interesting" according to the
> algorithm) - in this case, you're adding a condition (adding an
> "interesting" case that wasn't interesting before - so didn't need to be
> tested explicitly) that changes the behavior of GlobalDCE. This behavior
> may've been depended upon by further down-stream optimizations for overall
> performance.
>
>>
>> AFAICT, it shouldn't be necessary to keep
>> these functions around past the first call to GlobalDCE (which is
>> after inlining), but let me know if I missed something.
>
>
> It's possible they're still useful for optimizations other than inlining. We
> can still do constant propagation through such functions (hey, look, this
> function always returns 3, etc).

I thought this was done by passes such as IPSCCPPass and GlobalOpt
which happen earlier.

>
> It might be necessary/worth actually running perf numbers on such a change
> (or manually (possibly by asking those who know - not me, unfortunately)
> checking the passes that come after GlobalDCE to see if any would do cross
> function constant prop, etc. Perhaps someone else knows this off-the-cuff.

I can collect some perf data.

>
>>
>> Although they
>> will be removed eventually by MachineFunctionPass as David pointed
>> out,
>
>
> Interestingly they're not exactly /removed/, they're just not emitted. I
> don't know whether that's important.

Yes, true, suppressed not removed.

>
>>
>> so this is more of a compile-time improvement than anything. It
>> will affect ThinLTO more significantly in compile time because there
>> will be more functions with this linkage type, depending on how many
>> static functions
>
>
> Hmm - why static functions in particular? I imagine this attribute would be
> used for non-static functions too in ThinLTO.

Arg, yes, I made the same mistake in some earlier email - all "global"
functions (originally so and those static promoted) will be
available_externally. So this is a significant amount for ThinLTO.

>
>>
>> are imported that we have to promote.
>>
>>
>> Also, what about available_externally variable
>> definitions/initializations? I'm not sure what source construct
>> currently can generate those,
>
>
> I believe we emit C++ vtables as available_externally when the type has a
> key function. This allows devirtualization opportunities without bloating
> object file size with duplicate vtable data.
>
>>
>> but are these suppressed/turned into
>> declarations at some point?
>
>
> Maybe? Again, I would guess (and you can check some of this with
> -print-after-all to see how the IR changes as it goes through the
> optimizations) they might just be omitted from the generated code, not
> necessarily removed from the IR at any point.

Ok, I will investigate.

Thanks,
Teresa

>
>>
>> Note that ThinLTO imported file static
>> variables always need to be promoted, and will be marked
>> available_external.
>>
>> Index: lib/Transforms/IPO/GlobalDCE.cpp
>> ===================================================================
>> --- lib/Transforms/IPO/GlobalDCE.cpp (revision 237590)
>> +++ lib/Transforms/IPO/GlobalDCE.cpp (working copy)
>> @@ -162,7 +162,10 @@ bool GlobalDCE::runOnModule(Module &M) {
>>      // themselves.
>>      for (unsigned i = 0, e = DeadFunctions.size(); i != e; ++i) {
>>        RemoveUnusedGlobalValue(*DeadFunctions[i]);
>> -      M.getFunctionList().erase(DeadFunctions[i]);
>> +      // Might have deleted the body of an available externally function
>> that
>> +      // is still referenced. Leave the declaration.
>> +      if (DeadFunctions[i]->use_empty())
>> +        M.getFunctionList().erase(DeadFunctions[i]);
>>      }
>>      NumFunctions += DeadFunctions.size();
>>      Changed = true;
>> @@ -171,7 +174,10 @@ bool GlobalDCE::runOnModule(Module &M) {
>>    if (!DeadGlobalVars.empty()) {
>>      for (unsigned i = 0, e = DeadGlobalVars.size(); i != e; ++i) {
>>        RemoveUnusedGlobalValue(*DeadGlobalVars[i]);
>> -      M.getGlobalList().erase(DeadGlobalVars[i]);
>> +      // Might have deleted the definition of an available externally
>> function
>> +      // that is still referenced. Leave the declaration.
>> +      if (DeadGlobalVars[i]->use_empty())
>> +        M.getGlobalList().erase(DeadGlobalVars[i]);
>>      }
>>      NumVariables += DeadGlobalVars.size();
>>      Changed = true;
>> @@ -231,8 +237,10 @@ void GlobalDCE::GlobalIsNeeded(GlobalValue *G) {
>>      for (Function::iterator BB = F->begin(), E = F->end(); BB != E; ++BB)
>>        for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E;
>> ++I)
>>          for (User::op_iterator U = I->op_begin(), E = I->op_end(); U !=
>> E; ++U)
>> -          if (GlobalValue *GV = dyn_cast<GlobalValue>(*U))
>> -            GlobalIsNeeded(GV);
>> +          if (GlobalValue *GV = dyn_cast<GlobalValue>(*U)) {
>> +            if (!GV->hasAvailableExternallyLinkage())
>> +              GlobalIsNeeded(GV);
>> +          }
>>            else if (Constant *C = dyn_cast<Constant>(*U))
>>              MarkUsedGlobalsAsNeeded(C);
>>    }
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
>
>



-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413



More information about the llvm-dev mailing list