<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 19, 2015 at 1:05 PM, Teresa Johnson <span dir="ltr"><<a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Tue, May 19, 2015 at 12:41 PM, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>> wrote:<br>
><br>
><br>
> On Mon, May 18, 2015 at 9:09 PM, Teresa Johnson <<a href="mailto:tejohnson@google.com">tejohnson@google.com</a>><br>
> wrote:<br>
>><br>
>> On Fri, May 15, 2015 at 11:20 PM, Teresa Johnson <<a href="mailto:tejohnson@google.com">tejohnson@google.com</a>><br>
>> wrote:<br>
>> > On Fri, May 15, 2015 at 2:52 PM, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>><br>
>> > wrote:<br>
>> >><br>
>> >><br>
>> >> On Fri, May 15, 2015 at 1:15 PM, Teresa Johnson <<a href="mailto:tejohnson@google.com">tejohnson@google.com</a>><br>
>> >> wrote:<br>
>> >>><br>
>> >>> On Fri, May 15, 2015 at 10:04 AM, David Blaikie <<a href="mailto:dblaikie@gmail.com">dblaikie@gmail.com</a>><br>
>> >>> wrote:<br>
>> >>> ><br>
>> >>> ><br>
>> >>> > On Fri, May 15, 2015 at 9:53 AM, Teresa Johnson<br>
>> >>> > <<a href="mailto:tejohnson@google.com">tejohnson@google.com</a>><br>
>> >>> > wrote:<br>
>> >>> >><br>
>> >>> >> Looking back through my GlobalDCE changes, it looks like one of the<br>
>> >>> >> places I had changed (where we mark defined globals in runOnModule)<br>
>> >>> >> already has a guard for !hasAvailableExternallyLinkage and<br>
>> >>> >> !isDiscardableIfUnused, so my additional guard against marking<br>
>> >>> >> imported functions is unnecessary. But the other place I had to<br>
>> >>> >> change<br>
>> >>> >> was in GlobalIsNeeded where it walks through the function and<br>
>> >>> >> recursively marks any referenced global as needed. Here there was<br>
>> >>> >> no<br>
>> >>> >> guard against marking a global that is available externally as<br>
>> >>> >> needed<br>
>> >>> >> if it is referenced. I had added a check here to not mark imported<br>
>> >>> >> functions as needed on reference unless they were discardable (i.e.<br>
>> >>> >> link once). Is this a bug - should this have a guard against<br>
>> >>> >> marking<br>
>> >>> >> available externally function refs as needed?<br>
>> >>> ><br>
>> >>> ><br>
>> >>> > Duncan's probably got a better idea of what the right answer is<br>
>> >>> > here. I<br>
>> >>> > suspect "yes".<br>
>> >>> ><br>
>> >>> > The trick with available_externally is to ensure we keep these<br>
>> >>> > around<br>
>> >>> > for<br>
>> >>> > long enough that their definitions are useful (for inlining,<br>
>> >>> > constant<br>
>> >>> > prop,<br>
>> >>> > all that good stuff) but remove them before we actually do too much<br>
>> >>> > work<br>
>> >>> > on<br>
>> >>> > them, or at least before we emit them into the final object file.<br>
>> >>><br>
>> >>> Yep, and that is exactly how long I want my imported functions to<br>
>> >>> stick around. Currently I have the ThinLTO import pass added at the<br>
>> >>> start of addLTOOptimizationPasses, just after the AA passes. So the<br>
>> >>> imported functions stick around through global opt, constant merge,<br>
>> >>> combining, and inlining. The call to GlobalDCE is just after inlining<br>
>> >>> and a second globalopt.<br>
>> >>><br>
>> >>> ><br>
>> >>> > I imagine if GlobalDCE isn't removing available_externally functions<br>
>> >>> > it's<br>
>> >>> > because they're still useful in the optimization pipeline and<br>
>> >>> > something<br>
>> >>> > further down the pipe removes them (because they do, ultimately, get<br>
>> >>> > removed).<br>
>> >>><br>
>> >>> That would be good to know, since if they are useful later on in the<br>
>> >>> pipe then presumably my imported functions could be as well. But I<br>
>> >>> wonder who is removing them since GlobalDCE isn't (if there are refs)?<br>
>> >><br>
>> >><br>
>> >> Yep - seems no one removes them (well, probably if there are no calls<br>
>> >> to the<br>
>> >> function at all they get removed as usual (eg: if all the calls are<br>
>> >> inlined<br>
>> >> we probably drop the function as we would for a linkonce_odr<br>
>> >> function)). If<br>
>> >> they're still around come codegen time, we just skip them:<br>
>> >><br>
>> >> lib/CodeGen/MachineFunctionPass.cpp,<br>
>> >> MachineFunctionPass::runOnFunction:<br>
>> >><br>
>> >> 33| bool MachineFunctionPass::runOnFunction(Function &F) {<br>
>> >> 34| // Do not codegen any 'available_externally' functions at all,<br>
>> >> they<br>
>> >> have<br>
>> >> 35| // definitions outside the translation unit.<br>
>> >> 36| if (F.hasAvailableExternallyLinkage())<br>
>> >> 37| return false;<br>
>> >> 38|<br>
>> >> 39| MachineFunction &MF =<br>
>> >> getAnalysis<MachineFunctionAnalysis>().getMF();<br>
>> >> 40| return runOnMachineFunction(MF);<br>
>> >> 41| }<br>
>> ><br>
>> > Ok, thanks for digging this up. I guess that would work for the<br>
>> > imported functions as well, but it seems like a waste of compile time<br>
>> > if they are not useful after inlining/global opt. Is there any reason<br>
>> > to keep them past GlobalDCE or should I try changing GlobalDCE so that<br>
>> > it removes them?<br>
>> ><br>
>><br>
>> I found allowing GlobalDCE to remove referenced<br>
>> AvailableExternallyLinkage values earlier passes all the clang/llvm<br>
>> tests (see patch below).<br>
><br>
><br>
> The LLVM regression tests might not catch the sort of regression this change<br>
> could cause. Usually we test each pass in isolation and tend towards white<br>
> box testing (so we test the cases that are "interesting" according to the<br>
> algorithm) - in this case, you're adding a condition (adding an<br>
> "interesting" case that wasn't interesting before - so didn't need to be<br>
> tested explicitly) that changes the behavior of GlobalDCE. This behavior<br>
> may've been depended upon by further down-stream optimizations for overall<br>
> performance.<br>
><br>
>><br>
>> AFAICT, it shouldn't be necessary to keep<br>
>> these functions around past the first call to GlobalDCE (which is<br>
>> after inlining), but let me know if I missed something.<br>
><br>
><br>
> It's possible they're still useful for optimizations other than inlining. We<br>
> can still do constant propagation through such functions (hey, look, this<br>
> function always returns 3, etc).<br>
<br>
</div></div>I thought this was done by passes such as IPSCCPPass and GlobalOpt<br>
which happen earlier.<br></blockquote><div><br>You might well be right - I know next to nothing about these things. Having someone a double-check/review from someone more familiar would be nice.<br><br>I'll leave the rest of the review to any such someone who might be sufficiently informed about the pass pipeline to decide whether this is the right call. (and/or perf numbers that demonstrate it)<br><br>Hopefully my naive review has at least covered the basics.<br><br>- David<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span class=""><br>
><br>
> It might be necessary/worth actually running perf numbers on such a change<br>
> (or manually (possibly by asking those who know - not me, unfortunately)<br>
> checking the passes that come after GlobalDCE to see if any would do cross<br>
> function constant prop, etc. Perhaps someone else knows this off-the-cuff.<br>
<br>
</span>I can collect some perf data.<br>
<span class=""><br>
><br>
>><br>
>> Although they<br>
>> will be removed eventually by MachineFunctionPass as David pointed<br>
>> out,<br>
><br>
><br>
> Interestingly they're not exactly /removed/, they're just not emitted. I<br>
> don't know whether that's important.<br>
<br>
</span>Yes, true, suppressed not removed.<br>
<span class=""><br>
><br>
>><br>
>> so this is more of a compile-time improvement than anything. It<br>
>> will affect ThinLTO more significantly in compile time because there<br>
>> will be more functions with this linkage type, depending on how many<br>
>> static functions<br>
><br>
><br>
> Hmm - why static functions in particular? I imagine this attribute would be<br>
> used for non-static functions too in ThinLTO.<br>
<br>
</span>Arg, yes, I made the same mistake in some earlier email - all "global"<br>
functions (originally so and those static promoted) will be<br>
available_externally. So this is a significant amount for ThinLTO.<br>
<span class=""><br>
><br>
>><br>
>> are imported that we have to promote.<br>
>><br>
>><br>
>> Also, what about available_externally variable<br>
>> definitions/initializations? I'm not sure what source construct<br>
>> currently can generate those,<br>
><br>
><br>
> I believe we emit C++ vtables as available_externally when the type has a<br>
> key function. This allows devirtualization opportunities without bloating<br>
> object file size with duplicate vtable data.<br>
><br>
>><br>
>> but are these suppressed/turned into<br>
>> declarations at some point?<br>
><br>
><br>
> Maybe? Again, I would guess (and you can check some of this with<br>
> -print-after-all to see how the IR changes as it goes through the<br>
> optimizations) they might just be omitted from the generated code, not<br>
> necessarily removed from the IR at any point.<br>
<br>
</span>Ok, I will investigate.<br>
<br>
Thanks,<br>
Teresa<br>
<div class="HOEnZb"><div class="h5"><br>
><br>
>><br>
>> Note that ThinLTO imported file static<br>
>> variables always need to be promoted, and will be marked<br>
>> available_external.<br>
>><br>
>> Index: lib/Transforms/IPO/GlobalDCE.cpp<br>
>> ===================================================================<br>
>> --- lib/Transforms/IPO/GlobalDCE.cpp (revision 237590)<br>
>> +++ lib/Transforms/IPO/GlobalDCE.cpp (working copy)<br>
>> @@ -162,7 +162,10 @@ bool GlobalDCE::runOnModule(Module &M) {<br>
>> // themselves.<br>
>> for (unsigned i = 0, e = DeadFunctions.size(); i != e; ++i) {<br>
>> RemoveUnusedGlobalValue(*DeadFunctions[i]);<br>
>> - M.getFunctionList().erase(DeadFunctions[i]);<br>
>> + // Might have deleted the body of an available externally function<br>
>> that<br>
>> + // is still referenced. Leave the declaration.<br>
>> + if (DeadFunctions[i]->use_empty())<br>
>> + M.getFunctionList().erase(DeadFunctions[i]);<br>
>> }<br>
>> NumFunctions += DeadFunctions.size();<br>
>> Changed = true;<br>
>> @@ -171,7 +174,10 @@ bool GlobalDCE::runOnModule(Module &M) {<br>
>> if (!DeadGlobalVars.empty()) {<br>
>> for (unsigned i = 0, e = DeadGlobalVars.size(); i != e; ++i) {<br>
>> RemoveUnusedGlobalValue(*DeadGlobalVars[i]);<br>
>> - M.getGlobalList().erase(DeadGlobalVars[i]);<br>
>> + // Might have deleted the definition of an available externally<br>
>> function<br>
>> + // that is still referenced. Leave the declaration.<br>
>> + if (DeadGlobalVars[i]->use_empty())<br>
>> + M.getGlobalList().erase(DeadGlobalVars[i]);<br>
>> }<br>
>> NumVariables += DeadGlobalVars.size();<br>
>> Changed = true;<br>
>> @@ -231,8 +237,10 @@ void GlobalDCE::GlobalIsNeeded(GlobalValue *G) {<br>
>> for (Function::iterator BB = F->begin(), E = F->end(); BB != E; ++BB)<br>
>> for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E;<br>
>> ++I)<br>
>> for (User::op_iterator U = I->op_begin(), E = I->op_end(); U !=<br>
>> E; ++U)<br>
>> - if (GlobalValue *GV = dyn_cast<GlobalValue>(*U))<br>
>> - GlobalIsNeeded(GV);<br>
>> + if (GlobalValue *GV = dyn_cast<GlobalValue>(*U)) {<br>
>> + if (!GV->hasAvailableExternallyLinkage())<br>
>> + GlobalIsNeeded(GV);<br>
>> + }<br>
>> else if (Constant *C = dyn_cast<Constant>(*U))<br>
>> MarkUsedGlobalsAsNeeded(C);<br>
>> }<br>
>><br>
>><br>
>><br>
>> --<br>
>> Teresa Johnson | Software Engineer | <a href="mailto:tejohnson@google.com">tejohnson@google.com</a> | <a href="tel:408-460-2413" value="+14084602413">408-460-2413</a><br>
><br>
><br>
<br>
<br>
<br>
--<br>
Teresa Johnson | Software Engineer | <a href="mailto:tejohnson@google.com">tejohnson@google.com</a> | <a href="tel:408-460-2413" value="+14084602413">408-460-2413</a><br>
</div></div></blockquote></div><br></div></div>