[llvm] r287710 - Before sample pgo annotation, do not inline a function that has no debug info. (NFC)

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 29 13:31:39 PST 2016


On Tue, Nov 29, 2016 at 1:19 PM Dehao Chen <dehao at google.com> wrote:

> On Tue, Nov 29, 2016 at 11:59 AM, David Blaikie <dblaikie at gmail.com>
> wrote:
>
>
>
> On Tue, Nov 29, 2016 at 11:36 AM Dehao Chen <dehao at google.com> wrote:
>
> On Tue, Nov 29, 2016 at 10:55 AM, David Blaikie <dblaikie at gmail.com>
> wrote:
>
>
>
> On Mon, Nov 28, 2016 at 2:47 PM Dehao Chen <dehao at google.com> wrote:
>
> On Mon, Nov 28, 2016 at 2:32 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Mon, Nov 28, 2016 at 2:18 PM Dehao Chen <dehao at google.com> wrote:
>
> I thought NFC applies to "obvious change to fix corner-case bugs", and
> apparently I was wrong and I'm sorry about that.
>
>
> Ah, right - NFC means No Functional Change. We use this to tag pure
> refactoring changes that do not change the observable behavior of the
> component in question. (eg: changing the API of SmallVector might not
> change the observable behavior of LLVM, but it probably still wouldn't be
> tagged NFC, because we tend to unit test ADTs directly - but changing some
> pass so that it still optimizes in the same way, but doesn't, say, build
> some intermediate data structure anymore, would be NFC because at the level
> we test that pass, nothing has changed)
>
>
> What's the recommend steps to move forward for situations like this? Shall
> I revert the patch and send out a code review?
>
>
> Doesn't seem like there's anything needed to be done - just an FYI of
> don't use 'NFC' when semantics/behavior do change. You included what
> appears to be appropriate test coverage, etc, so no worries there.
>
>
> Thanks for the explanation.
>
>
>
>
> On Mon, Nov 28, 2016 at 10:04 AM, David Blaikie <dblaikie at gmail.com>
> wrote:
>
>
>
> On Tue, Nov 22, 2016 at 2:59 PM Dehao Chen via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
> Author: dehao
> Date: Tue Nov 22 16:50:01 2016
> New Revision: 287710
>
> URL: http://llvm.org/viewvc/llvm-project?rev=287710&view=rev
> Log:
> Before sample pgo annotation, do not inline a function that has no debug
> info. (NFC)
>
>
> (+1 to Paul's comment - this sounds like a change, has test changes, and
> is probably not NFC)
>
> Also: This sounds like it does the opposite of what we want as a
> requirement: that the presence or absence of debug info should not change
> the resulting code.
>
> What's the deal?
>
>
> This particular pass requires debug info to make decisions.
>
>
> Sorry, clearly not my area of understanding/expertise - could you give a
> brief description of why/how that is?
>
>
> Sample profile is represented as a map from Debug Loc to sample count. The
> SampleProfileLoader pass reads in the profile, and then traverse the IR to
> read its debug loc and try to find a match from the map. If a match is
> found, then the count is used to annotate the IR. (The actual process is
> more complication, but this is the basic idea).
>
> So this pass requires that all IR has DebugLoc available, otherwise it
> cannot find a match in the profile. As a result, we enforce debug loc to be
> available before this pass when building with -fprofile-sample-use
>
>
> How does this relate to the choice of inlining, though? & if the algorithm
> has trouble with functions containing unassociated debug info (which I'm
> suspecting is where the inf loop came in? not sure) that could be a problem
> that might come up if it's fed more 'interesting' IR, even without making
> this inlining choice.
>
>
> The inlining heuristic before sample pgo annotation is: if a callsite is
> inlined in the profiling binary, and the total sample of the inlined callee
> exceeds a threshold, we will inline the callsite before annotation.
>
> The inf loop problem is, when the callee has no debug info, when inlining
> callee, all it's IR will inherit the debug info of the callsite (which is
> marked as should_inline, as it satisfied the above condition). If there is
> a recursive call in the callee, then this callsite will always inherit the
> debug info of that hot-inlined-callsite, and keep getting inlined. So the
> fix here is, we not only check if a callsite is inlined & hot in profiling
> binary, but also check if the callee has profile: if not, we will not
> inlining it because inlining it does not help annotation as no debug info
> is available to annotation.
>
>
> Hrm.
>
> It seems (admittedly naively) to me that something is a bit confused if
> that's how it all happens.
>
> If the original profile did have the function inlined, and attributed all
> the hits in the inlined code to the call site location - that's already
> potentially erroneous, right? If the called function has loops,
> conditionals, etc, then it's not all equally hot but appears that way in
> the profile.
>
>
>
I think I'm still missing things - but don't want to add more noise/waste
your time, so I'll leave this & perhaps chat about it in person or on IRC,
etc. I do appreciate you taking the time you have to help me understand
what's going on here.

- Dave


> I suppose the question is about general inlining case, not the
> no-debug-info case.
>
> If a callsite is cold, and the callee is hot due to loops etc, we will
> inline the callsite so that it may expose further optimization
> opportunities to the hot callee. This is hot-callee heuristic is also used
> by instr-based PGO.
>
>
> Should discriminators catch/diagnose that?
>
> If the profile was accurate, and it really was that hot all the way down,
> then the inlining would be reasonable/correct. (& would eventually
> terminate if modeled correctly - because the recursive call would
> eventually be not hot enough, right?)
>
>
> I suppose this goes back to the no-debug info case.
>
> The inlining happens before annotation. The purpose is to make the IR
> resemble the profile. So at this stage, profile is only used to find hot
> inline instance (not hot callsite) to inline. Scaling of count does not
> happen here because profile is not annotated yet. So the problem, as Paul
> pointed out, is that without debug info, the recursive callsite will have
> incorrect DebugLoc and will be mistakenly marked as *should_inline*.
>
> Dehao
>
>
>
> *puzzles*
>
> - Dave
>
>
>
> Dehao
>
>
>
> Perhaps you could explain further how the inf loop comes up/how this
> addresses it?
>
> (& more out of curiosity, how the inlining choices are made, why they're
> important here)
>
>
> Hopeful above explained the inline choice here. The reason it is important
> is:
>
> 1. it makes profile annotation natural and easy
> 2. it reserves profile info for all copies of hot code (context-sensitive
> profile). E.g. if foo is inlined into different callers with different
> behavior, all these behavior will be reserved in profile instead of merged
> into one profile.
>
>
>
> This is mandated by pass manager: even with -g0, PM will make sure debug
> info is available before this pass.
>
>
> The PM doesn't produce debug info, though - the client building IR (Clang,
> etc) does.
>
>
> Right, it's the PM, but ParseCodeGenArgs (in frontend) that enforces debug
> info:
>
>   // If the user requested to use a sample profile for PGO, then the
>   // backend will need to track source location information so the profile
>   // can be incorporated into the IR.
>   if (!Opts.SampleProfileFile.empty())
>     NeedLocTracking = true;
>
>
>
> But if user manually removed the debug info, the code should also handle
> it correctly, which is fixed by this patch.
>
> Dehao
>
>
>
>
>
> If there is no debug info in the callee, inlining it will not help
> annotator. This avoids infinite loop as reported in PR/31119.
>
> Added:
>     llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof
>     llvm/trunk/test/Transforms/SampleProfile/nodebug.ll
> Modified:
>     llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp
>     llvm/trunk/test/Transforms/SampleProfile/early-inline.ll
>
> Modified: llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp?rev=287710&r1=287709&r2=287710&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp (original)
> +++ llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp Tue Nov 22 16:50:01
> 2016
> @@ -651,6 +651,8 @@ bool SampleProfileLoader::inlineHotFunct
>        InvokeInst *II = dyn_cast<InvokeInst>(I);
>        Function *CalledFunction =
>            (CI == nullptr ? II->getCalledFunction() :
> CI->getCalledFunction());
> +      if (!CalledFunction || !CalledFunction->getSubprogram())
> +        continue;
>        DebugLoc DLoc = I->getDebugLoc();
>        uint64_t NumSamples =
> findCalleeFunctionSamples(*I)->getTotalSamples();
>        if ((CI && InlineFunction(CI, IFI)) || (II && InlineFunction(II,
> IFI))) {
>
> Added: llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof?rev=287710&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof (added)
> +++ llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof Tue Nov
> 22 16:50:01 2016
> @@ -0,0 +1,2 @@
> +foo:100:10
> + 0: bar:10
>
> Modified: llvm/trunk/test/Transforms/SampleProfile/early-inline.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/early-inline.ll?rev=287710&r1=287709&r2=287710&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/SampleProfile/early-inline.ll (original)
> +++ llvm/trunk/test/Transforms/SampleProfile/early-inline.ll Tue Nov 22
> 16:50:01 2016
> @@ -28,7 +28,7 @@ define void @_Z3foov() #0 personality i8
>  }
>
>  ; Function Attrs: nounwind uwtable
> -define internal void @_ZL3barv() #1 {
> +define internal void @_ZL3barv() !dbg !12 {
>    ret void
>  }
>
> @@ -45,3 +45,4 @@ declare i32 @__gxx_personality_v0(...)
>  !9 = !DILocation(line: 6, column: 3, scope: !6)
>  !10 = !DILocation(line: 8, column: 5, scope: !11)
>  !11 = distinct !DILexicalBlock(scope: !6, file: !1, line: 7, column: 7)
> +!12 = distinct !DISubprogram(linkageName: "_ZL3barv", scope: !1, line:
> 20, scopeLine: 20, unit: !0)
>
> Added: llvm/trunk/test/Transforms/SampleProfile/nodebug.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/nodebug.ll?rev=287710&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/SampleProfile/nodebug.ll (added)
> +++ llvm/trunk/test/Transforms/SampleProfile/nodebug.ll Tue Nov 22
> 16:50:01 2016
> @@ -0,0 +1,20 @@
> +; RUN: opt < %s -sample-profile
> -sample-profile-file=%S/Inputs/nodebug.prof
> +
> +define void @foo() !dbg !3 {
> +  call void @bar(), !dbg !4
> +  ret void
> +}
> +
> +define void @bar() {
> +  call void @bar()
> +  ret void
> +}
> +
> +!llvm.dbg.cu = !{!0}
> +!llvm.module.flags = !{!2}
> +
> +!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1)
> +!1 = !DIFile(filename: "t", directory: "/tmp/")
> +!2 = !{i32 2, !"Debug Info Version", i32 3}
> +!3 = distinct !DISubprogram(name: "a", scope: !1, file: !1, line: 10,
> unit: !0)
> +!4 = !DILocation(line: 10, scope: !3)
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161129/cfc4ce80/attachment-0001.html>


More information about the llvm-commits mailing list