[llvm] r287710 - Before sample pgo annotation, do not inline a function that has no debug info. (NFC)

Dehao Chen via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 29 13:19:38 PST 2016


On Tue, Nov 29, 2016 at 11:59 AM, David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Tue, Nov 29, 2016 at 11:36 AM Dehao Chen <dehao at google.com> wrote:
>
>> On Tue, Nov 29, 2016 at 10:55 AM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>>
>>
>>
>> On Mon, Nov 28, 2016 at 2:47 PM Dehao Chen <dehao at google.com> wrote:
>>
>> On Mon, Nov 28, 2016 at 2:32 PM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>>
>>
>>
>> On Mon, Nov 28, 2016 at 2:18 PM Dehao Chen <dehao at google.com> wrote:
>>
>> I thought NFC applies to "obvious change to fix corner-case bugs", and
>> apparently I was wrong and I'm sorry about that.
>>
>>
>> Ah, right - NFC means No Functional Change. We use this to tag pure
>> refactoring changes that do not change the observable behavior of the
>> component in question. (eg: changing the API of SmallVector might not
>> change the observable behavior of LLVM, but it probably still wouldn't be
>> tagged NFC, because we tend to unit test ADTs directly - but changing some
>> pass so that it still optimizes in the same way, but doesn't, say, build
>> some intermediate data structure anymore, would be NFC because at the level
>> we test that pass, nothing has changed)
>>
>>
>> What's the recommend steps to move forward for situations like this?
>> Shall I revert the patch and send out a code review?
>>
>>
>> Doesn't seem like there's anything needed to be done - just an FYI of
>> don't use 'NFC' when semantics/behavior do change. You included what
>> appears to be appropriate test coverage, etc, so no worries there.
>>
>>
>> Thanks for the explanation.
>>
>>
>>
>>
>> On Mon, Nov 28, 2016 at 10:04 AM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>>
>>
>>
>> On Tue, Nov 22, 2016 at 2:59 PM Dehao Chen via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>> Author: dehao
>> Date: Tue Nov 22 16:50:01 2016
>> New Revision: 287710
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=287710&view=rev
>> Log:
>> Before sample pgo annotation, do not inline a function that has no debug
>> info. (NFC)
>>
>>
>> (+1 to Paul's comment - this sounds like a change, has test changes, and
>> is probably not NFC)
>>
>> Also: This sounds like it does the opposite of what we want as a
>> requirement: that the presence or absence of debug info should not change
>> the resulting code.
>>
>> What's the deal?
>>
>>
>> This particular pass requires debug info to make decisions.
>>
>>
>> Sorry, clearly not my area of understanding/expertise - could you give a
>> brief description of why/how that is?
>>
>>
>> Sample profile is represented as a map from Debug Loc to sample count.
>> The SampleProfileLoader pass reads in the profile, and then traverse the IR
>> to read its debug loc and try to find a match from the map. If a match is
>> found, then the count is used to annotate the IR. (The actual process is
>> more complication, but this is the basic idea).
>>
>> So this pass requires that all IR has DebugLoc available, otherwise it
>> cannot find a match in the profile. As a result, we enforce debug loc to be
>> available before this pass when building with -fprofile-sample-use
>>
>>
>> How does this relate to the choice of inlining, though? & if the
>> algorithm has trouble with functions containing unassociated debug info
>> (which I'm suspecting is where the inf loop came in? not sure) that could
>> be a problem that might come up if it's fed more 'interesting' IR, even
>> without making this inlining choice.
>>
>>
>> The inlining heuristic before sample pgo annotation is: if a callsite is
>> inlined in the profiling binary, and the total sample of the inlined callee
>> exceeds a threshold, we will inline the callsite before annotation.
>>
>> The inf loop problem is, when the callee has no debug info, when inlining
>> callee, all it's IR will inherit the debug info of the callsite (which is
>> marked as should_inline, as it satisfied the above condition). If there is
>> a recursive call in the callee, then this callsite will always inherit the
>> debug info of that hot-inlined-callsite, and keep getting inlined. So the
>> fix here is, we not only check if a callsite is inlined & hot in profiling
>> binary, but also check if the callee has profile: if not, we will not
>> inlining it because inlining it does not help annotation as no debug info
>> is available to annotation.
>>
>
> Hrm.
>
> It seems (admittedly naively) to me that something is a bit confused if
> that's how it all happens.
>
> If the original profile did have the function inlined, and attributed all
> the hits in the inlined code to the call site location - that's already
> potentially erroneous, right? If the called function has loops,
> conditionals, etc, then it's not all equally hot but appears that way in
> the profile.
>

I suppose the question is about general inlining case, not the
no-debug-info case.

If a callsite is cold, and the callee is hot due to loops etc, we will
inline the callsite so that it may expose further optimization
opportunities to the hot callee. This is hot-callee heuristic is also used
by instr-based PGO.


> Should discriminators catch/diagnose that?
>
> If the profile was accurate, and it really was that hot all the way down,
> then the inlining would be reasonable/correct. (& would eventually
> terminate if modeled correctly - because the recursive call would
> eventually be not hot enough, right?)
>

I suppose this goes back to the no-debug info case.

The inlining happens before annotation. The purpose is to make the IR
resemble the profile. So at this stage, profile is only used to find hot
inline instance (not hot callsite) to inline. Scaling of count does not
happen here because profile is not annotated yet. So the problem, as Paul
pointed out, is that without debug info, the recursive callsite will have
incorrect DebugLoc and will be mistakenly marked as *should_inline*.

Dehao


>
> *puzzles*
>
> - Dave
>
>
>>
>> Dehao
>>
>>
>>
>> Perhaps you could explain further how the inf loop comes up/how this
>> addresses it?
>>
>> (& more out of curiosity, how the inlining choices are made, why they're
>> important here)
>>
>>
>> Hopeful above explained the inline choice here. The reason it is
>> important is:
>>
>> 1. it makes profile annotation natural and easy
>> 2. it reserves profile info for all copies of hot code (context-sensitive
>> profile). E.g. if foo is inlined into different callers with different
>> behavior, all these behavior will be reserved in profile instead of merged
>> into one profile.
>>
>>
>>
>> This is mandated by pass manager: even with -g0, PM will make sure debug
>> info is available before this pass.
>>
>>
>> The PM doesn't produce debug info, though - the client building IR
>> (Clang, etc) does.
>>
>>
>> Right, it's the PM, but ParseCodeGenArgs (in frontend) that enforces
>> debug info:
>>
>>   // If the user requested to use a sample profile for PGO, then the
>>   // backend will need to track source location information so the profile
>>   // can be incorporated into the IR.
>>   if (!Opts.SampleProfileFile.empty())
>>     NeedLocTracking = true;
>>
>>
>>
>> But if user manually removed the debug info, the code should also handle
>> it correctly, which is fixed by this patch.
>>
>> Dehao
>>
>>
>>
>>
>>
>> If there is no debug info in the callee, inlining it will not help
>> annotator. This avoids infinite loop as reported in PR/31119.
>>
>> Added:
>>     llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof
>>     llvm/trunk/test/Transforms/SampleProfile/nodebug.ll
>> Modified:
>>     llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp
>>     llvm/trunk/test/Transforms/SampleProfile/early-inline.ll
>>
>> Modified: llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/
>> Transforms/IPO/SampleProfile.cpp?rev=287710&r1=287709&r2=287710&view=diff
>> ============================================================
>> ==================
>> --- llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp (original)
>> +++ llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp Tue Nov 22 16:50:01
>> 2016
>> @@ -651,6 +651,8 @@ bool SampleProfileLoader::inlineHotFunct
>>        InvokeInst *II = dyn_cast<InvokeInst>(I);
>>        Function *CalledFunction =
>>            (CI == nullptr ? II->getCalledFunction() :
>> CI->getCalledFunction());
>> +      if (!CalledFunction || !CalledFunction->getSubprogram())
>> +        continue;
>>        DebugLoc DLoc = I->getDebugLoc();
>>        uint64_t NumSamples = findCalleeFunctionSamples(*I)-
>> >getTotalSamples();
>>        if ((CI && InlineFunction(CI, IFI)) || (II && InlineFunction(II,
>> IFI))) {
>>
>> Added: llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/
>> Transforms/SampleProfile/Inputs/nodebug.prof?rev=287710&view=auto
>> ============================================================
>> ==================
>> --- llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof (added)
>> +++ llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof Tue Nov
>> 22 16:50:01 2016
>> @@ -0,0 +1,2 @@
>> +foo:100:10
>> + 0: bar:10
>>
>> Modified: llvm/trunk/test/Transforms/SampleProfile/early-inline.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/
>> Transforms/SampleProfile/early-inline.ll?rev=287710&r1=
>> 287709&r2=287710&view=diff
>> ============================================================
>> ==================
>> --- llvm/trunk/test/Transforms/SampleProfile/early-inline.ll (original)
>> +++ llvm/trunk/test/Transforms/SampleProfile/early-inline.ll Tue Nov 22
>> 16:50:01 2016
>> @@ -28,7 +28,7 @@ define void @_Z3foov() #0 personality i8
>>  }
>>
>>  ; Function Attrs: nounwind uwtable
>> -define internal void @_ZL3barv() #1 {
>> +define internal void @_ZL3barv() !dbg !12 {
>>    ret void
>>  }
>>
>> @@ -45,3 +45,4 @@ declare i32 @__gxx_personality_v0(...)
>>  !9 = !DILocation(line: 6, column: 3, scope: !6)
>>  !10 = !DILocation(line: 8, column: 5, scope: !11)
>>  !11 = distinct !DILexicalBlock(scope: !6, file: !1, line: 7, column: 7)
>> +!12 = distinct !DISubprogram(linkageName: "_ZL3barv", scope: !1, line:
>> 20, scopeLine: 20, unit: !0)
>>
>> Added: llvm/trunk/test/Transforms/SampleProfile/nodebug.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/
>> Transforms/SampleProfile/nodebug.ll?rev=287710&view=auto
>> ============================================================
>> ==================
>> --- llvm/trunk/test/Transforms/SampleProfile/nodebug.ll (added)
>> +++ llvm/trunk/test/Transforms/SampleProfile/nodebug.ll Tue Nov 22
>> 16:50:01 2016
>> @@ -0,0 +1,20 @@
>> +; RUN: opt < %s -sample-profile -sample-profile-file=%S/
>> Inputs/nodebug.prof
>> +
>> +define void @foo() !dbg !3 {
>> +  call void @bar(), !dbg !4
>> +  ret void
>> +}
>> +
>> +define void @bar() {
>> +  call void @bar()
>> +  ret void
>> +}
>> +
>> +!llvm.dbg.cu = !{!0}
>> +!llvm.module.flags = !{!2}
>> +
>> +!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1)
>> +!1 = !DIFile(filename: "t", directory: "/tmp/")
>> +!2 = !{i32 2, !"Debug Info Version", i32 3}
>> +!3 = distinct !DISubprogram(name: "a", scope: !1, file: !1, line: 10,
>> unit: !0)
>> +!4 = !DILocation(line: 10, scope: !3)
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161129/c5ef0be9/attachment.html>


More information about the llvm-commits mailing list