[llvm] r287710 - Before sample pgo annotation, do not inline a function that has no debug info. (NFC)

Robinson, Paul via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 29 12:42:24 PST 2016


If the original profile did have the function inlined, and attributed all the hits in the inlined code to the call site location - that's already potentially erroneous, right? If the called function has loops, conditionals, etc, then it's not all equally hot but appears that way in the profile. Should discriminators catch/diagnose that?

The problem is with functions that are inlined and are also marked 'nodebug' so there aren't any discriminators.
--paulr

From: David Blaikie [mailto:dblaikie at gmail.com]
Sent: Tuesday, November 29, 2016 12:00 PM
To: Dehao Chen
Cc: Robinson, Paul; llvm-commits
Subject: Re: [llvm] r287710 - Before sample pgo annotation, do not inline a function that has no debug info. (NFC)

(oh, and I tidied this up a bit in r288192 using the CallSite helper, hope that looks good/is OK)

On Tue, Nov 29, 2016 at 11:59 AM David Blaikie <dblaikie at gmail.com<mailto:dblaikie at gmail.com>> wrote:
On Tue, Nov 29, 2016 at 11:36 AM Dehao Chen <dehao at google.com<mailto:dehao at google.com>> wrote:
On Tue, Nov 29, 2016 at 10:55 AM, David Blaikie <dblaikie at gmail.com<mailto:dblaikie at gmail.com>> wrote:

On Mon, Nov 28, 2016 at 2:47 PM Dehao Chen <dehao at google.com<mailto:dehao at google.com>> wrote:
On Mon, Nov 28, 2016 at 2:32 PM, David Blaikie <dblaikie at gmail.com<mailto:dblaikie at gmail.com>> wrote:

On Mon, Nov 28, 2016 at 2:18 PM Dehao Chen <dehao at google.com<mailto:dehao at google.com>> wrote:
I thought NFC applies to "obvious change to fix corner-case bugs", and apparently I was wrong and I'm sorry about that.

Ah, right - NFC means No Functional Change. We use this to tag pure refactoring changes that do not change the observable behavior of the component in question. (eg: changing the API of SmallVector might not change the observable behavior of LLVM, but it probably still wouldn't be tagged NFC, because we tend to unit test ADTs directly - but changing some pass so that it still optimizes in the same way, but doesn't, say, build some intermediate data structure anymore, would be NFC because at the level we test that pass, nothing has changed)

What's the recommend steps to move forward for situations like this? Shall I revert the patch and send out a code review?

Doesn't seem like there's anything needed to be done - just an FYI of don't use 'NFC' when semantics/behavior do change. You included what appears to be appropriate test coverage, etc, so no worries there.

Thanks for the explanation.


On Mon, Nov 28, 2016 at 10:04 AM, David Blaikie <dblaikie at gmail.com<mailto:dblaikie at gmail.com>> wrote:

On Tue, Nov 22, 2016 at 2:59 PM Dehao Chen via llvm-commits <llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>> wrote:
Author: dehao
Date: Tue Nov 22 16:50:01 2016
New Revision: 287710

URL: http://llvm.org/viewvc/llvm-project?rev=287710&view=rev
Log:
Before sample pgo annotation, do not inline a function that has no debug info. (NFC)

(+1 to Paul's comment - this sounds like a change, has test changes, and is probably not NFC)

Also: This sounds like it does the opposite of what we want as a requirement: that the presence or absence of debug info should not change the resulting code.

What's the deal?

This particular pass requires debug info to make decisions.

Sorry, clearly not my area of understanding/expertise - could you give a brief description of why/how that is?

Sample profile is represented as a map from Debug Loc to sample count. The SampleProfileLoader pass reads in the profile, and then traverse the IR to read its debug loc and try to find a match from the map. If a match is found, then the count is used to annotate the IR. (The actual process is more complication, but this is the basic idea).

So this pass requires that all IR has DebugLoc available, otherwise it cannot find a match in the profile. As a result, we enforce debug loc to be available before this pass when building with -fprofile-sample-use

How does this relate to the choice of inlining, though? & if the algorithm has trouble with functions containing unassociated debug info (which I'm suspecting is where the inf loop came in? not sure) that could be a problem that might come up if it's fed more 'interesting' IR, even without making this inlining choice.

The inlining heuristic before sample pgo annotation is: if a callsite is inlined in the profiling binary, and the total sample of the inlined callee exceeds a threshold, we will inline the callsite before annotation.

The inf loop problem is, when the callee has no debug info, when inlining callee, all it's IR will inherit the debug info of the callsite (which is marked as should_inline, as it satisfied the above condition). If there is a recursive call in the callee, then this callsite will always inherit the debug info of that hot-inlined-callsite, and keep getting inlined. So the fix here is, we not only check if a callsite is inlined & hot in profiling binary, but also check if the callee has profile: if not, we will not inlining it because inlining it does not help annotation as no debug info is available to annotation.

Hrm.

It seems (admittedly naively) to me that something is a bit confused if that's how it all happens.

If the original profile did have the function inlined, and attributed all the hits in the inlined code to the call site location - that's already potentially erroneous, right? If the called function has loops, conditionals, etc, then it's not all equally hot but appears that way in the profile. Should discriminators catch/diagnose that?

If the profile was accurate, and it really was that hot all the way down, then the inlining would be reasonable/correct. (& would eventually terminate if modeled correctly - because the recursive call would eventually be not hot enough, right?)

*puzzles*

- Dave


Dehao


Perhaps you could explain further how the inf loop comes up/how this addresses it?

(& more out of curiosity, how the inlining choices are made, why they're important here)

Hopeful above explained the inline choice here. The reason it is important is:

1. it makes profile annotation natural and easy
2. it reserves profile info for all copies of hot code (context-sensitive profile). E.g. if foo is inlined into different callers with different behavior, all these behavior will be reserved in profile instead of merged into one profile.


This is mandated by pass manager: even with -g0, PM will make sure debug info is available before this pass.

The PM doesn't produce debug info, though - the client building IR (Clang, etc) does.

Right, it's the PM, but ParseCodeGenArgs (in frontend) that enforces debug info:

  // If the user requested to use a sample profile for PGO, then the
  // backend will need to track source location information so the profile
  // can be incorporated into the IR.
  if (!Opts.SampleProfileFile.empty())
    NeedLocTracking = true;


But if user manually removed the debug info, the code should also handle it correctly, which is fixed by this patch.

Dehao



If there is no debug info in the callee, inlining it will not help annotator. This avoids infinite loop as reported in PR/31119.

Added:
    llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof
    llvm/trunk/test/Transforms/SampleProfile/nodebug.ll
Modified:
    llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp
    llvm/trunk/test/Transforms/SampleProfile/early-inline.ll

Modified: llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp?rev=287710&r1=287709&r2=287710&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp (original)
+++ llvm/trunk/lib/Transforms/IPO/SampleProfile.cpp Tue Nov 22 16:50:01 2016
@@ -651,6 +651,8 @@ bool SampleProfileLoader::inlineHotFunct
       InvokeInst *II = dyn_cast<InvokeInst>(I);
       Function *CalledFunction =
           (CI == nullptr ? II->getCalledFunction() : CI->getCalledFunction());
+      if (!CalledFunction || !CalledFunction->getSubprogram())
+        continue;
       DebugLoc DLoc = I->getDebugLoc();
       uint64_t NumSamples = findCalleeFunctionSamples(*I)->getTotalSamples();
       if ((CI && InlineFunction(CI, IFI)) || (II && InlineFunction(II, IFI))) {

Added: llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof?rev=287710&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof (added)
+++ llvm/trunk/test/Transforms/SampleProfile/Inputs/nodebug.prof Tue Nov 22 16:50:01 2016
@@ -0,0 +1,2 @@
+foo:100:10
+ 0: bar:10

Modified: llvm/trunk/test/Transforms/SampleProfile/early-inline.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/early-inline.ll?rev=287710&r1=287709&r2=287710&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/SampleProfile/early-inline.ll (original)
+++ llvm/trunk/test/Transforms/SampleProfile/early-inline.ll Tue Nov 22 16:50:01 2016
@@ -28,7 +28,7 @@ define void @_Z3foov() #0 personality i8
 }

 ; Function Attrs: nounwind uwtable
-define internal void @_ZL3barv() #1 {
+define internal void @_ZL3barv() !dbg !12 {
   ret void
 }

@@ -45,3 +45,4 @@ declare i32 @__gxx_personality_v0(...)
 !9 = !DILocation(line: 6, column: 3, scope: !6)
 !10 = !DILocation(line: 8, column: 5, scope: !11)
 !11 = distinct !DILexicalBlock(scope: !6, file: !1, line: 7, column: 7)
+!12 = distinct !DISubprogram(linkageName: "_ZL3barv", scope: !1, line: 20, scopeLine: 20, unit: !0)

Added: llvm/trunk/test/Transforms/SampleProfile/nodebug.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SampleProfile/nodebug.ll?rev=287710&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/SampleProfile/nodebug.ll (added)
+++ llvm/trunk/test/Transforms/SampleProfile/nodebug.ll Tue Nov 22 16:50:01 2016
@@ -0,0 +1,20 @@
+; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/nodebug.prof
+
+define void @foo() !dbg !3 {
+  call void @bar(), !dbg !4
+  ret void
+}
+
+define void @bar() {
+  call void @bar()
+  ret void
+}
+
+!llvm.dbg.cu<http://llvm.dbg.cu> = !{!0}
+!llvm.module.flags = !{!2}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !1)
+!1 = !DIFile(filename: "t", directory: "/tmp/")
+!2 = !{i32 2, !"Debug Info Version", i32 3}
+!3 = distinct !DISubprogram(name: "a", scope: !1, file: !1, line: 10, unit: !0)
+!4 = !DILocation(line: 10, scope: !3)


_______________________________________________
llvm-commits mailing list
llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161129/21b0abbe/attachment.html>


More information about the llvm-commits mailing list