[PATCH] D36976: [Inliner] Tweak the condition for determining cold callsites.

Easwaran Raman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 21 12:21:42 PDT 2017


eraman created this revision.

This addresses the following issue. Consider a call chain A->B->C, where
the B->C callsite is guarded by an if with __builtin_expect(cond, 0).
When BFI is computed for B, it essentially gives a non-zero value to the
coldest reachable block and scales all blocks including the entry based
on that. The inliner then skips the B->C callsite since it is cold and
then, let us assume, inlines B into A. Now, we incrementally update A's BFI
by including the newly cloned blocks. This incremental update doesn't
update the entry BFI of A (since we only want to update the frequencies
of newly cloned blocks). As a result, A's entry has some low block
frequency and the B->C clone has a frequency of 0. This results in
CallSiteFreq < CallerEntryFreq * ColdProb evaluating to false since
both LHS and RHS are 0. This tweak ensures that the cloned callsite is
still treated as cold. An alternative is to explicitly check if
CallSiteFreq is 0.
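
To make the arithmetic concrete, here is a minimal standalone sketch of
the comparison before and after the tweak. It is not the InlineCost.cpp
code: scaleByPercent, isColdOld and isColdNew are hypothetical helpers,
plain integers stand in for BlockFrequency, and the scaling is assumed to
truncate. The entry frequency of 8 and the cold percentage of 2 are
illustrative values only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for BlockFrequency * BranchProbability: scale Freq
// by Percent/100 using truncating integer division (an assumption here).
constexpr uint64_t scaleByPercent(uint64_t Freq, uint64_t Percent) {
  return Freq * Percent / 100;
}

// Condition before the patch: strict comparison.
constexpr bool isColdOld(uint64_t CallSiteFreq, uint64_t CallerEntryFreq,
                         uint64_t ColdPercent) {
  return CallSiteFreq < scaleByPercent(CallerEntryFreq, ColdPercent);
}

// Condition after the patch: non-strict comparison.
constexpr bool isColdNew(uint64_t CallSiteFreq, uint64_t CallerEntryFreq,
                         uint64_t ColdPercent) {
  return CallSiteFreq <= scaleByPercent(CallerEntryFreq, ColdPercent);
}

// Low entry frequency: 8 * 2 / 100 truncates to 0, so a callsite with
// frequency 0 compares 0 < 0 (not cold) versus 0 <= 0 (cold).
static_assert(!isColdOld(0, 8, 2), "old check misses the zero-frequency clone");
static_assert(isColdNew(0, 8, 2), "new check treats the clone as cold");

// A callsite as frequent as the entry is not cold under either check.
static_assert(!isColdOld(8, 8, 2) && !isColdNew(8, 8, 2), "hot callsite");

// With a larger entry frequency the two checks differ only at the boundary.
static_assert(isColdOld(0, 1000, 2) && isColdNew(0, 1000, 2), "clearly cold");
static_assert(!isColdOld(20, 1000, 2) && isColdNew(20, 1000, 2), "boundary");
```

The last assertion shows a side effect of using <= in this model: a
callsite whose frequency exactly equals the scaled threshold is now also
classified as cold, which the explicit CallSiteFreq == 0 alternative
would not do.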


https://reviews.llvm.org/D36976

Files:
  lib/Analysis/InlineCost.cpp
  test/Transforms/Inline/inline-cold-callsite2.ll


Index: test/Transforms/Inline/inline-cold-callsite2.ll
===================================================================
--- /dev/null
+++ test/Transforms/Inline/inline-cold-callsite2.ll
@@ -0,0 +1,35 @@
+; RUN: opt < %s -passes='require<profile-summary>,cgscc(inline)' -inline-cold-callsite-threshold=0 -S | FileCheck %s
+define void @bar(i32 %a) local_unnamed_addr {
+entry:
+  tail call void @baz(i32 %a)
+  tail call void @baz(i32 %a)
+  tail call void @baz(i32 %a)
+  ret void
+}
+
+declare void @baz(i32) local_unnamed_addr
+
+define void @foo(i32 %a) local_unnamed_addr {
+entry:
+  %cmp = icmp slt i32 %a, 10
+  br i1 %cmp, label %if.then, label %if.end, !prof !2
+
+if.then:                                          ; preds = %entry
+  call void @bar(i32 %a)
+  br label %if.end
+
+if.end:                                           ; preds = %if.then, %entry
+  ret void
+}
+
+; CHECK-LABEL: @foobar
+define void @foobar(i32 %a) local_unnamed_addr {
+entry:
+  %mul = shl nsw i32 %a, 1
+; CHECK-NOT: call void @foo
+; CHECK: call void @bar
+  call void @foo(i32 %mul)
+  ret void
+}
+
+!2 = !{!"branch_weights", i32 1, i32 2000}
Index: lib/Analysis/InlineCost.cpp
===================================================================
--- lib/Analysis/InlineCost.cpp
+++ lib/Analysis/InlineCost.cpp
@@ -675,7 +675,7 @@
   auto CallSiteFreq = CallerBFI->getBlockFreq(CallSiteBB);
   auto CallerEntryFreq =
       CallerBFI->getBlockFreq(&(CS.getCaller()->getEntryBlock()));
-  return CallSiteFreq < CallerEntryFreq * ColdProb;
+  return CallSiteFreq <= CallerEntryFreq * ColdProb;
 }
 
 Optional<int>



