[PATCH] D44399: [ThinLTO] Add funtions in callees metadata to CallGraphEdges

Taewook Oh via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 12 12:16:35 PDT 2018


twoh created this revision.
twoh added reviewers: tejohnson, mehdi_amini, pcc.
Herald added subscribers: eraman, inglorion.

If there's a callees metadata attached to the indirect call instruction, add CallGraphEdges to the callees mentioned in the metadata when computing FunctionSummary.

- Why this is necessary:

Consider following code example:

  (foo.c)
  static int f1(int x) {...}
  static int f2(int x);
  static int (*fptr)(int) = f2;
  static int f2(int x) { 
    if (x) fptr=f1; return f1(x);
  }
  int foo(int x) { 
    (*fptr)(x); // !callees metadata of !{i32 (i32)* @f1, i32 (i32)* @f2} would be attached to this call.
  }
  
  (bar.c)
  int bar(int x) { 
    return foo(x); 
  }

At LTO time when `foo.o` is imported into `bar.o`, function `foo` might be inlined into `bar` and PGO-guided indirect call promotion will run after that. If the profile data tells that the promotion of `@f1` or `@f2` is beneficial, the optimizer will check if the "promoted" `@f1` or `@f2` (such as `@f1.llvm.0` or `@f2.llvm.0`) is available. Without this patch, importing `!callees` metadata would only add promoted declarations of `@f1` and `@f2` to the `bar.o`, but still the optimizer will assume that the function is available and perform the promotion. The result of that is link failure with `undefined reference to @f1.llvm.0`.

This patch fixes this problem by adding callees in the `!callees` metadata to CallGraphEdges so that their definition would be properly imported into.

One may ask that there already is a logic to add indirect call promotion targets to be added to CallGraphEdges. However, if profile data says "indirect call promotion is only beneficial under a certain inline context", the logic wouldn't work. In the code example above, if profile data is like

  bar:1000000:100000
    1:100000
      1: foo:100000
          1: 100000 f1:100000

, Computing FunctionSummary for `foo.o` wouldn't add `foo->f1` to CallGraphEdges. (Also, it is at least "possible" that one can provide profile data to only link step but not to compilation step).


Repository:
  rL LLVM

https://reviews.llvm.org/D44399

Files:
  lib/Analysis/ModuleSummaryAnalysis.cpp
  test/ThinLTO/X86/Inputs/callees-metadata.ll
  test/ThinLTO/X86/callees-metadata.ll


Index: test/ThinLTO/X86/callees-metadata.ll
===================================================================
--- /dev/null
+++ test/ThinLTO/X86/callees-metadata.ll
@@ -0,0 +1,21 @@
+; Do setup work: generate bitcode and combined index
+; RUN: opt -module-summary %s -o %t1.bc
+; RUN: opt -module-summary %p/Inputs/callees-metadata.ll -o %t2.bc
+
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps \
+; RUN:     -r=%t1.bc,bar,plx \
+; RUN:     -r=%t1.bc,foo,l \
+; RUN:     -r=%t2.bc,foo,pl
+; RUN: llvm-dis %t.o.1.3.import.bc -o - | FileCheck %s
+; CHECK: define {{.*}} i32 @f1.llvm.0
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local i32 @bar(i32 %x) {
+entry:
+  %call = call i32 @foo(i32 %x)
+  ret i32 %call
+}
+
+declare dso_local i32 @foo(i32)
Index: test/ThinLTO/X86/Inputs/callees-metadata.ll
===================================================================
--- /dev/null
+++ test/ThinLTO/X86/Inputs/callees-metadata.ll
@@ -0,0 +1,34 @@
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+ at fptr = internal unnamed_addr global i32 (i32)* @f2, align 8
+
+define dso_local i32 @foo(i32 %x) local_unnamed_addr {
+entry:
+  %0 = load i32 (i32)*, i32 (i32)** @fptr, align 8
+  %call = tail call i32 %0(i32 %x), !callees !0
+  ret i32 %call
+}
+
+define internal i32 @f2(i32 %x) {
+entry:
+  %tobool = icmp eq i32 %x, 0
+  br i1 %tobool, label %if.end, label %if.then
+
+if.then:                                          ; preds = %entry
+  store i32 (i32)* @f1, i32 (i32)** @fptr, align 8
+  %sub.i = add nsw i32 %x, -1
+  br label %if.end
+
+if.end:                                           ; preds = %entry, %if.then
+  %phi.call = phi i32 [ %sub.i, %if.then ], [ -1, %entry ]
+  ret i32 %phi.call
+}
+
+define internal i32 @f1(i32 %x) {
+entry:
+  %sub = add nsw i32 %x, -1
+  ret i32 %sub
+}
+
+!0 = !{i32 (i32)* @f1, i32 (i32)* @f2}
Index: lib/Analysis/ModuleSummaryAnalysis.cpp
===================================================================
--- lib/Analysis/ModuleSummaryAnalysis.cpp
+++ lib/Analysis/ModuleSummaryAnalysis.cpp
@@ -291,6 +291,16 @@
         if (!CalledValue || isa<Constant>(CalledValue))
           continue;
 
+        // Check if the instruction has a callees metadata. If so, add calles
+        // to CallGraphEdges.
+        if (auto *MD = I.getMetadata(LLVMContext::MD_callees)) {
+          for (auto &Op : MD->operands()) {
+            Function *Callee = mdconst::extract_or_null<Function>(Op);
+            if (Callee)
+              CallGraphEdges[Index.getOrInsertValueInfo(Callee)];
+          }
+        }
+
         uint32_t NumVals, NumCandidates;
         uint64_t TotalCount;
         auto CandidateProfileData =


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D44399.138069.patch
Type: text/x-patch
Size: 2806 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180312/30a17c58/attachment.bin>


More information about the llvm-commits mailing list