[llvm-branch-commits] [clang] 24d4291 - [CSSPGO] Pseudo probes for function calls.

Hongtao Yu via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Wed Dec 2 13:49:55 PST 2020


Author: Hongtao Yu
Date: 2020-12-02T13:45:20-08:00
New Revision: 24d4291ca704fa5ee2419b4163aa324eca693fd6

URL: https://github.com/llvm/llvm-project/commit/24d4291ca704fa5ee2419b4163aa324eca693fd6
DIFF: https://github.com/llvm/llvm-project/commit/24d4291ca704fa5ee2419b4163aa324eca693fd6.diff

LOG: [CSSPGO] Pseudo probes for function calls.

An indirect call site needs to be probed for its potential call targets. With CSSPGO, a direct call also needs a probe so that a calling context can be represented by a stack of callsite probes. Unlike pseudo probes for basic blocks, which take the form of standalone intrinsic call instructions, a pseudo probe for a callsite has to be attached to the call instruction itself, so a separate instruction would not work.

One possible way of attaching a probe to a call instruction is to use a special metadata that carries information about the probe. That metadata would have to make its way through the optimization pipeline down to object emission, which requires additional effort to maintain it in various places. Given that the `!dbg` metadata is a first-class metadata and has all essential support in place, leveraging the `!dbg` metadata as a channel to encode pseudo probe information is probably the easiest solution.

With the requirement of not inflating the `!dbg` metadata, which is allocated for almost every instruction, we found that the 32-bit DWARF discriminator field, which mainly serves AutoFDO, can be reused for pseudo probes. DWARF discriminators distinguish instructions that share an identical source location, and with pseudo probes such support is not required. In this change we use the discriminator field to encode the ID and type of a callsite probe; the encoded value is unpacked and consumed right before object emission. When a callsite is inlined, the callsite discriminator field goes with the inlined instructions. The `!dbg` metadata of an inlined instruction is in the form of a scope stack. The top of the stack is the instruction's original `!dbg` metadata and the bottom of the stack is for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack refer to the nested inlined callsites, whose discriminator fields (each of which actually represents a callsite probe) can be used together to represent the inline context of an inlined PseudoProbeInst or CallInst.
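
As an illustration of the encoding described above, here is a minimal standalone sketch. It assumes the bit layout documented in the `PseudoProbe.h` header added by this change (bits [2:0] set to 0x7, probe ID in bits [18:3], probe type in bits [28:26]). The `sketch*` names are hypothetical and are not part of LLVM; the real helpers live in `PseudoProbeDwarfDiscriminator`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the probe types introduced by the patch.
enum class SketchProbeType : uint32_t {
  Block = 0,
  IndirectCall = 1,
  DirectCall = 2
};

// Pack a probe ID and type into a 32-bit discriminator value:
//   [2:0]   = 0x7 (marks the value as a pseudo probe discriminator)
//   [18:3]  = probe ID
//   [28:26] = probe type
constexpr uint32_t sketchPack(uint32_t Index, SketchProbeType Type) {
  return (Index << 3) | (static_cast<uint32_t>(Type) << 26) | 0x7u;
}

// Recover the probe ID from a packed value.
constexpr uint32_t sketchIndex(uint32_t Value) {
  return (Value >> 3) & 0xFFFFu;
}

// Recover the probe type from a packed value.
constexpr SketchProbeType sketchType(uint32_t Value) {
  return static_cast<SketchProbeType>((Value >> 26) & 0x7u);
}

// The two discriminator values that appear in the updated test file.
static_assert(sketchPack(2, SketchProbeType::IndirectCall) == 67108887u,
              "0x4000017: indirect call probe with ID 2");
static_assert(sketchPack(3, SketchProbeType::DirectCall) == 134217759u,
              "0x800001f: direct call probe with ID 3");
```

Under this layout, 67108887 decodes to probe ID 2 with type `IndirectCall`, and 134217759 decodes to probe ID 3 with type `DirectCall`.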

To avoid collision with the baseline AutoFDO in the various places that handle DWARF discriminators where a check against the `-pseudo-probe-for-profiling` switch is not available, a special encoding scheme is used to tell a pseudo probe discriminator apart from a regular discriminator. For a regular discriminator, if the lowest 3 bits are all set (0x7), the discriminator is basically empty and all higher 29 bits can be reserved for pseudo probe use.
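
The check described above can be sketched as a small predicate. This is an illustrative standalone version, assuming the rule stated in the patch (lowest 3 bits equal to 0x7 and at least one of the upper 29 bits set); the function name is hypothetical, and the real check is `DILocation::isPseudoProbeDiscriminator`.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when Discriminator looks like a pseudo probe discriminator:
// the low 3 bits are all set and the upper 29 bits are not all zero.
constexpr bool sketchIsPseudoProbeDiscriminator(uint32_t Discriminator) {
  return (Discriminator & 0x7u) == 0x7u &&
         (Discriminator & 0xFFFFFFF8u) != 0u;
}

// 0 and 7 carry no probe payload, so they are not treated as probes.
static_assert(!sketchIsPseudoProbeDiscriminator(0u), "empty discriminator");
static_assert(!sketchIsPseudoProbeDiscriminator(7u), "marker bits only");
// A value with the marker bits and a non-zero payload is a probe.
static_assert(sketchIsPseudoProbeDiscriminator(67108887u), "probe value");
```

Requiring a non-zero payload means a bare 0x7 is never mistaken for a probe.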

Callsite pseudo probes are inserted in `SampleProfileProbePass` and a target-independent MIR pass `PseudoProbeInserter` is added to unpack the probe ID/type from `!dbg`.
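
To show roughly what the unpacking step does, the following sketch models it on plain data structures instead of MIR. `SketchInstr`, `SketchProbe`, and `sketchCollectCallProbes` are hypothetical stand-ins, not LLVM types; the actual pass walks `MachineInstr`s and emits a `PSEUDO_PROBE` instruction before each qualifying call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for a machine instruction with a debug location.
struct SketchInstr {
  bool IsCall;
  uint32_t Discriminator;
};

// Simplified stand-in for the operands of an emitted probe.
struct SketchProbe {
  uint64_t Guid;
  uint32_t Index;
  uint32_t Type;
  uint32_t Attributes;
};

// For every call whose discriminator carries probe data, unpack the fields
// into a probe record. Guid stands for the GUID of the enclosing function.
inline std::vector<SketchProbe>
sketchCollectCallProbes(const std::vector<SketchInstr> &Instrs,
                        uint64_t Guid) {
  std::vector<SketchProbe> Probes;
  for (const SketchInstr &I : Instrs) {
    if (!I.IsCall)
      continue;
    const uint32_t D = I.Discriminator;
    const bool IsProbe = (D & 0x7u) == 0x7u && (D & 0xFFFFFFF8u) != 0u;
    if (!IsProbe)
      continue;
    SketchProbe P;
    P.Guid = Guid;
    P.Index = (D >> 3) & 0xFFFFu;     // bits [18:3]
    P.Type = (D >> 26) & 0x7u;        // bits [28:26]
    P.Attributes = (D >> 29) & 0x7u;  // bits [31:29]
    Probes.push_back(P);
  }
  return Probes;
}
```

Non-call instructions and calls with a regular discriminator are skipped, so only calls that were tagged by `SampleProfileProbePass` produce probe records.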

Note that with this work the switch -debug-info-for-profiling will not work with -pseudo-probe-for-profiling anymore. They cannot be used at the same time.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D91756

Added: 
    llvm/include/llvm/IR/PseudoProbe.h
    llvm/lib/CodeGen/PseudoProbeInserter.cpp

Modified: 
    clang/lib/CodeGen/BackendUtil.cpp
    llvm/include/llvm/CodeGen/CommandFlags.h
    llvm/include/llvm/CodeGen/Passes.h
    llvm/include/llvm/IR/DebugInfoMetadata.h
    llvm/include/llvm/InitializePasses.h
    llvm/include/llvm/Passes/PassBuilder.h
    llvm/include/llvm/Target/TargetOptions.h
    llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
    llvm/lib/CodeGen/CMakeLists.txt
    llvm/lib/CodeGen/CommandFlags.cpp
    llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
    llvm/lib/CodeGen/TargetPassConfig.cpp
    llvm/lib/Target/X86/X86TargetMachine.cpp
    llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
    llvm/test/Transforms/SampleProfile/pseudo-probe-emit.ll

Removed: 
    


################################################################################
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index b62a66a51d26..724e2ec16fc3 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -555,6 +555,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.ForceDwarfFrameSection = CodeGenOpts.ForceDwarfFrameSection;
   Options.EmitCallSiteInfo = CodeGenOpts.EmitCallSiteInfo;
   Options.EnableAIXExtendedAltivecABI = CodeGenOpts.EnableAIXExtendedAltivecABI;
+  Options.PseudoProbeForProfiling = CodeGenOpts.PseudoProbeForProfiling;
   Options.ValueTrackingVariableLocations =
       CodeGenOpts.ValueTrackingVariableLocations;
   Options.XRayOmitFunctionIndex = CodeGenOpts.XRayOmitFunctionIndex;

diff --git a/llvm/include/llvm/CodeGen/CommandFlags.h b/llvm/include/llvm/CodeGen/CommandFlags.h
index e18d16b4b396..ee4a9a80662e 100644
--- a/llvm/include/llvm/CodeGen/CommandFlags.h
+++ b/llvm/include/llvm/CodeGen/CommandFlags.h
@@ -127,6 +127,8 @@ bool getEnableMachineFunctionSplitter();
 
 bool getEnableDebugEntryValues();
 
+bool getPseudoProbeForProfiling();
+
 bool getValueTrackingVariableLocations();
 
 bool getForceDwarfFrameSection();

diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h
index 6bd1b553c506..e33e62dc8e7c 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -475,6 +475,9 @@ namespace llvm {
   /// Create Hardware Loop pass. \see HardwareLoops.cpp
   FunctionPass *createHardwareLoopsPass();
 
+  /// This pass inserts pseudo probe annotation for callsite profiling.
+  FunctionPass *createPseudoProbeInserter();
+
   /// Create IR Type Promotion pass. \see TypePromotion.cpp
   FunctionPass *createTypePromotionPass();
 

diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index 004c3e589378..0ba4c09dc92f 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -1698,6 +1698,18 @@ class DILocation : public MDNode {
 
   inline unsigned getDiscriminator() const;
 
+  // For the regular discriminator, it stands for all empty components if all
+  // the lowest 3 bits are non-zero and all higher 29 bits are unused(zero by
+  // default). Here we fully leverage the higher 29 bits for pseudo probe use.
+  // This is the format:
+  // [2:0] - 0x7
+  // [31:3] - pseudo probe fields guaranteed to be non-zero as a whole
+  // So if the lower 3 bits is non-zero and the others has at least one
+  // non-zero bit, it guarantees to be a pseudo probe discriminator
+  inline static bool isPseudoProbeDiscriminator(unsigned Discriminator) {
+    return ((Discriminator & 0x7) == 0x7) && (Discriminator & 0xFFFFFFF8);
+  }
+
   /// Returns a new DILocation with updated \p Discriminator.
   inline const DILocation *cloneWithDiscriminator(unsigned Discriminator) const;
 

diff --git a/llvm/include/llvm/IR/PseudoProbe.h b/llvm/include/llvm/IR/PseudoProbe.h
new file mode 100644
index 000000000000..fd3a0075efbe
--- /dev/null
+++ b/llvm/include/llvm/IR/PseudoProbe.h
@@ -0,0 +1,52 @@
+//===- PseudoProbe.h - Pseudo Probe IR Helpers ------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Pseudo probe IR intrinsic and dwarf discriminator manipulation routines.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_IR_PSEUDOPROBE_H
+#define LLVM_IR_PSEUDOPROBE_H
+
+#include <cassert>
+#include <cstdint>
+
+namespace llvm {
+
+enum class PseudoProbeType { Block = 0, IndirectCall, DirectCall };
+
+struct PseudoProbeDwarfDiscriminator {
+  // The following APIs encodes/decodes per-probe information to/from a
+  // 32-bit integer which is organized as:
+  //  [2:0] - 0x7, this is reserved for regular discriminator,
+  //          see DWARF discriminator encoding rule
+  //  [18:3] - probe id
+  //  [25:19] - reserved
+  //  [28:26] - probe type, see PseudoProbeType
+  //  [31:29] - reserved for probe attributes
+  static uint32_t packProbeData(uint32_t Index, uint32_t Type) {
+    assert(Index <= 0xFFFF && "Probe index too big to encode, exceeding 2^16");
+    assert(Type <= 0x7 && "Probe type too big to encode, exceeding 7");
+    return (Index << 3) | (Type << 26) | 0x7;
+  }
+
+  static uint32_t extractProbeIndex(uint32_t Value) {
+    return (Value >> 3) & 0xFFFF;
+  }
+
+  static uint32_t extractProbeType(uint32_t Value) {
+    return (Value >> 26) & 0x7;
+  }
+
+  static uint32_t extractProbeAttributes(uint32_t Value) {
+    return (Value >> 29) & 0x7;
+  }
+};
+} // end namespace llvm
+
+#endif // LLVM_IR_PSEUDOPROBE_H

diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index 0d0eaf0bca83..65f2fb0025e9 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -361,6 +361,7 @@ void initializeProfileSummaryInfoWrapperPassPass(PassRegistry&);
 void initializePromoteLegacyPassPass(PassRegistry&);
 void initializePruneEHPass(PassRegistry&);
 void initializeRABasicPass(PassRegistry&);
+void initializePseudoProbeInserterPass(PassRegistry &);
 void initializeRAGreedyPass(PassRegistry&);
 void initializeReachingDefAnalysisPass(PassRegistry&);
 void initializeReassociateLegacyPassPass(PassRegistry&);

diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h
index 7a5ead6baf65..999b9b782c46 100644
--- a/llvm/include/llvm/Passes/PassBuilder.h
+++ b/llvm/include/llvm/Passes/PassBuilder.h
@@ -63,6 +63,14 @@ struct PGOOptions {
     // PseudoProbeForProfiling needs to be true.
     assert(this->Action != NoAction || this->CSAction != NoCSAction ||
            this->DebugInfoForProfiling || this->PseudoProbeForProfiling);
+
+    // Pseudo probe emission does work with -fdebug-info-for-profiling since
+    // they both use the discriminator field of debug lines but for different
+    // purposes.
+    if (this->DebugInfoForProfiling && this->PseudoProbeForProfiling) {
+      report_fatal_error(
+          "Pseudo probes cannot be used with -debug-info-for-profiling", false);
+    }
   }
   std::string ProfileFile;
   std::string CSProfileGenFile;

diff --git a/llvm/include/llvm/Target/TargetOptions.h b/llvm/include/llvm/Target/TargetOptions.h
index 8cc0b95b26da..c7391b529810 100644
--- a/llvm/include/llvm/Target/TargetOptions.h
+++ b/llvm/include/llvm/Target/TargetOptions.h
@@ -138,8 +138,8 @@ namespace llvm {
           EnableMachineFunctionSplitter(false), SupportsDefaultOutlining(false),
           EmitAddrsig(false), EmitCallSiteInfo(false),
           SupportsDebugEntryValues(false), EnableDebugEntryValues(false),
-          ValueTrackingVariableLocations(false), ForceDwarfFrameSection(false),
-          XRayOmitFunctionIndex(false),
+          PseudoProbeForProfiling(false), ValueTrackingVariableLocations(false),
+          ForceDwarfFrameSection(false), XRayOmitFunctionIndex(false),
           FPDenormalMode(DenormalMode::IEEE, DenormalMode::IEEE) {}
 
     /// DisableFramePointerElim - This returns true if frame pointer elimination
@@ -309,6 +309,9 @@ namespace llvm {
     /// production.
     bool ShouldEmitDebugEntryValues() const;
 
+    /// Emit pseudo probes into the binary for sample profiling
+    unsigned PseudoProbeForProfiling : 1;
+
     // When set to true, use experimental new debug variable location tracking,
     // which seeks to follow the values of variables rather than their location,
     // post isel.

diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index d0165a1cbc71..0c0e0b7f4a4a 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -17,6 +17,7 @@
 
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/IR/PassManager.h"
+#include "llvm/IR/PseudoProbe.h"
 #include "llvm/Target/TargetMachine.h"
 #include <unordered_map>
 
@@ -25,10 +26,10 @@ namespace llvm {
 class Module;
 
 using BlockIdMap = std::unordered_map<BasicBlock *, uint32_t>;
+using InstructionIdMap = std::unordered_map<Instruction *, uint32_t>;
 
 enum class PseudoProbeReservedId { Invalid = 0, Last = Invalid };
 
-enum class PseudoProbeType { Block = 0 };
 
 /// Sample profile pseudo prober.
 ///
@@ -42,13 +43,18 @@ class SampleProfileProber {
 private:
   Function *getFunction() const { return F; }
   uint32_t getBlockId(const BasicBlock *BB) const;
+  uint32_t getCallsiteId(const Instruction *Call) const;
   void computeProbeIdForBlocks();
+  void computeProbeIdForCallsites();
 
   Function *F;
 
   /// Map basic blocks to the their pseudo probe ids.
   BlockIdMap BlockProbeIds;
 
+  /// Map indirect calls to the their pseudo probe ids.
+  InstructionIdMap CallProbeIds;
+
   /// The ID of the last probe, Can be used to number a new probe.
   uint32_t LastProbeId;
 };

diff --git a/llvm/lib/CodeGen/CMakeLists.txt b/llvm/lib/CodeGen/CMakeLists.txt
index d196555ffc94..e7ac998f3efa 100644
--- a/llvm/lib/CodeGen/CMakeLists.txt
+++ b/llvm/lib/CodeGen/CMakeLists.txt
@@ -122,6 +122,7 @@ add_llvm_component_library(LLVMCodeGen
   PreISelIntrinsicLowering.cpp
   ProcessImplicitDefs.cpp
   PrologEpilogInserter.cpp
+  PseudoProbeInserter.cpp
   PseudoSourceValue.cpp
   RDFGraph.cpp
   RDFLiveness.cpp

diff --git a/llvm/lib/CodeGen/CommandFlags.cpp b/llvm/lib/CodeGen/CommandFlags.cpp
index 080592baa03a..a23099c51e24 100644
--- a/llvm/lib/CodeGen/CommandFlags.cpp
+++ b/llvm/lib/CodeGen/CommandFlags.cpp
@@ -91,6 +91,7 @@ CGOPT(bool, EnableAddrsig)
 CGOPT(bool, EmitCallSiteInfo)
 CGOPT(bool, EnableMachineFunctionSplitter)
 CGOPT(bool, EnableDebugEntryValues)
+CGOPT(bool, PseudoProbeForProfiling)
 CGOPT(bool, ValueTrackingVariableLocations)
 CGOPT(bool, ForceDwarfFrameSection)
 CGOPT(bool, XRayOmitFunctionIndex)
@@ -434,6 +435,11 @@ codegen::RegisterCodeGenFlags::RegisterCodeGenFlags() {
       cl::init(false));
   CGBINDOPT(EnableDebugEntryValues);
 
+  static cl::opt<bool> PseudoProbeForProfiling(
+      "pseudo-probe-for-profiling", cl::desc("Emit pseudo probes for AutoFDO"),
+      cl::init(false));
+  CGBINDOPT(PseudoProbeForProfiling);
+
   static cl::opt<bool> ValueTrackingVariableLocations(
       "experimental-debug-variable-locations",
       cl::desc("Use experimental new value-tracking variable locations"),
@@ -548,6 +554,7 @@ codegen::InitTargetOptionsFromCodeGenFlags(const Triple &TheTriple) {
   Options.EmitAddrsig = getEnableAddrsig();
   Options.EmitCallSiteInfo = getEmitCallSiteInfo();
   Options.EnableDebugEntryValues = getEnableDebugEntryValues();
+  Options.PseudoProbeForProfiling = getPseudoProbeForProfiling();
   Options.ValueTrackingVariableLocations = getValueTrackingVariableLocations();
   Options.ForceDwarfFrameSection = getForceDwarfFrameSection();
   Options.XRayOmitFunctionIndex = getXRayOmitFunctionIndex();

diff --git a/llvm/lib/CodeGen/PseudoProbeInserter.cpp b/llvm/lib/CodeGen/PseudoProbeInserter.cpp
new file mode 100644
index 000000000000..9c716a5a37ea
--- /dev/null
+++ b/llvm/lib/CodeGen/PseudoProbeInserter.cpp
@@ -0,0 +1,95 @@
+//===- PseudoProbeInserter.cpp - Insert annotation for callsite profiling -===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements PseudoProbeInserter pass, which inserts pseudo probe
+// annotations for call instructions with a pseudo-probe-specific dwarf
+// discriminator. such discriminator indicates that the call instruction comes
+// with a pseudo probe, and the discriminator value holds information to
+// identify the corresponding counter.
+//===----------------------------------------------------------------------===//
+
+#include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachineInstr.h"
+#include "llvm/CodeGen/TargetInstrInfo.h"
+#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/PseudoProbe.h"
+#include "llvm/InitializePasses.h"
+#include "llvm/Target/TargetMachine.h"
+#include <unordered_map>
+
+#define DEBUG_TYPE "pseudo-probe-inserter"
+
+using namespace llvm;
+
+namespace {
+class PseudoProbeInserter : public MachineFunctionPass {
+public:
+  static char ID;
+
+  PseudoProbeInserter() : MachineFunctionPass(ID) {
+    initializePseudoProbeInserterPass(*PassRegistry::getPassRegistry());
+  }
+
+  StringRef getPassName() const override { return "Pseudo Probe Inserter"; }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesAll();
+    MachineFunctionPass::getAnalysisUsage(AU);
+  }
+
+  bool runOnMachineFunction(MachineFunction &MF) override {
+    const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+    bool Changed = false;
+    for (MachineBasicBlock &MBB : MF) {
+      for (MachineInstr &MI : MBB) {
+        if (MI.isCall()) {
+          if (DILocation *DL = MI.getDebugLoc()) {
+            auto Value = DL->getDiscriminator();
+            if (DILocation::isPseudoProbeDiscriminator(Value)) {
+              BuildMI(MBB, MI, DL, TII->get(TargetOpcode::PSEUDO_PROBE))
+                  .addImm(getFuncGUID(MF.getFunction().getParent(), DL))
+                  .addImm(
+                      PseudoProbeDwarfDiscriminator::extractProbeIndex(Value))
+                  .addImm(
+                      PseudoProbeDwarfDiscriminator::extractProbeType(Value))
+                  .addImm(PseudoProbeDwarfDiscriminator::extractProbeAttributes(
+                      Value));
+              Changed = true;
+            }
+          }
+        }
+      }
+    }
+
+    return Changed;
+  }
+
+private:
+  uint64_t getFuncGUID(Module *M, DILocation *DL) {
+    auto *SP = DL->getScope()->getSubprogram();
+    auto Name = SP->getLinkageName();
+    if (Name.empty())
+      Name = SP->getName();
+    return Function::getGUID(Name);
+  }
+};
+} // namespace
+
+char PseudoProbeInserter::ID = 0;
+INITIALIZE_PASS_BEGIN(PseudoProbeInserter, DEBUG_TYPE,
+                      "Insert pseudo probe annotations for value profiling",
+                      false, false)
+INITIALIZE_PASS_DEPENDENCY(TargetPassConfig)
+INITIALIZE_PASS_END(PseudoProbeInserter, DEBUG_TYPE,
+                    "Insert pseudo probe annotations for value profiling",
+                    false, false)
+
+FunctionPass *llvm::createPseudoProbeInserter() {
+  return new PseudoProbeInserter();
+}

diff --git a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
index 2b208cee1716..a5978711b871 100644
--- a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
@@ -26,6 +26,7 @@
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DebugInfo.h"
+#include "llvm/IR/PseudoProbe.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/MathExtras.h"
@@ -1133,7 +1134,7 @@ EmitSpecialNode(SDNode *Node, bool IsClone, bool IsCloned,
     BuildMI(*MBB, InsertPos, Node->getDebugLoc(), TII->get(TarOp))
         .addImm(Guid)
         .addImm(Index)
-        .addImm(0) // 0 for block probes
+        .addImm((uint8_t)PseudoProbeType::Block)
         .addImm(Attr);
     break;
   }

diff --git a/llvm/lib/CodeGen/TargetPassConfig.cpp b/llvm/lib/CodeGen/TargetPassConfig.cpp
index b877954ca270..fb566323c41b 100644
--- a/llvm/lib/CodeGen/TargetPassConfig.cpp
+++ b/llvm/lib/CodeGen/TargetPassConfig.cpp
@@ -1040,6 +1040,10 @@ void TargetPassConfig::addMachinePasses() {
   // Add passes that directly emit MI after all other MI passes.
   addPreEmitPass2();
 
+  // Insert pseudo probe annotation for callsite profiling
+  if (TM->Options.PseudoProbeForProfiling)
+    addPass(createPseudoProbeInserter());
+
   AddingMachinePasses = false;
 }
 

diff --git a/llvm/lib/Target/X86/X86TargetMachine.cpp b/llvm/lib/Target/X86/X86TargetMachine.cpp
index 34bc72a2e69f..288841c4916c 100644
--- a/llvm/lib/Target/X86/X86TargetMachine.cpp
+++ b/llvm/lib/Target/X86/X86TargetMachine.cpp
@@ -83,6 +83,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeX86Target() {
   initializeX86LoadValueInjectionRetHardeningPassPass(PR);
   initializeX86OptimizeLEAPassPass(PR);
   initializeX86PartialReductionPass(PR);
+  initializePseudoProbeInserterPass(PR);
 }
 
 static std::unique_ptr<TargetLoweringObjectFile> createTLOF(const Triple &TT) {

diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
index e158bc9c148f..652d5b93abd9 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
@@ -37,8 +37,10 @@ STATISTIC(ArtificialDbgLine,
 
 SampleProfileProber::SampleProfileProber(Function &Func) : F(&Func) {
   BlockProbeIds.clear();
+  CallProbeIds.clear();
   LastProbeId = (uint32_t)PseudoProbeReservedId::Last;
   computeProbeIdForBlocks();
+  computeProbeIdForCallsites();
 }
 
 void SampleProfileProber::computeProbeIdForBlocks() {
@@ -47,11 +49,28 @@ void SampleProfileProber::computeProbeIdForBlocks() {
   }
 }
 
+void SampleProfileProber::computeProbeIdForCallsites() {
+  for (auto &BB : *F) {
+    for (auto &I : BB) {
+      if (!isa<CallBase>(I))
+        continue;
+      if (isa<IntrinsicInst>(&I))
+        continue;
+      CallProbeIds[&I] = ++LastProbeId;
+    }
+  }
+}
+
 uint32_t SampleProfileProber::getBlockId(const BasicBlock *BB) const {
   auto I = BlockProbeIds.find(const_cast<BasicBlock *>(BB));
   return I == BlockProbeIds.end() ? 0 : I->second;
 }
 
+uint32_t SampleProfileProber::getCallsiteId(const Instruction *Call) const {
+  auto Iter = CallProbeIds.find(const_cast<Instruction *>(Call));
+  return Iter == CallProbeIds.end() ? 0 : Iter->second;
+}
+
 void SampleProfileProber::instrumentOneFunc(Function &F, TargetMachine *TM) {
   Module *M = F.getParent();
   MDBuilder MDB(F.getContext());
@@ -59,6 +78,28 @@ void SampleProfileProber::instrumentOneFunc(Function &F, TargetMachine *TM) {
   // fine since function name is the only key in the profile database.
   uint64_t Guid = Function::getGUID(F.getName());
 
+  // Assign an artificial debug line to a probe that doesn't come with a real
+  // line. A probe not having a debug line will get an incomplete inline
+  // context. This will cause samples collected on the probe to be counted
+  // into the base profile instead of a context profile. The line number
+  // itself is not important though.
+  auto AssignDebugLoc = [&](Instruction *I) {
+    assert((isa<PseudoProbeInst>(I) || isa<CallBase>(I)) &&
+           "Expecting pseudo probe or call instructions");
+    if (!I->getDebugLoc()) {
+      if (auto *SP = F.getSubprogram()) {
+        auto DIL = DebugLoc::get(0, 0, SP);
+        I->setDebugLoc(DIL);
+        ArtificialDbgLine++;
+        LLVM_DEBUG({
+          dbgs() << "\nIn Function " << F.getName()
+                 << " Probe gets an artificial debug line\n";
+          I->dump();
+        });
+      }
+    }
+  };
+
   // Probe basic blocks.
   for (auto &I : BlockProbeIds) {
     BasicBlock *BB = I.first;
@@ -87,19 +128,26 @@ void SampleProfileProber::instrumentOneFunc(Function &F, TargetMachine *TM) {
     Value *Args[] = {Builder.getInt64(Guid), Builder.getInt64(Index),
                      Builder.getInt32(0)};
     auto *Probe = Builder.CreateCall(ProbeFn, Args);
-    // Assign an artificial debug line to a probe that doesn't come with a real
-    // line. A probe not having a debug line will get an incomplete inline
-    // context. This will cause samples collected on the probe to be counted
-    // into the base profile instead of a context profile. The line number
-    // itself is not important though.
-    if (!Probe->getDebugLoc()) {
-      if (auto *SP = F.getSubprogram()) {
-        auto DIL = DebugLoc::get(0, 0, SP);
-        Probe->setDebugLoc(DIL);
-        ArtificialDbgLine++;
-        LLVM_DEBUG(dbgs() << "\nIn Function " << F.getName() << " Probe "
-                          << Index << " gets an artificial debug line\n";);
-      }
+    AssignDebugLoc(Probe);
+  }
+
+  // Probe both direct calls and indirect calls. Direct calls are probed so that
+  // their probe ID can be used as an call site identifier to represent a
+  // calling context.
+  for (auto &I : CallProbeIds) {
+    auto *Call = I.first;
+    uint32_t Index = I.second;
+    uint32_t Type = cast<CallBase>(Call)->getCalledFunction()
+                        ? (uint32_t)PseudoProbeType::DirectCall
+                        : (uint32_t)PseudoProbeType::IndirectCall;
+    AssignDebugLoc(Call);
+    // Levarge the 32-bit discriminator field of debug data to store the ID and
+    // type of a callsite probe. This gets rid of the dependency on plumbing a
+    // customized metadata through the codegen pipeline.
+    uint32_t V = PseudoProbeDwarfDiscriminator::packProbeData(Index, Type);
+    if (auto DIL = Call->getDebugLoc()) {
+      DIL = DIL->cloneWithDiscriminator(V);
+      Call->setDebugLoc(DIL);
     }
   }
 }

diff --git a/llvm/test/Transforms/SampleProfile/pseudo-probe-emit.ll b/llvm/test/Transforms/SampleProfile/pseudo-probe-emit.ll
index 1094819208a6..ddb6a62c40d2 100644
--- a/llvm/test/Transforms/SampleProfile/pseudo-probe-emit.ll
+++ b/llvm/test/Transforms/SampleProfile/pseudo-probe-emit.ll
@@ -1,7 +1,7 @@
 ; REQUIRES: x86-registered-target
 ; RUN: opt < %s -passes=pseudo-probe -function-sections -S -o %t
 ; RUN: FileCheck %s < %t --check-prefix=CHECK-IL
-; RUN: llc %t -stop-after=instruction-select -o - | FileCheck %s --check-prefix=CHECK-MIR
+; RUN: llc %t -pseudo-probe-for-profiling -stop-after=pseudo-probe-inserter -o - | FileCheck %s --check-prefix=CHECK-MIR
 ;
 ;; Check the generation of pseudoprobe intrinsic call.
 
@@ -29,9 +29,35 @@ bb3:
   ret void, !dbg !12
 }
 
+declare void @bar(i32 %x) 
+
+define internal void @foo2(void (i32)* %f) !dbg !4 {
+entry:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID2:]], i64 1, i32 0)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID2:]], 1, 0, 0
+; Check pseudo_probe metadata attached to the indirect call instruction.
+; CHECK-IL: call void %f(i32 1), !dbg ![[#PROBE0:]]
+; CHECK-MIR: PSEUDO_PROBE [[#GUID2]], 2, 1, 0
+  call void %f(i32 1), !dbg !13
+; Check pseudo_probe metadata attached to the direct call instruction.
+; CHECK-IL: call void @bar(i32 1), !dbg ![[#PROBE1:]]
+; CHECK-MIR: PSEUDO_PROBE	[[#GUID2]], 3, 2, 0
+  call void @bar(i32 1)
+  ret void
+}
+
 ; CHECK-IL: ![[#FOO:]] = distinct !DISubprogram(name: "foo"
 ; CHECK-IL: ![[#FAKELINE]] = !DILocation(line: 0, scope: ![[#FOO]])
 ; CHECK-IL: ![[#REALLINE]] = !DILocation(line: 2, scope: ![[#FOO]])
+; CHECK-IL: ![[#PROBE0]] = !DILocation(line: 2, column: 20, scope: ![[#SCOPE0:]])
+;; A discriminator of 67108887 which is 0x4000017 in hexdecimal, stands for a direct call probe
+;; with an index of 2.
+; CHECK-IL: ![[#SCOPE0]] = !DILexicalBlockFile(scope: ![[#]], file: ![[#]], discriminator: 67108887)
+; CHECK-IL: ![[#PROBE1]] = !DILocation(line: 0, scope: ![[#SCOPE1:]])
+;; A discriminator of 134217759 which is 0x800001f in hexdecimal, stands for a direct call probe
+;; with an index of 3.
+; CHECK-IL: ![[#SCOPE1]] = !DILexicalBlockFile(scope: ![[#]], file: ![[#]], discriminator: 134217759)
+
 
 !llvm.dbg.cu = !{!0}
 !llvm.module.flags = !{!9, !10}
@@ -40,10 +66,12 @@ bb3:
 !1 = !DIFile(filename: "test.c", directory: "")
 !2 = !{}
 !3 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 1, type: !5, unit: !0, retainedNodes: !2)
+!4 = distinct !DISubprogram(name: "foo2", scope: !1, file: !1, line: 2, type: !5, unit: !0, retainedNodes: !2)
 !5 = !DISubroutineType(types: !6)
 !6 = !{!7}
 !7 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
 !9 = !{i32 2, !"Dwarf Version", i32 4}
 !10 = !{i32 2, !"Debug Info Version", i32 3}
 !11 = !{!"clang version 3.9.0"}
-!12 = !DILocation(line: 2, scope: !3)
\ No newline at end of file
+!12 = !DILocation(line: 2, scope: !3)
+!13 = !DILocation(line: 2, column: 20, scope: !4)


        

