[llvm-branch-commits] [llvm] b3154d1 - [CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling

via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Wed Jan 13 11:11:28 PST 2021


Author: wlei
Date: 2021-01-13T11:02:57-08:00
New Revision: b3154d11bc6dee59e581b731b7561f1ebab3aed6

URL: https://github.com/llvm/llvm-project/commit/b3154d11bc6dee59e581b731b7561f1ebab3aed6
DIFF: https://github.com/llvm/llvm-project/commit/b3154d11bc6dee59e581b731b7561f1ebab3aed6.diff

LOG: [CSSPGO][llvm-profgen] Pseudo probe decoding and disassembling

This change implements pseudo probe decoding and disassembling for llvm-profgen/CSSPGO. Please see https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s and https://reviews.llvm.org/D89707 for more context about CSSPGO and llvm-profgen.

**ELF section format**
Please see the encoding patch (https://reviews.llvm.org/D91878) for more details of the format; the example from that patch is reproduced here:

Two sections (`.pseudo_probe_desc` and `.pseudo_probe`) are emitted in ELF to support pseudo probes.
The `.pseudo_probe_desc` section has the following format:

```
.section   .pseudo_probe_desc,"",@progbits
.quad   6309742469962978389  // Func GUID
.quad   4294967295           // Func Hash
.byte   9                    // Length of func name
.ascii  "_Z5funcAi"          // Func name
.quad   7102633082150537521
.quad   138828622701
.byte   12
.ascii  "_Z8funcLeafi"
.quad   446061515086924981
.quad   4294967295
.byte   9
.ascii  "_Z5funcBi"
.quad   -2016976694713209516
.quad   72617220756
.byte   7
.ascii  "_Z3fibi"
```
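As a rough illustration of the layout above, the following standalone sketch walks a `.pseudo_probe_desc` byte buffer and collects one descriptor per function. It is not the code from this patch: `FuncDesc`, `readU64`, `readULEB128` and `decodeFuncDescs` are hypothetical names, and it assumes little-endian data. The example above emits the name length as `.byte`, while the decoder in this patch reads it through its ULEB128 helper; the two agree for lengths below 128.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical descriptor type; field names are illustrative only.
struct FuncDesc {
  uint64_t Guid;
  uint64_t Hash;
  std::string Name;
};

// Read a little-endian uint64 at Pos and advance Pos. Returns false on overrun.
bool readU64(const std::vector<uint8_t> &Buf, size_t &Pos, uint64_t &Out) {
  assert(Pos <= Buf.size() && "cursor must stay inside the buffer");
  if (Buf.size() - Pos < 8)
    return false;
  Out = 0;
  for (int I = 0; I < 8; ++I)
    Out |= static_cast<uint64_t>(Buf[Pos + I]) << (8 * I);
  Pos += 8;
  return true;
}

// Read an unsigned LEB128 value at Pos and advance Pos.
bool readULEB128(const std::vector<uint8_t> &Buf, size_t &Pos, uint64_t &Out) {
  Out = 0;
  for (unsigned Shift = 0; Pos < Buf.size() && Shift < 64; Shift += 7) {
    uint8_t Byte = Buf[Pos++];
    Out |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return true; // last byte of the encoding
  }
  return false; // truncated or over-long encoding
}

// Decode records of the form: GUID, hash, name length, name bytes.
bool decodeFuncDescs(const std::vector<uint8_t> &Buf,
                     std::vector<FuncDesc> &Out) {
  size_t Pos = 0;
  while (Pos < Buf.size()) {
    FuncDesc D;
    uint64_t NameLen = 0;
    if (!readU64(Buf, Pos, D.Guid) || !readU64(Buf, Pos, D.Hash) ||
        !readULEB128(Buf, Pos, NameLen))
      return false;
    if (Buf.size() - Pos < NameLen)
      return false; // name would run past the end of the section
    D.Name.assign(reinterpret_cast<const char *>(Buf.data() + Pos),
                  static_cast<size_t>(NameLen));
    Pos += static_cast<size_t>(NameLen);
    Out.push_back(D);
  }
  return true;
}
```

Returning `false` on a truncated record, instead of exiting as the tool does, is a choice made only to keep the sketch easy to test.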

For each `.pseudo_probe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e., a function with a code entry in the `.text` section). A function record has the following format:

```
FUNCTION BODY (one for each outlined function present in the text section)
    GUID (uint64)
        GUID of the function
    NPROBES (ULEB128)
        Number of probes originating from this function.
    NUM_INLINED_FUNCTIONS (ULEB128)
        Number of callees inlined into this function, aka number of
        first-level inlinees
    PROBE RECORDS
        A list of NPROBES entries. Each entry contains:
          INDEX (ULEB128)
          TYPE (uint4)
            0 - block probe, 1 - indirect call, 2 - direct call
          ATTRIBUTE (uint3)
            reserved
          ADDRESS_TYPE (uint1)
            0 - code address, 1 - address delta
          CODE_ADDRESS (uint64 or ULEB128)
            code address or address delta, depending on ADDRESS_TYPE
    INLINED FUNCTION RECORDS
        A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
        callees.  Each record contains:
          INLINE SITE
            GUID of the inlinee (uint64)
            ID of the callsite probe (ULEB128)
          FUNCTION BODY
            A FUNCTION BODY entry describing the inlined function.
```
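To make the bit packing of a probe record concrete, here is a minimal sketch that decodes one record from a byte buffer. It is an illustration under assumptions, not the patch's implementation: `ProbeRecord`, `readULEB128`, `readSLEB128` and `decodeProbe` are hypothetical names, and data is assumed little-endian. The format description lists the address delta as ULEB128, whereas the decoder in this patch reads it with a signed LEB128 helper; the sketch follows the decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view of one decoded probe record.
struct ProbeRecord {
  uint32_t Index;
  uint8_t Type;      // 0 - block, 1 - indirect call, 2 - direct call
  uint8_t Attribute; // the three attribute bits
  uint64_t Address;
};

bool readULEB128(const std::vector<uint8_t> &Buf, size_t &Pos, uint64_t &Out) {
  Out = 0;
  for (unsigned Shift = 0; Pos < Buf.size() && Shift < 64; Shift += 7) {
    uint8_t Byte = Buf[Pos++];
    Out |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return true;
  }
  return false;
}

bool readSLEB128(const std::vector<uint8_t> &Buf, size_t &Pos, int64_t &Out) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (Pos < Buf.size() && Shift < 64) {
    uint8_t Byte = Buf[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0) {
      // Sign-extend when bit 6 of the final byte is set.
      if (Shift < 64 && (Byte & 0x40))
        Value |= ~static_cast<uint64_t>(0) << Shift;
      Out = static_cast<int64_t>(Value);
      return true;
    }
  }
  return false;
}

// Decode INDEX, the packed TYPE/ATTRIBUTE/ADDRESS_TYPE byte, and the address.
// LastAddr carries the previous probe's address for delta-encoded records.
bool decodeProbe(const std::vector<uint8_t> &Buf, size_t &Pos,
                 uint64_t &LastAddr, ProbeRecord &Out) {
  assert(Pos <= Buf.size() && "cursor must stay inside the buffer");
  uint64_t Index = 0;
  if (!readULEB128(Buf, Pos, Index) || Index > 0xffffffffu ||
      Pos >= Buf.size())
    return false;
  Out.Index = static_cast<uint32_t>(Index);
  uint8_t Packed = Buf[Pos++];
  Out.Type = Packed & 0x0f;             // low 4 bits
  Out.Attribute = (Packed & 0x70) >> 4; // next 3 bits
  if (Packed & 0x80) {                  // top bit: address delta
    int64_t Delta = 0;
    if (!readSLEB128(Buf, Pos, Delta))
      return false;
    Out.Address = LastAddr + static_cast<uint64_t>(Delta);
  } else {                              // absolute 64-bit code address
    if (Buf.size() - Pos < 8)
      return false;
    uint64_t Addr = 0;
    for (int I = 0; I < 8; ++I)
      Addr |= static_cast<uint64_t>(Buf[Pos + I]) << (8 * I);
    Pos += 8;
    Out.Address = Addr;
  }
  LastAddr = Out.Address;
  return true;
}
```

Calling `decodeProbe` NPROBES times with a shared `LastAddr` would walk the PROBE RECORDS list of one function body.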

**Disassembling**
A new switch, `--show-pseudo-probe`, can be used together with `--show-disassembly` to print the disassembly annotated with pseudo probe directives.

For example:
```
00000000002011a0 <foo2>:
  2011a0: 50                    push   rax
  2011a1: 85 ff                 test   edi,edi
  [Probe]:  FUNC: foo2  Index: 1  Type: Block
  2011a3: 74 02                 je     2011a7 <foo2+0x7>
  [Probe]:  FUNC: foo2  Index: 3  Type: Block
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  [Probe]:  FUNC: foo   Index: 1  Type: Block  Inlined: @ foo2:6
  2011a5: 58                    pop    rax
  2011a6: c3                    ret
  [Probe]:  FUNC: foo2  Index: 2  Type: Block
  2011a7: bf 01 00 00 00        mov    edi,0x1
  [Probe]:  FUNC: foo2  Index: 5  Type: IndirectCall
  2011ac: ff d6                 call   rsi
  [Probe]:  FUNC: foo2  Index: 4  Type: Block
  2011ae: 58                    pop    rax
  2011af: c3                    ret
```

**Implementation**
- `PseudoProbeDecoder` is added to `ProfiledBinary` as the infrastructure for decoding. It decodes the two sections and generates two maps: `GUIDProbeFunctionMap` stores every `PseudoProbeFuncDesc`, the abstraction of a general function, and `AddressProbesMap` stores all pseudo probe info indexed by address.
- All inline info is encoded in the binary as a trie (`PseudoProbeInlineTree`), which is reconstructed during decoding. Each pseudo probe can get its inline context (`getInlineContext`) by traversing its inline tree node backwards.
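
The backward traversal can be sketched as follows. This is a simplified stand-in, not the patch's `PseudoProbeInlineTree`: `InlineNode` and `inlineContext` are hypothetical names, and the node stores the caller's name directly instead of a GUID that would be looked up in the descriptor map. Each node records the call site in its caller; walking parent links from the probe's node and reversing the result yields the context in caller-to-callee order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified tree node. A top-level (outlined) function has an
// empty CallerName, standing in for the dummy inline site.
struct InlineNode {
  std::string CallerName;
  uint32_t CallsiteIndex;
  const InlineNode *Parent;
};

// Build a string such as "main:2 @ foo:8" for a probe attached to Leaf.
std::string inlineContext(const InlineNode *Leaf) {
  assert(Leaf && "a probe must reference a tree node");
  std::vector<std::string> Stack;
  // Walk from the leaf towards the root, collecting one frame per inline site.
  for (const InlineNode *Cur = Leaf; Cur && !Cur->CallerName.empty();
       Cur = Cur->Parent)
    Stack.push_back(Cur->CallerName + ":" +
                    std::to_string(Cur->CallsiteIndex));
  // Frames were collected callee-first; flip to caller-to-callee order.
  std::reverse(Stack.begin(), Stack.end());
  std::string Result;
  for (const std::string &Frame : Stack) {
    if (!Result.empty())
      Result += " @ ";
    Result += Frame;
  }
  return Result;
}
```

For a chain where `foo` is inlined into `main` at call site probe 2 and `bar` into `foo` at call site probe 8, the leaf node for `bar` gives `main:2 @ foo:8`, which matches the `Inlined:` annotation format used in the disassembly output.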

Test Plan:
ninja & ninja check-llvm

Differential Revision: https://reviews.llvm.org/D92334

Added: 
    llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin
    llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test
    llvm/tools/llvm-profgen/PseudoProbe.cpp
    llvm/tools/llvm-profgen/PseudoProbe.h

Modified: 
    llvm/tools/llvm-profgen/CMakeLists.txt
    llvm/tools/llvm-profgen/ProfiledBinary.cpp
    llvm/tools/llvm-profgen/ProfiledBinary.h

Removed: 
    


################################################################################
diff  --git a/llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin b/llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin
new file mode 100755
index 000000000000..2b5fc0a9dfdd
Binary files /dev/null and b/llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin differ

diff  --git a/llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test b/llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test
new file mode 100644
index 000000000000..5feaa97032ab
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test
@@ -0,0 +1,121 @@
+; RUN: llvm-profgen --perfscript=%s  --binary=%S/Inputs/inline-cs-pseudoprobe.perfbin --output=%t --show-pseudo-probe --show-disassembly | FileCheck %s
+
+PERF_RECORD_MMAP2 2854748/2854748: [0x400000(0x1000) @ 0 00:1d 123291722 526021]: r-xp /home/inline-cs-pseudoprobe.perfbin
+
+; CHECK: Pseudo Probe Desc:
+; CHECK: GUID: 6699318081062747564 Name: foo
+; CHECK: Hash: 138950591924
+; CHECK: GUID: 15822663052811949562 Name: main
+; CHECK: Hash: 4294967295
+; CHECK: GUID: 16434608426314478903 Name: bar
+; CHECK: Hash: 72617220756
+
+
+
+; CHECK:      <bar>:
+
+; CHECK:       [Probe]: FUNC: bar Index: 1  Type: Block
+; CHECK-NEXT:      754: imull $2863311531, %edi, %eax
+
+; CHECK:       [Probe]: FUNC: bar Index: 2  Type: Block  Dangling
+; CHECK-NEXT:  [Probe]: FUNC: bar Index: 3  Type: Block  Dangling
+; CHECK-NEXT:      768: cmovbl  %esi, %ecx
+
+; CHECK:       [Probe]: FUNC: bar Index: 4  Type: Block
+; CHECK-NEXT:      76e: popq  %rbp
+
+
+; CHECK:      <foo>:
+; CHECK:       [Probe]: FUNC: foo Index: 1  Type: Block
+; CHECK-NEXT:  [Probe]: FUNC: foo Index: 2  Type: Block
+; CHECK-NEXT:      770: movl  $1, %ecx
+
+; CHECK:       [Probe]: FUNC: foo Index: 5  Type: Block
+; CHECK-NEXT:      780: addl  $30, %esi
+; CHECK:       [Probe]: FUNC: foo Index: 6  Type: Block
+; CHECK-NEXT:  [Probe]: FUNC: foo Index: 2  Type: Block
+; CHECK-NEXT:      783: addl  $1, %ecx
+
+; CHECK:       [Probe]: FUNC: foo Index: 3  Type: Block
+; CHECK-NEXT:      7a9: cmpl  %eax, %ecx
+
+; CHECK:       [Probe]: FUNC: foo Index: 4  Type: Block
+; CHECK-NEXT:  [Probe]: FUNC: bar Index: 1  Type: Block  Inlined: @ foo:8
+; CHECK-NEXT:      7bf: addl  %ecx, %edx
+
+; CHECK:       [Probe]: FUNC: bar Index: 2  Type: Block  Dangling  Inlined: @ foo:8
+; CHECK-NEXT:  [Probe]: FUNC: bar Index: 3  Type: Block  Dangling  Inlined: @ foo:8
+; CHECK-NEXT:      7c8: cmovel  %esi, %eax
+
+; CHECK:       [Probe]: FUNC: bar Index: 4  Type: Block  Inlined: @ foo:8
+; CHECK-NEXT:      7cd: movl  %eax, %esi
+; CHECK:       [Probe]: FUNC: foo Index: 6  Type: Block
+; CHECK-NEXT:  [Probe]: FUNC: foo Index: 2  Type: Block
+
+; CHECK:       [Probe]: FUNC: foo Index: 7  Type: Block
+; CHECK-NEXT:      7de: movl  $2098432, %edi
+
+; CHECK:       [Probe]: FUNC: foo Index: 9  Type: DirectCall
+; CHECK-NEXT:      7e5: callq 0x930
+
+
+; CHECK:      <main>:
+; CHECK:       [Probe]: FUNC: main Index: 1  Type: Block
+; CHECK-NEXT:  [Probe]: FUNC: foo Index: 1  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:  [Probe]: FUNC: foo Index: 2  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:      7f0: movl  $1, %ecx
+
+; CHECK:       [Probe]: FUNC: foo Index: 5  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:      800: addl  $30, %esi
+; CHECK:       [Probe]: FUNC: foo Index: 6  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:  [Probe]: FUNC: foo Index: 2  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:      803: addl  $1, %ecx
+
+; CHECK:       [Probe]: FUNC: foo Index: 3  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:      829: cmpl  %eax, %ecx
+
+; CHECK:       [Probe]: FUNC: foo Index: 4  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:  [Probe]: FUNC: bar Index: 1  Type: Block  Inlined: @ main:2 @ foo:8
+; CHECK-NEXT:      83f: addl  %ecx, %edx
+
+; CHECK:       [Probe]: FUNC: bar Index: 2  Type: Block  Dangling  Inlined: @ main:2 @ foo:8
+; CHECK-NEXT:  [Probe]: FUNC: bar Index: 3  Type: Block  Dangling  Inlined: @ main:2 @ foo:8
+; CHECK-NEXT:      848: cmovel  %esi, %eax
+
+; CHECK:       [Probe]: FUNC: bar Index: 4  Type: Block  Inlined: @ main:2 @ foo:8
+; CHECK-NEXT:      84d: movl  %eax, %esi
+; CHECK:       [Probe]: FUNC: foo Index: 6  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:  [Probe]: FUNC: foo Index: 2  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:      84f: addl  $1, %ecx
+
+; CHECK:       [Probe]: FUNC: foo Index: 7  Type: Block  Inlined: @ main:2
+; CHECK-NEXT:      85e: movl  $2098432, %edi
+
+; CHECK:       [Probe]: FUNC: foo Index: 9  Type: DirectCall  Inlined: @ main:2
+; CHECK-NEXT:      865: callq 0x930
+
+
+; clang -O3 -fexperimental-new-pass-manager -fuse-ld=lld -fpseudo-probe-for-profiling
+; -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Xclang -mdisable-tail-calls
+; -g test.c  -o a.out
+
+#include <stdio.h>
+
+int bar(int x, int y) {
+  if (x % 3) {
+    return x - y;
+  }
+  return x + y;
+}
+
+void foo() {
+  int s, i = 0;
+  while (i++ < 4000 * 4000)
+    if (i % 91) s = bar(i, s); else s += 30;
+  printf("sum is %d\n", s);
+}
+
+int main() {
+  foo();
+  return 0;
+}

diff  --git a/llvm/tools/llvm-profgen/CMakeLists.txt b/llvm/tools/llvm-profgen/CMakeLists.txt
index 4379a8b0e34e..e7705eb21c9f 100644
--- a/llvm/tools/llvm-profgen/CMakeLists.txt
+++ b/llvm/tools/llvm-profgen/CMakeLists.txt
@@ -17,4 +17,5 @@ add_llvm_tool(llvm-profgen
   PerfReader.cpp
   ProfiledBinary.cpp
   ProfileGenerator.cpp
+  PseudoProbe.cpp
   )

diff  --git a/llvm/tools/llvm-profgen/ProfiledBinary.cpp b/llvm/tools/llvm-profgen/ProfiledBinary.cpp
index 37b71fd42d70..96080e9298f3 100644
--- a/llvm/tools/llvm-profgen/ProfiledBinary.cpp
+++ b/llvm/tools/llvm-profgen/ProfiledBinary.cpp
@@ -29,6 +29,10 @@ static cl::opt<bool> ShowSourceLocations("show-source-locations",
                                          cl::ZeroOrMore,
                                          cl::desc("Print source locations."));
 
+static cl::opt<bool> ShowPseudoProbe(
+    "show-pseudo-probe", cl::ReallyHidden, cl::init(false), cl::ZeroOrMore,
+    cl::desc("Print pseudo probe section and disassembled info."));
+
 namespace llvm {
 namespace sampleprof {
 
@@ -93,6 +97,9 @@ void ProfiledBinary::load() {
   // Find the preferred base address for text sections.
   setPreferredBaseAddress(Obj);
 
+  // Decode pseudo probe related section
+  decodePseudoProbe(Obj);
+
   // Disassemble the text sections.
   disassemble(Obj);
 
@@ -165,6 +172,28 @@ void ProfiledBinary::setPreferredBaseAddress(const ELFObjectFileBase *Obj) {
   exitWithError("no text section found", Obj->getFileName());
 }
 
+void ProfiledBinary::decodePseudoProbe(const ELFObjectFileBase *Obj) {
+  StringRef FileName = Obj->getFileName();
+  for (section_iterator SI = Obj->section_begin(), SE = Obj->section_end();
+       SI != SE; ++SI) {
+    const SectionRef &Section = *SI;
+    StringRef SectionName = unwrapOrError(Section.getName(), FileName);
+
+    if (SectionName == ".pseudo_probe_desc") {
+      StringRef Contents = unwrapOrError(Section.getContents(), FileName);
+      ProbeDecoder.buildGUID2FuncDescMap(
+          reinterpret_cast<const uint8_t *>(Contents.data()), Contents.size());
+    } else if (SectionName == ".pseudo_probe") {
+      StringRef Contents = unwrapOrError(Section.getContents(), FileName);
+      ProbeDecoder.buildAddress2ProbeMap(
+          reinterpret_cast<const uint8_t *>(Contents.data()), Contents.size());
+    }
+  }
+
+  if (ShowPseudoProbe)
+    ProbeDecoder.printGUID2FuncDescMap(outs());
+}
+
 bool ProfiledBinary::dissassembleSymbol(std::size_t SI, ArrayRef<uint8_t> Bytes,
                                         SectionSymbolsTy &Symbols,
                                         const SectionRef &Section) {
@@ -193,6 +222,10 @@ bool ProfiledBinary::dissassembleSymbol(std::size_t SI, ArrayRef<uint8_t> Bytes,
       return false;
 
     if (ShowDisassembly) {
+      if (ShowPseudoProbe) {
+        ProbeDecoder.printProbeForAddress(outs(),
+                                          Offset + PreferredBaseAddress);
+      }
       outs() << format("%8" PRIx64 ":", Offset);
       size_t Start = outs().tell();
       IPrinter->printInst(&Inst, Offset + Size, "", *STI.get(), outs());

diff  --git a/llvm/tools/llvm-profgen/ProfiledBinary.h b/llvm/tools/llvm-profgen/ProfiledBinary.h
index add1a2269cda..6f32933631c5 100644
--- a/llvm/tools/llvm-profgen/ProfiledBinary.h
+++ b/llvm/tools/llvm-profgen/ProfiledBinary.h
@@ -10,6 +10,7 @@
 #define LLVM_TOOLS_LLVM_PROFGEN_PROFILEDBINARY_H
 
 #include "CallContext.h"
+#include "PseudoProbe.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/DebugInfo/Symbolize/Symbolize.h"
 #include "llvm/MC/MCAsmInfo.h"
@@ -128,8 +129,14 @@ class ProfiledBinary {
 
   // The symbolizer used to get inline context for an instruction.
   std::unique_ptr<symbolize::LLVMSymbolizer> Symbolizer;
+
+  // Pseudo probe decoder
+  PseudoProbeDecoder ProbeDecoder;
+
   void setPreferredBaseAddress(const ELFObjectFileBase *O);
 
+  void decodePseudoProbe(const ELFObjectFileBase *Obj);
+
   // Set up disassembler and related components.
   void setUpDisassembler(const ELFObjectFileBase *Obj);
   void setupSymbolizer();

diff  --git a/llvm/tools/llvm-profgen/PseudoProbe.cpp b/llvm/tools/llvm-profgen/PseudoProbe.cpp
new file mode 100644
index 000000000000..7c683315032e
--- /dev/null
+++ b/llvm/tools/llvm-profgen/PseudoProbe.cpp
@@ -0,0 +1,297 @@
+//===--- PseudoProbe.cpp - Pseudo probe decoding utilities  ------*- C++-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "PseudoProbe.h"
+#include "ErrorHandling.h"
+#include "llvm/Support/Endian.h"
+#include "llvm/Support/LEB128.h"
+#include "llvm/Support/raw_ostream.h"
+#include <limits>
+#include <memory>
+
+using namespace llvm;
+using namespace sampleprof;
+using namespace support;
+
+namespace llvm {
+namespace sampleprof {
+
+static StringRef getProbeFNameForGUID(const GUIDProbeFunctionMap &GUID2FuncMAP,
+                                      uint64_t GUID) {
+  auto It = GUID2FuncMAP.find(GUID);
+  assert(It != GUID2FuncMAP.end() &&
+         "Probe function must exist for a valid GUID");
+  return It->second.FuncName;
+}
+
+void PseudoProbeFuncDesc::print(raw_ostream &OS) {
+  OS << "GUID: " << FuncGUID << " Name: " << FuncName << "\n";
+  OS << "Hash: " << FuncHash << "\n";
+}
+
+void PseudoProbe::getInlineContext(SmallVector<std::string, 16> &ContextStack,
+                                   const GUIDProbeFunctionMap &GUID2FuncMAP,
+                                   bool ShowName) const {
+  uint32_t Begin = ContextStack.size();
+  PseudoProbeInlineTree *Cur = InlineTree;
+  // It will add the string of each node's inline site during iteration.
+  // Note that it won't include the probe's belonging function(leaf location)
+  while (!Cur->hasInlineSite()) {
+    std::string ContextStr;
+    if (ShowName) {
+      StringRef FuncName =
+          getProbeFNameForGUID(GUID2FuncMAP, std::get<0>(Cur->ISite));
+      ContextStr += FuncName.str();
+    } else {
+      ContextStr += Twine(std::get<0>(Cur->ISite)).str();
+    }
+    ContextStr += ":";
+    ContextStr += Twine(std::get<1>(Cur->ISite)).str();
+    ContextStack.emplace_back(ContextStr);
+    Cur = Cur->Parent;
+  }
+  // Make the ContextStack in caller-callee order
+  std::reverse(ContextStack.begin() + Begin, ContextStack.end());
+}
+
+std::string
+PseudoProbe::getInlineContextStr(const GUIDProbeFunctionMap &GUID2FuncMAP,
+                                 bool ShowName) const {
+  std::ostringstream OContextStr;
+  SmallVector<std::string, 16> ContextStack;
+  getInlineContext(ContextStack, GUID2FuncMAP, ShowName);
+  for (auto &CxtStr : ContextStack) {
+    if (OContextStr.str().size())
+      OContextStr << " @ ";
+    OContextStr << CxtStr;
+  }
+  return OContextStr.str();
+}
+
+static const char *PseudoProbeTypeStr[3] = {"Block", "IndirectCall",
+                                            "DirectCall"};
+
+void PseudoProbe::print(raw_ostream &OS,
+                        const GUIDProbeFunctionMap &GUID2FuncMAP,
+                        bool ShowName) {
+  OS << "FUNC: ";
+  if (ShowName) {
+    StringRef FuncName = getProbeFNameForGUID(GUID2FuncMAP, GUID);
+    OS << FuncName.str() << " ";
+  } else {
+    OS << GUID << " ";
+  }
+  OS << "Index: " << Index << "  ";
+  OS << "Type: " << PseudoProbeTypeStr[static_cast<uint8_t>(Type)] << "  ";
+  if (isDangling()) {
+    OS << "Dangling  ";
+  }
+  if (isTailCall()) {
+    OS << "TailCall  ";
+  }
+  std::string InlineContextStr = getInlineContextStr(GUID2FuncMAP, ShowName);
+  if (InlineContextStr.size()) {
+    OS << "Inlined: @ ";
+    OS << InlineContextStr;
+  }
+  OS << "\n";
+}
+
+template <typename T> T PseudoProbeDecoder::readUnencodedNumber() {
+  if (Data + sizeof(T) > End) {
+    exitWithError("Decode unencoded number error in " + SectionName +
+                  " section");
+  }
+  T Val = endian::readNext<T, little, unaligned>(Data);
+  return Val;
+}
+
+template <typename T> T PseudoProbeDecoder::readUnsignedNumber() {
+  unsigned NumBytesRead = 0;
+  uint64_t Val = decodeULEB128(Data, &NumBytesRead);
+  if (Val > std::numeric_limits<T>::max() || (Data + NumBytesRead > End)) {
+    exitWithError("Decode number error in " + SectionName + " section");
+  }
+  Data += NumBytesRead;
+  return static_cast<T>(Val);
+}
+
+template <typename T> T PseudoProbeDecoder::readSignedNumber() {
+  unsigned NumBytesRead = 0;
+  int64_t Val = decodeSLEB128(Data, &NumBytesRead);
+  if (Val > std::numeric_limits<T>::max() || (Data + NumBytesRead > End)) {
+    exitWithError("Decode number error in " + SectionName + " section");
+  }
+  Data += NumBytesRead;
+  return static_cast<T>(Val);
+}
+
+StringRef PseudoProbeDecoder::readString(uint32_t Size) {
+  StringRef Str(reinterpret_cast<const char *>(Data), Size);
+  if (Data + Size > End) {
+    exitWithError("Decode string error in " + SectionName + " section");
+  }
+  Data += Size;
+  return Str;
+}
+
+void PseudoProbeDecoder::buildGUID2FuncDescMap(const uint8_t *Start,
+                                               std::size_t Size) {
+  // The pseudo_probe_desc section has a format like:
+  // .section .pseudo_probe_desc,"",@progbits
+  // .quad -5182264717993193164   // GUID
+  // .quad 4294967295             // Hash
+  // .uleb 3                      // Name size
+  // .ascii "foo"                 // Name
+  // .quad -2624081020897602054
+  // .quad 174696971957
+  // .uleb 4
+  // .ascii "main"
+#ifndef NDEBUG
+  SectionName = "pseudo_probe_desc";
+#endif
+  Data = Start;
+  End = Data + Size;
+
+  while (Data < End) {
+    uint64_t GUID = readUnencodedNumber<uint64_t>();
+    uint64_t Hash = readUnencodedNumber<uint64_t>();
+    uint32_t NameSize = readUnsignedNumber<uint32_t>();
+    StringRef Name = readString(NameSize);
+
+    // Initialize PseudoProbeFuncDesc and populate it into GUID2FuncDescMap
+    GUID2FuncDescMap.emplace(GUID, PseudoProbeFuncDesc(GUID, Hash, Name));
+  }
+  assert(Data == End && "Have unprocessed data in pseudo_probe_desc section");
+}
+
+void PseudoProbeDecoder::buildAddress2ProbeMap(const uint8_t *Start,
+                                               std::size_t Size) {
+  // The pseudo_probe section encodes an inline forest and each tree has a
+  // format like:
+  //  FUNCTION BODY (one for each uninlined function present in the text
+  //  section)
+  //     GUID (uint64)
+  //         GUID of the function
+  //     NPROBES (ULEB128)
+  //         Number of probes originating from this function.
+  //     NUM_INLINED_FUNCTIONS (ULEB128)
+  //         Number of callees inlined into this function, aka number of
+  //         first-level inlinees
+  //     PROBE RECORDS
+  //         A list of NPROBES entries. Each entry contains:
+  //           INDEX (ULEB128)
+  //           TYPE (uint4)
+  //             0 - block probe, 1 - indirect call, 2 - direct call
+  //           ATTRIBUTE (uint3)
+  //             1 - tail call, 2 - dangling
+  //           ADDRESS_TYPE (uint1)
+  //             0 - code address, 1 - address delta
+  //           CODE_ADDRESS (uint64 or ULEB128)
+  //             code address or address delta, depending on Flag
+  //     INLINED FUNCTION RECORDS
+  //         A list of NUM_INLINED_FUNCTIONS entries describing each of the
+  //         inlined callees.  Each record contains:
+  //           INLINE SITE
+  //             GUID of the inlinee (uint64)
+  //             Index of the callsite probe (ULEB128)
+  //           FUNCTION BODY
+  //             A FUNCTION BODY entry describing the inlined function.
+#ifndef NDEBUG
+  SectionName = "pseudo_probe";
+#endif
+  Data = Start;
+  End = Data + Size;
+
+  PseudoProbeInlineTree *Root = &DummyInlineRoot;
+  PseudoProbeInlineTree *Cur = &DummyInlineRoot;
+  uint64_t LastAddr = 0;
+  uint32_t Index = 0;
+  // A DFS-based decoding
+  while (Data < End) {
+    // Read inline site for inlinees
+    if (Root != Cur) {
+      Index = readUnsignedNumber<uint32_t>();
+    }
+    // Switch/add to a new tree node(inlinee)
+    Cur = Cur->getOrAddNode({Cur->GUID, Index});
+    // Read guid
+    Cur->GUID = readUnencodedNumber<uint64_t>();
+    // Read number of probes in the current node.
+    uint32_t NodeCount = readUnsignedNumber<uint32_t>();
+    // Read number of direct inlinees
+    Cur->ChildrenToProcess = readUnsignedNumber<uint32_t>();
+    // Read all probes in this node
+    for (std::size_t I = 0; I < NodeCount; I++) {
+      // Read index
+      uint32_t Index = readUnsignedNumber<uint32_t>();
+      // Read type | flag.
+      uint8_t Value = readUnencodedNumber<uint8_t>();
+      uint8_t Kind = Value & 0xf;
+      uint8_t Attr = (Value & 0x70) >> 4;
+      // Read address
+      uint64_t Addr = 0;
+      if (Value & 0x80) {
+        int64_t Offset = readSignedNumber<int64_t>();
+        Addr = LastAddr + Offset;
+      } else {
+        Addr = readUnencodedNumber<int64_t>();
+      }
+      // Populate Address2ProbesMap
+      std::vector<PseudoProbe> &ProbeVec = Address2ProbesMap[Addr];
+      ProbeVec.emplace_back(Addr, Cur->GUID, Index, PseudoProbeType(Kind), Attr,
+                            Cur);
+      Cur->addProbes(&ProbeVec.back());
+      LastAddr = Addr;
+    }
+
+    // Look for the parent for the next node by subtracting the current
+    // node count from tree counts along the parent chain. The first node
+    // in the chain that has a non-zero tree count is the target.
+    while (Cur != Root) {
+      if (Cur->ChildrenToProcess == 0) {
+        Cur = Cur->Parent;
+        if (Cur != Root) {
+          assert(Cur->ChildrenToProcess > 0 &&
+                 "Should have some unprocessed nodes");
+          Cur->ChildrenToProcess -= 1;
+        }
+      } else {
+        break;
+      }
+    }
+  }
+
+  assert(Data == End && "Have unprocessed data in pseudo_probe section");
+  assert(Cur == Root &&
+         " Cur should point to root when the forest is fully built up");
+}
+
+void PseudoProbeDecoder::printGUID2FuncDescMap(raw_ostream &OS) {
+  OS << "Pseudo Probe Desc:\n";
+  // Make the output deterministic
+  std::map<uint64_t, PseudoProbeFuncDesc> OrderedMap(GUID2FuncDescMap.begin(),
+                                                     GUID2FuncDescMap.end());
+  for (auto &I : OrderedMap) {
+    I.second.print(OS);
+  }
+}
+
+void PseudoProbeDecoder::printProbeForAddress(raw_ostream &OS,
+                                              uint64_t Address) {
+  auto It = Address2ProbesMap.find(Address);
+  if (It != Address2ProbesMap.end()) {
+    for (auto &Probe : It->second) {
+      OS << " [Probe]:\t";
+      Probe.print(OS, GUID2FuncDescMap, true);
+    }
+  }
+}
+
+} // end namespace sampleprof
+} // end namespace llvm

diff  --git a/llvm/tools/llvm-profgen/PseudoProbe.h b/llvm/tools/llvm-profgen/PseudoProbe.h
new file mode 100644
index 000000000000..8a5f3cf441e1
--- /dev/null
+++ b/llvm/tools/llvm-profgen/PseudoProbe.h
@@ -0,0 +1,209 @@
+//===--- PseudoProbe.h - Pseudo probe decoding utilities ---------*- C++-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+#ifndef LLVM_TOOLS_LLVM_PROFGEN_PSEUDOPROBE_H
+#define LLVM_TOOLS_LLVM_PROFGEN_PSEUDOPROBE_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/Twine.h"
+#include "llvm/IR/PseudoProbe.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Transforms/IPO/SampleProfileProbe.h"
+#include <algorithm>
+#include <set>
+#include <sstream>
+#include <string>
+#include <unordered_map>
+#include <unordered_set>
+#include <vector>
+
+namespace llvm {
+namespace sampleprof {
+
+enum PseudoProbeAttributes { TAILCALL = 1, DANGLING = 2 };
+
+// Use func GUID and index as the location info of the inline site
+using InlineSite = std::tuple<uint64_t, uint32_t>;
+
+struct PseudoProbe;
+
+// Tree node to represent the inline relation and its inline site, we use a
+// dummy root in the PseudoProbeDecoder to lead the tree, the outlined
+// function will directly be the children of the dummy root. For the inlined
+// function, each inlinee will be connected to its inliner, then further to
+// its outlined function. Pseudo probes originating from the function stores the
+// tree's leaf node which we can process backwards to get its inline context
+class PseudoProbeInlineTree {
+  std::vector<PseudoProbe *> ProbeVector;
+
+  struct InlineSiteHash {
+    uint64_t operator()(const InlineSite &Site) const {
+      return std::get<0>(Site) ^ std::get<1>(Site);
+    }
+  };
+  std::unordered_map<InlineSite, std::unique_ptr<PseudoProbeInlineTree>,
+                     InlineSiteHash>
+      Children;
+
+public:
+  // Inlinee function GUID
+  uint64_t GUID = 0;
+  // Inline site to indicate the location in its inliner. As the node could also
+  // be an outlined function, it will use a dummy InlineSite whose GUID and
+  // Index is 0 connected to the dummy root
+  InlineSite ISite;
+  // Used for decoding
+  uint32_t ChildrenToProcess = 0;
+  // Caller node of the inline site
+  PseudoProbeInlineTree *Parent;
+
+  PseudoProbeInlineTree(){};
+  PseudoProbeInlineTree(const InlineSite &Site) : ISite(Site){};
+
+  PseudoProbeInlineTree *getOrAddNode(const InlineSite &Site) {
+    auto Ret =
+        Children.emplace(Site, std::make_unique<PseudoProbeInlineTree>(Site));
+    Ret.first->second->Parent = this;
+    return Ret.first->second.get();
+  }
+
+  void addProbes(PseudoProbe *Probe) { ProbeVector.push_back(Probe); }
+  // Note: as written, returns true for the dummy inline site (GUID == 0)
+  bool hasInlineSite() const { return !std::get<0>(ISite); }
+};
+
+// Function descriptor decoded from .pseudo_probe_desc section
+struct PseudoProbeFuncDesc {
+  uint64_t FuncGUID = 0;
+  uint64_t FuncHash = 0;
+  std::string FuncName;
+
+  PseudoProbeFuncDesc(uint64_t GUID, uint64_t Hash, StringRef Name)
+      : FuncGUID(GUID), FuncHash(Hash), FuncName(Name){};
+
+  void print(raw_ostream &OS);
+};
+
+// GUID to PseudoProbeFuncDesc map
+using GUIDProbeFunctionMap = std::unordered_map<uint64_t, PseudoProbeFuncDesc>;
+// Address to pseudo probes map.
+using AddressProbesMap = std::unordered_map<uint64_t, std::vector<PseudoProbe>>;
+
+/*
+A pseudo probe has the format like below:
+  INDEX (ULEB128)
+  TYPE (uint4)
+    0 - block probe, 1 - indirect call, 2 - direct call
+  ATTRIBUTE (uint3)
+    1 - tail call, 2 - dangling
+  ADDRESS_TYPE (uint1)
+    0 - code address, 1 - address delta
+  CODE_ADDRESS (uint64 or ULEB128)
+  code address or address delta, depending on Flag
+*/
+struct PseudoProbe {
+  uint64_t Address;
+  uint64_t GUID;
+  uint32_t Index;
+  PseudoProbeType Type;
+  uint8_t Attribute;
+  PseudoProbeInlineTree *InlineTree;
+  const static uint32_t PseudoProbeFirstId =
+      static_cast<uint32_t>(PseudoProbeReservedId::Last) + 1;
+
+  PseudoProbe(uint64_t Ad, uint64_t G, uint32_t I, PseudoProbeType K,
+              uint8_t At, PseudoProbeInlineTree *Tree)
+      : Address(Ad), GUID(G), Index(I), Type(K), Attribute(At),
+        InlineTree(Tree){};
+
+  bool isEntry() const { return Index == PseudoProbeFirstId; }
+
+  bool isDangling() const {
+    return Attribute & static_cast<uint8_t>(PseudoProbeAttributes::DANGLING);
+  }
+
+  bool isTailCall() const {
+    return Attribute & static_cast<uint8_t>(PseudoProbeAttributes::TAILCALL);
+  }
+
+  bool isBlock() const { return Type == PseudoProbeType::Block; }
+  bool isIndirectCall() const { return Type == PseudoProbeType::IndirectCall; }
+  bool isDirectCall() const { return Type == PseudoProbeType::DirectCall; }
+  bool isCall() const { return isIndirectCall() || isDirectCall(); }
+
+  // Get the inlined context by traversing current inline tree backwards,
+  // each tree node has its InlineSite which is taken as the context.
+  // \p ContextStack is populated in root to leaf order
+  void getInlineContext(SmallVector<std::string, 16> &ContextStack,
+                        const GUIDProbeFunctionMap &GUID2FuncMAP,
+                        bool ShowName) const;
+  // Helper function to get the string from context stack
+  std::string getInlineContextStr(const GUIDProbeFunctionMap &GUID2FuncMAP,
+                                  bool ShowName) const;
+  // Print pseudo probe while disassembling
+  void print(raw_ostream &OS, const GUIDProbeFunctionMap &GUID2FuncMAP,
+             bool ShowName);
+};
+
+/*
+Decode pseudo probe info from ELF section, used along with ELF reader
+Two sections are decoded here:
+  1) \fn buildGUID2FuncDescMap is responsible for .pseudo_probe_desc
+  section which encodes all function descriptors.
+  2) \fn buildAddress2ProbeMap is responsible for .pseudo_probe section
+    which encodes an inline function forest and each tree includes its
+    inlined function and all pseudo probes inside the function.
+see \file MCPseudoProbe.h for the details of the section encoding format.
+*/
+class PseudoProbeDecoder {
+  // GUID to PseudoProbeFuncDesc map.
+  GUIDProbeFunctionMap GUID2FuncDescMap;
+
+  // Address to probes map.
+  AddressProbesMap Address2ProbesMap;
+
+  // The dummy root of the inline trie, all the outlined function will directly
+  // be the children of the dummy root, all the inlined function will be the
+  // children of its inliner. So the relation would be like:
+  // DummyRoot --> OutlinedFunc --> InlinedFunc1 --> InlinedFunc2
+  PseudoProbeInlineTree DummyInlineRoot;
+
+  /// Points to the current location in the buffer.
+  const uint8_t *Data = nullptr;
+
+  /// Points to the end of the buffer.
+  const uint8_t *End = nullptr;
+
+#ifndef NDEBUG
+  /// SectionName used for debug
+  std::string SectionName;
+#endif
+
+  // Decoding helper function
+  template <typename T> T readUnencodedNumber();
+  template <typename T> T readUnsignedNumber();
+  template <typename T> T readSignedNumber();
+  StringRef readString(uint32_t Size);
+
+public:
+  // Decode pseudo_probe_desc section to build GUID to PseudoProbeFuncDesc map.
+  void buildGUID2FuncDescMap(const uint8_t *Start, std::size_t Size);
+
+  // Decode pseudo_probe section to build address to probes map.
+  void buildAddress2ProbeMap(const uint8_t *Start, std::size_t Size);
+
+  // Print pseudo_probe_desc section info
+  void printGUID2FuncDescMap(raw_ostream &OS);
+
+  // Print pseudo_probe section info, used along with show-disassembly
+  void printProbeForAddress(raw_ostream &OS, uint64_t Address);
+};
+
+} // end namespace sampleprof
+} // end namespace llvm
+
+#endif


        

