[llvm] 7df28fd - [SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Implements PGOAnalysisMap emitting in AsmPrinter with tests. (#75202)

via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 3 16:17:48 PST 2024


Author: Micah Weston
Date: 2024-01-03T19:17:44-05:00
New Revision: 7df28fd61aa4603846b3ce16f9f988ccc780a584

URL: https://github.com/llvm/llvm-project/commit/7df28fd61aa4603846b3ce16f9f988ccc780a584
DIFF: https://github.com/llvm/llvm-project/commit/7df28fd61aa4603846b3ce16f9f988ccc780a584.diff

LOG: [SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Implements PGOAnalysisMap emitting in AsmPrinter with tests. (#75202)

Uses machine analyses to emit PGOAnalysisMap into the bb-addr-map ELF
section. Implements filecheck tests to verify emitting new fields.

This patch emits optional PGO related analyses into the bb-addr-map ELF
section during AsmPrinter. This currently supports Function Entry Count,
Machine Block Frequencies. and Machine Branch Probabilities. Each is
independently enabled via the `feature` byte of `bb-addr-map` for the given
function.

A part of [RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch and Block Analysis](https://discourse.llvm.org/t/rfc-pgo-accuracy-metrics-emitting-and-evaluating-branch-and-block-analysis/73902).

Added: 
    llvm/test/CodeGen/X86/basic-block-sections-labels-pgo-features.ll

Modified: 
    llvm/docs/Extensions.rst
    llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
    llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll

Removed: 
    


################################################################################
diff  --git a/llvm/docs/Extensions.rst b/llvm/docs/Extensions.rst
index 6e94840897832a..74ca8cb0aa6879 100644
--- a/llvm/docs/Extensions.rst
+++ b/llvm/docs/Extensions.rst
@@ -451,6 +451,90 @@ Example:
    .uleb128  .LBB_END0_1-.LBB0_1          # BB_1 size
    .byte     y                            # BB_1 metadata
 
+PGO Analysis Map
+""""""""""""""""
+
+PGO related analysis data can be emitted after each function within the
+``SHT_LLVM_BB_ADDR_MAP`` through the optional ``pgo-analysis-map`` flag.
+Supported analyses currently are Function Entry Count, Basic Block Frequencies,
+and Branch Probabilities.
+
+Each analysis is enabled or disabled via a bit in the feature byte. Currently
+those bits are:
+
+#. Function Entry Count - Number of times the function was called as taken
+   from a PGO profile. This will always be zero if PGO was not used or the
+   function was not encountered in the profile.
+
+#. Basic Block Frequencies - Encoded as raw block frequency value taken from
+   MBFI analysis. This value is an integer that encodes the relative frequency
+   compared to the entry block. More information can be found in
+   'llvm/Support/BlockFrequency.h'.
+
+#. Branch Probabilities - Encoded as raw numerator for branch probability
+   taken from MBPI analysis. This value is the numerator for a fixed point ratio
+   defined in 'llvm/Support/BranchProbability.h'. It indicates the probability
+   that the block is followed by a given successor block during execution.
+
+This extra data requires version 2 or above. This is necessary since successors
+of basic blocks won't know their index but will know their BB ID.
+
+Example of BBAddrMap with PGO data:
+
+.. code-block:: gas
+
+  .section  ".llvm_bb_addr_map","", at llvm_bb_addr_map
+  .byte     2                             # version number
+  .byte     7                             # feature byte - PGO analyses enabled mask
+  .quad     .Lfunc_begin0                 # address of the function
+  .uleb128  4                             # number of basic blocks
+  # BB record for BB_0
+   .uleb128  0                            # BB_0 BB ID
+   .uleb128  .Lfunc_begin0-.Lfunc_begin0  # BB_0 offset relative to function entry (always zero)
+   .uleb128  .LBB_END0_0-.Lfunc_begin0    # BB_0 size
+   .byte     0x18                         # BB_0 metadata (multiple successors)
+  # BB record for BB_1
+   .uleb128  1                            # BB_1 BB ID
+   .uleb128  .LBB0_1-.LBB_END0_0          # BB_1 offset relative to the end of last block (BB_0).
+   .uleb128  .LBB_END0_1-.LBB0_1          # BB_1 size
+   .byte     0x0                          # BB_1 metadata (two successors)
+  # BB record for BB_2
+   .uleb128  2                            # BB_2 BB ID
+   .uleb128  .LBB0_2-.LBB_END1_0          # BB_2 offset relative to the end of last block (BB_1).
+   .uleb128  .LBB_END0_2-.LBB0_2          # BB_2 size
+   .byte     0x0                          # BB_2 metadata (one successor)
+  # BB record for BB_3
+   .uleb128  3                            # BB_3 BB ID
+   .uleb128  .LBB0_3-.LBB_END0_2          # BB_3 offset relative to the end of last block (BB_2).
+   .uleb128  .LBB_END0_3-.LBB0_3          # BB_3 size
+   .byte     0x0                          # BB_3 metadata (zero successors)
+  # PGO Analysis Map
+  .uleb128  1000                          # function entry count (only when enabled)
+  # PGO data record for BB_0
+   .uleb128  1000                         # BB_0 basic block frequency (only when enabled)
+   .uleb128  3                            # BB_0 successors count (only enabled with branch probabilities)
+   .uleb128  1                            # BB_0 successor 1 BB ID (only enabled with branch probabilities)
+   .uleb128  0x22222222                   # BB_0 successor 1 branch probability (only enabled with branch probabilities)
+   .uleb128  2                            # BB_0 successor 2 BB ID (only enabled with branch probabilities)
+   .uleb128  0x33333333                   # BB_0 successor 2 branch probability (only enabled with branch probabilities)
+   .uleb128  3                            # BB_0 successor 3 BB ID (only enabled with branch probabilities)
+   .uleb128  0xaaaaaaaa                   # BB_0 successor 3 branch probability (only enabled with branch probabilities)
+  # PGO data record for BB_1
+   .uleb128  133                          # BB_1 basic block frequency (only when enabled)
+   .uleb128  2                            # BB_1 successors count (only enabled with branch probabilities)
+   .uleb128  2                            # BB_1 successor 1 BB ID (only enabled with branch probabilities)
+   .uleb128  0x11111111                   # BB_1 successor 1 branch probability (only enabled with branch probabilities)
+   .uleb128  3                            # BB_1 successor 2 BB ID (only enabled with branch probabilities)
+   .uleb128  0x11111111                   # BB_1 successor 2 branch probability (only enabled with branch probabilities)
+  # PGO data record for BB_2
+   .uleb128  18                           # BB_2 basic block frequency (only when enabled)
+   .uleb128  1                            # BB_2 successors count (only enabled with branch probabilities)
+   .uleb128  3                            # BB_2 successor 1 BB ID (only enabled with branch probabilities)
+   .uleb128  0xffffffff                   # BB_2 successor 1 branch probability (only enabled with branch probabilities)
+  # PGO data record for BB_3
+   .uleb128  1000                         # BB_3 basic block frequency (only when enabled)
+   .uleb128  0                            # BB_3 successors count (only enabled with branch probabilities)
+
 ``SHT_LLVM_OFFLOADING`` Section (offloading data)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 This section stores the binary data used to perform offloading device linking

diff  --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 4dd27702786e42..7df1c82bf357f6 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -40,6 +40,7 @@
 #include "llvm/CodeGen/GCMetadataPrinter.h"
 #include "llvm/CodeGen/LazyMachineBlockFrequencyInfo.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineBranchProbabilityInfo.h"
 #include "llvm/CodeGen/MachineConstantPool.h"
 #include "llvm/CodeGen/MachineDominators.h"
 #include "llvm/CodeGen/MachineFrameInfo.h"
@@ -140,6 +141,26 @@ static cl::opt<std::string> BasicBlockProfileDump(
              "performed with -basic-block-sections=labels. Enabling this "
              "flag during in-process ThinLTO is not supported."));
 
+// This is a replication of fields of object::PGOAnalysisMap::Features. It
+// should match the order of the fields so that
+// `object::PGOAnalysisMap::Features::decode(PgoAnalysisMapFeatures.getBits())`
+// succeeds.
+enum class PGOMapFeaturesEnum {
+  FuncEntryCount,
+  BBFreq,
+  BrProb,
+};
+static cl::bits<PGOMapFeaturesEnum> PgoAnalysisMapFeatures(
+    "pgo-analysis-map", cl::Hidden, cl::CommaSeparated,
+    cl::values(clEnumValN(PGOMapFeaturesEnum::FuncEntryCount,
+                          "func-entry-count", "Function Entry Count"),
+               clEnumValN(PGOMapFeaturesEnum::BBFreq, "bb-freq",
+                          "Basic Block Frequency"),
+               clEnumValN(PGOMapFeaturesEnum::BrProb, "br-prob",
+                          "Branch Probability")),
+    cl::desc("Enable extended information within the BBAddrMap that is "
+             "extracted from PGO related analysis."));
+
 const char DWARFGroupName[] = "dwarf";
 const char DWARFGroupDescription[] = "DWARF Emission";
 const char DbgTimerName[] = "emit";
@@ -428,6 +449,7 @@ void AsmPrinter::getAnalysisUsage(AnalysisUsage &AU) const {
   AU.addRequired<MachineOptimizationRemarkEmitterPass>();
   AU.addRequired<GCModuleInfo>();
   AU.addRequired<LazyMachineBlockFrequencyInfoPass>();
+  AU.addRequired<MachineBranchProbabilityInfo>();
 }
 
 bool AsmPrinter::doInitialization(Module &M) {
@@ -1379,7 +1401,8 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
   uint8_t BBAddrMapVersion = OutStreamer->getContext().getBBAddrMapVersion();
   OutStreamer->emitInt8(BBAddrMapVersion);
   OutStreamer->AddComment("feature");
-  OutStreamer->emitInt8(0);
+  auto FeaturesBits = static_cast<uint8_t>(PgoAnalysisMapFeatures.getBits());
+  OutStreamer->emitInt8(FeaturesBits);
   OutStreamer->AddComment("function address");
   OutStreamer->emitSymbolValue(FunctionSymbol, getPointerSize());
   OutStreamer->AddComment("number of basic blocks");
@@ -1409,6 +1432,51 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
     OutStreamer->emitULEB128IntValue(getBBAddrMapMetadata(MBB));
     PrevMBBEndSymbol = MBB.getEndSymbol();
   }
+
+  if (FeaturesBits != 0) {
+    assert(BBAddrMapVersion >= 2 &&
+           "PGOAnalysisMap only supports version 2 or later");
+
+    auto FeatEnable =
+        cantFail(object::PGOAnalysisMap::Features::decode(FeaturesBits));
+
+    if (FeatEnable.FuncEntryCount) {
+      OutStreamer->AddComment("function entry count");
+      auto MaybeEntryCount = MF.getFunction().getEntryCount();
+      OutStreamer->emitULEB128IntValue(
+          MaybeEntryCount ? MaybeEntryCount->getCount() : 0);
+    }
+    const MachineBlockFrequencyInfo *MBFI =
+        FeatEnable.BBFreq
+            ? &getAnalysis<LazyMachineBlockFrequencyInfoPass>().getBFI()
+            : nullptr;
+    const MachineBranchProbabilityInfo *MBPI =
+        FeatEnable.BrProb ? &getAnalysis<MachineBranchProbabilityInfo>()
+                          : nullptr;
+
+    if (FeatEnable.BBFreq || FeatEnable.BrProb) {
+      for (const MachineBasicBlock &MBB : MF) {
+        if (FeatEnable.BBFreq) {
+          OutStreamer->AddComment("basic block frequency");
+          OutStreamer->emitULEB128IntValue(
+              MBFI->getBlockFreq(&MBB).getFrequency());
+        }
+        if (FeatEnable.BrProb) {
+          unsigned SuccCount = MBB.succ_size();
+          OutStreamer->AddComment("basic block successor count");
+          OutStreamer->emitULEB128IntValue(SuccCount);
+          for (const MachineBasicBlock *SuccMBB : MBB.successors()) {
+            OutStreamer->AddComment("successor BB ID");
+            OutStreamer->emitULEB128IntValue(SuccMBB->getBBID()->BaseID);
+            OutStreamer->AddComment("successor branch probability");
+            OutStreamer->emitULEB128IntValue(
+                MBPI->getEdgeProbability(&MBB, SuccMBB).getNumerator());
+          }
+        }
+      }
+    }
+  }
+
   OutStreamer->popSection();
 }
 
@@ -1934,8 +2002,14 @@ void AsmPrinter::emitFunctionBody() {
 
   // Emit section containing BB address offsets and their metadata, when
   // BB labels are requested for this function. Skip empty functions.
-  if (MF->hasBBLabels() && HasAnyRealCode)
-    emitBBAddrMapSection(*MF);
+  if (HasAnyRealCode) {
+    if (MF->hasBBLabels())
+      emitBBAddrMapSection(*MF);
+    else if (PgoAnalysisMapFeatures.getBits() != 0)
+      MF->getContext().reportWarning(
+          SMLoc(), "pgo-analysis-map is enabled for function " + MF->getName() +
+                       " but it does not have labels");
+  }
 
   // Emit sections containing instruction and function PCs.
   emitPCSections(*MF);

diff  --git a/llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll b/llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll
index 7b7bbb95fb4e21..42d09212e66916 100644
--- a/llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll
+++ b/llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll
@@ -1,5 +1,6 @@
 ;; Verify that the BB address map is not emitted for empty functions.
-; RUN: llc < %s -mtriple=x86_64 -basic-block-sections=labels | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64 -basic-block-sections=labels | FileCheck %s --check-prefixes=CHECK,BASIC
+; RUN: llc < %s -mtriple=x86_64 -basic-block-sections=labels -pgo-analysis-map=func-entry-count,bb-freq | FileCheck %s --check-prefixes=CHECK,PGO
 
 define void @empty_func() {
 entry:
@@ -19,5 +20,6 @@ entry:
 ; CHECK:	.Lfunc_begin1:
 ; CHECK:		.section	.llvm_bb_addr_map,"o", at llvm_bb_addr_map,.text{{$}}
 ; CHECK-NEXT:		.byte 2			# version
-; CHECK-NEXT:		.byte 0			# feature
+; BASIC-NEXT:		.byte 0			# feature
+; PGO-NEXT:		.byte 3			# feature
 ; CHECK-NEXT:		.quad	.Lfunc_begin1	# function address

diff  --git a/llvm/test/CodeGen/X86/basic-block-sections-labels-pgo-features.ll b/llvm/test/CodeGen/X86/basic-block-sections-labels-pgo-features.ll
new file mode 100644
index 00000000000000..92d3c88b4f6013
--- /dev/null
+++ b/llvm/test/CodeGen/X86/basic-block-sections-labels-pgo-features.ll
@@ -0,0 +1,127 @@
+; Check the basic block sections labels option
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels | FileCheck %s --check-prefixes=CHECK,BASIC
+
+;; Also verify this holds for all PGO features enabled
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=func-entry-count,bb-freq,br-prob | FileCheck %s --check-prefixes=CHECK,PGO-ALL,PGO-FEC,PGO-BBF,PGO-BRP
+
+;; Also verify that pgo extension only includes the enabled feature
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=func-entry-count | FileCheck %s --check-prefixes=CHECK,PGO-FEC,FEC-ONLY
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=bb-freq | FileCheck %s --check-prefixes=CHECK,PGO-BBF,BBF-ONLY
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=br-prob | FileCheck %s --check-prefixes=CHECK,PGO-BRP,BRP-ONLY
+
+
+define void @_Z3bazb(i1 zeroext, i1 zeroext) personality ptr @__gxx_personality_v0 !prof !0  {
+  br i1 %0, label %3, label %8, !prof !1
+
+3:
+  %4 = invoke i32 @_Z3barv()
+          to label %8 unwind label %6
+  br label %10
+
+6:
+  landingpad { ptr, i32 }
+          catch ptr null
+  br label %12
+
+8:
+  %9 = call i32 @_Z3foov()
+  br i1 %1, label %12, label %10, !prof !2
+
+10:
+  %11 = select i1 %1, ptr blockaddress(@_Z3bazb, %3), ptr blockaddress(@_Z3bazb, %12) ; <ptr> [#uses=1]
+  indirectbr ptr %11, [label %3, label %12], !prof !3
+
+12:
+  ret void
+}
+
+declare i32 @_Z3barv() #1
+
+declare i32 @_Z3foov() #1
+
+declare i32 @__gxx_personality_v0(...)
+
+!0 = !{!"function_entry_count", i64 100}
+!1 = !{!"branch_weights", i32 80, i32 20}
+!2 = !{!"branch_weights", i32 70, i32 10}
+!3 = !{!"branch_weights", i32 15, i32 5}
+
+; CHECK:	.section .text._Z3bazb,"ax", at progbits{{$}}
+; CHECK-LABEL:	_Z3bazb:
+; CHECK-LABEL:	.Lfunc_begin0:
+; CHECK-LABEL:	.LBB_END0_0:
+; CHECK-LABEL:	.LBB0_1:
+; CHECK-LABEL:	.LBB_END0_1:
+; CHECK-LABEL:	.LBB0_2:
+; CHECK-LABEL:	.LBB_END0_2:
+; CHECK-LABEL:	.LBB0_3:
+; CHECK-LABEL:	.LBB_END0_3:
+; CHECK-LABEL:	.Lfunc_end0:
+
+; CHECK: 	.section	.llvm_bb_addr_map,"o", at llvm_bb_addr_map,.text._Z3bazb{{$}}
+; CHECK-NEXT:	.byte	2		# version
+; BASIC-NEXT:	.byte	0		# feature
+; PGO-ALL-NEXT:	.byte	7		# feature
+; FEC-ONLY-NEXT:.byte	1		# feature
+; BBF-ONLY-NEXT:.byte	2		# feature
+; BRP-ONLY-NEXT:.byte	4		# feature
+; CHECK-NEXT:	.quad	.Lfunc_begin0	# function address
+; CHECK-NEXT:	.byte	6		# number of basic blocks
+; CHECK-NEXT:	.byte	0		# BB id
+; CHECK-NEXT:	.uleb128 .Lfunc_begin0-.Lfunc_begin0
+; CHECK-NEXT:	.uleb128 .LBB_END0_0-.Lfunc_begin0
+; CHECK-NEXT:	.byte	8
+; CHECK-NEXT:	.byte	1		# BB id
+; CHECK-NEXT:	.uleb128 .LBB0_1-.LBB_END0_0
+; CHECK-NEXT:	.uleb128 .LBB_END0_1-.LBB0_1
+; CHECK-NEXT:	.byte	8
+; CHECK-NEXT:	.byte	3		# BB id
+; CHECK-NEXT:	.uleb128 .LBB0_2-.LBB_END0_1
+; CHECK-NEXT:	.uleb128 .LBB_END0_2-.LBB0_2
+; CHECK-NEXT:	.byte	8
+; CHECK-NEXT:	.byte	5		# BB id
+; CHECK-NEXT:	.uleb128 .LBB0_3-.LBB_END0_2
+; CHECK-NEXT:	.uleb128 .LBB_END0_3-.LBB0_3
+; CHECK-NEXT:	.byte	1
+; CHECK-NEXT:	.byte	4		# BB id
+; CHECK-NEXT:	.uleb128 .LBB0_4-.LBB_END0_3
+; CHECK-NEXT:	.uleb128 .LBB_END0_4-.LBB0_4
+; CHECK-NEXT:	.byte	16
+; CHECK-NEXT:	.byte	2		# BB id
+; CHECK-NEXT:	.uleb128 .LBB0_5-.LBB_END0_4
+; CHECK-NEXT:	.uleb128 .LBB_END0_5-.LBB0_5
+; CHECK-NEXT:	.byte	4
+
+;; PGO Analysis Map
+; PGO-FEC-NEXT:	.byte	100		# function entry count
+; PGO-BBF-NEXT:	.ascii	"\271\235\376\332\245\200\356\017"	# basic block frequency
+; PGO-BRP-NEXT:	.byte	2		# basic block successor count
+; PGO-BRP-NEXT:	.byte	1		# successor BB ID
+; PGO-BRP-NEXT:	.ascii	"\346\314\231\263\006"	# successor branch probability
+; PGO-BRP-NEXT:	.byte	3		# successor BB ID
+; PGO-BRP-NEXT:	.ascii	"\232\263\346\314\001"	# successor branch probability
+; PGO-BBF-NEXT:	.ascii	"\202\301\341\375\205\200\200\003"	# basic block frequency
+; PGO-BRP-NEXT:	.byte	2		# basic block successor count
+; PGO-BRP-NEXT:	.byte	3		# successor BB ID
+; PGO-BRP-NEXT:	.ascii	"\200\360\377\377\007"	# successor branch probability
+; PGO-BRP-NEXT:	.byte	2		# successor BB ID
+; PGO-BRP-NEXT:	.ascii	"\200\020"	# successor branch probability
+; PGO-BBF-NEXT:	.ascii	"\200\200\200\200\200\200\200 "	# basic block frequency
+; PGO-BRP-NEXT:	.byte	2		# basic block successor count
+; PGO-BRP-NEXT:	.byte	5		# successor BB ID
+; PGO-BRP-NEXT:	.ascii	"\200\200\200\200\007"	# successor branch probability
+; PGO-BRP-NEXT:	.byte 	4		# successor BB ID
+; PGO-BRP-NEXT:	.ascii	"\200\200\200\200\001"	# successor branch probability
+; PGO-BBF-NEXT:	.ascii	"\271\235\376\332\245\200\356\017"	# basic block frequency
+; PGO-BRP-NEXT:	.byte	0		# basic block successor count
+; PGO-BBF-NEXT:	.ascii	"\210\214\356\257\200\200\230\002"	# basic block frequency
+; PGO-BRP-NEXT:	.byte	2		# basic block successor count
+; PGO-BRP-NEXT:	.byte 	1		# successor BB ID
+; PGO-BRP-NEXT:	.ascii	"\200\200\200\200\006"	# successor branch probability
+; PGO-BRP-NEXT:	.byte	5		# successor BB ID
+; PGO-BRP-NEXT:	.ascii	"\200\200\200\200\002"	# successor branch probability
+; PGO-BBF-NEXT:	.ascii	"\235\323\243\200#"	# basic block frequency
+; PGO-BRP-NEXT:	.byte	1		# basic block successor count
+; PGO-BRP-NEXT:	.byte	5		# successor BB ID
+; PGO-BRP-NEXT:	.ascii	"\200\200\200\200\b"	# successor branch probability
+


        


More information about the llvm-commits mailing list