[llvm] 7df28fd - [SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Implements PGOAnalysisMap emitting in AsmPrinter with tests. (#75202)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 3 16:17:48 PST 2024
Author: Micah Weston
Date: 2024-01-03T19:17:44-05:00
New Revision: 7df28fd61aa4603846b3ce16f9f988ccc780a584
URL: https://github.com/llvm/llvm-project/commit/7df28fd61aa4603846b3ce16f9f988ccc780a584
DIFF: https://github.com/llvm/llvm-project/commit/7df28fd61aa4603846b3ce16f9f988ccc780a584.diff
LOG: [SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Implements PGOAnalysisMap emitting in AsmPrinter with tests. (#75202)
Uses machine analyses to emit PGOAnalysisMap into the bb-addr-map ELF
section. Adds FileCheck tests to verify that the new fields are emitted.
This patch emits optional PGO related analyses into the bb-addr-map ELF
section during AsmPrinter. This currently supports Function Entry Count,
Machine Block Frequencies, and Machine Branch Probabilities. Each is
independently enabled via the `feature` byte of `bb-addr-map` for the given
function.
A part of [RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch and Block Analysis](https://discourse.llvm.org/t/rfc-pgo-accuracy-metrics-emitting-and-evaluating-branch-and-block-analysis/73902).
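
For illustration, the feature byte mentioned in the log message can be read as a three-bit mask. The sketch below is not code from this patch: the struct and function names are hypothetical, the bit positions are inferred from the feature values the new tests check (1 for func-entry-count, 2 for bb-freq, 4 for br-prob, 7 for all three), and rejecting unknown bits is an assumption of the sketch.

```cpp
#include <cstdint>

// Bit positions inferred from the feature values checked in the tests
// (1, 2, 4, and 7 when all are set). These names are hypothetical.
constexpr std::uint8_t FuncEntryCountBit = 1u << 0;
constexpr std::uint8_t BBFreqBit = 1u << 1;
constexpr std::uint8_t BrProbBit = 1u << 2;

// Hypothetical holder for the three independently enabled analyses.
struct PGOFeatures {
  bool FuncEntryCount;
  bool BBFreq;
  bool BrProb;
};

// Splits a feature byte into its three flags. Returns false, leaving Out
// untouched, if any bit outside the three known ones is set (an assumption
// of this sketch).
inline bool decodeFeatureByte(std::uint8_t Byte, PGOFeatures &Out) {
  const unsigned Known = FuncEntryCountBit | BBFreqBit | BrProbBit;
  if ((Byte & ~Known) != 0)
    return false;
  Out.FuncEntryCount = (Byte & FuncEntryCountBit) != 0;
  Out.BBFreq = (Byte & BBFreqBit) != 0;
  Out.BrProb = (Byte & BrProbBit) != 0;
  return true;
}
```

With this layout, the value 3 seen in the empty-function test corresponds to func-entry-count plus bb-freq, which is the combination that test passes on its command line.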
Added:
llvm/test/CodeGen/X86/basic-block-sections-labels-pgo-features.ll
Modified:
llvm/docs/Extensions.rst
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll
Removed:
################################################################################
diff --git a/llvm/docs/Extensions.rst b/llvm/docs/Extensions.rst
index 6e94840897832a..74ca8cb0aa6879 100644
--- a/llvm/docs/Extensions.rst
+++ b/llvm/docs/Extensions.rst
@@ -451,6 +451,90 @@ Example:
.uleb128 .LBB_END0_1-.LBB0_1 # BB_1 size
.byte y # BB_1 metadata
+PGO Analysis Map
+""""""""""""""""
+
+PGO related analysis data can be emitted after each function within the
+``SHT_LLVM_BB_ADDR_MAP`` through the optional ``pgo-analysis-map`` flag.
+Supported analyses currently are Function Entry Count, Basic Block Frequencies,
+and Branch Probabilities.
+
+Each analysis is enabled or disabled via a bit in the feature byte. Currently
+those bits are:
+
+#. Function Entry Count - Number of times the function was called as taken
+ from a PGO profile. This will always be zero if PGO was not used or the
+ function was not encountered in the profile.
+
+#. Basic Block Frequencies - Encoded as raw block frequency value taken from
+ MBFI analysis. This value is an integer that encodes the relative frequency
+ compared to the entry block. More information can be found in
+ 'llvm/Support/BlockFrequency.h'.
+
+#. Branch Probabilities - Encoded as raw numerator for branch probability
+ taken from MBPI analysis. This value is the numerator for a fixed point ratio
+ defined in 'llvm/Support/BranchProbability.h'. It indicates the probability
+ that the block is followed by a given successor block during execution.
+
+This extra data requires version 2 or above. This is necessary since successors
+of basic blocks won't know their index but will know their BB ID.
+
+Example of BBAddrMap with PGO data:
+
+.. code-block:: gas
+
+ .section ".llvm_bb_addr_map","",@llvm_bb_addr_map
+ .byte 2 # version number
+ .byte 7 # feature byte - PGO analyses enabled mask
+ .quad .Lfunc_begin0 # address of the function
+ .uleb128 4 # number of basic blocks
+ # BB record for BB_0
+ .uleb128 0 # BB_0 BB ID
+ .uleb128 .Lfunc_begin0-.Lfunc_begin0 # BB_0 offset relative to function entry (always zero)
+ .uleb128 .LBB_END0_0-.Lfunc_begin0 # BB_0 size
+ .byte 0x18 # BB_0 metadata (multiple successors)
+ # BB record for BB_1
+ .uleb128 1 # BB_1 BB ID
+ .uleb128 .LBB0_1-.LBB_END0_0 # BB_1 offset relative to the end of last block (BB_0).
+ .uleb128 .LBB_END0_1-.LBB0_1 # BB_1 size
+ .byte 0x0 # BB_1 metadata (two successors)
+ # BB record for BB_2
+ .uleb128 2 # BB_2 BB ID
+ .uleb128 .LBB0_2-.LBB_END1_0 # BB_2 offset relative to the end of last block (BB_1).
+ .uleb128 .LBB_END0_2-.LBB0_2 # BB_2 size
+ .byte 0x0 # BB_2 metadata (one successor)
+ # BB record for BB_3
+ .uleb128 3 # BB_3 BB ID
+ .uleb128 .LBB0_3-.LBB_END0_2 # BB_3 offset relative to the end of last block (BB_2).
+ .uleb128 .LBB_END0_3-.LBB0_3 # BB_3 size
+ .byte 0x0 # BB_3 metadata (zero successors)
+ # PGO Analysis Map
+ .uleb128 1000 # function entry count (only when enabled)
+ # PGO data record for BB_0
+ .uleb128 1000 # BB_0 basic block frequency (only when enabled)
+ .uleb128 3 # BB_0 successors count (only enabled with branch probabilities)
+ .uleb128 1 # BB_0 successor 1 BB ID (only enabled with branch probabilities)
+ .uleb128 0x22222222 # BB_0 successor 1 branch probability (only enabled with branch probabilities)
+ .uleb128 2 # BB_0 successor 2 BB ID (only enabled with branch probabilities)
+ .uleb128 0x33333333 # BB_0 successor 2 branch probability (only enabled with branch probabilities)
+ .uleb128 3 # BB_0 successor 3 BB ID (only enabled with branch probabilities)
+ .uleb128 0xaaaaaaaa # BB_0 successor 3 branch probability (only enabled with branch probabilities)
+ # PGO data record for BB_1
+ .uleb128 133 # BB_1 basic block frequency (only when enabled)
+ .uleb128 2 # BB_1 successors count (only enabled with branch probabilities)
+ .uleb128 2 # BB_1 successor 1 BB ID (only enabled with branch probabilities)
+ .uleb128 0x11111111 # BB_1 successor 1 branch probability (only enabled with branch probabilities)
+ .uleb128 3 # BB_1 successor 2 BB ID (only enabled with branch probabilities)
+ .uleb128 0x11111111 # BB_1 successor 2 branch probability (only enabled with branch probabilities)
+ # PGO data record for BB_2
+ .uleb128 18 # BB_2 basic block frequency (only when enabled)
+ .uleb128 1 # BB_2 successors count (only enabled with branch probabilities)
+ .uleb128 3 # BB_2 successor 1 BB ID (only enabled with branch probabilities)
+ .uleb128 0xffffffff # BB_2 successor 1 branch probability (only enabled with branch probabilities)
+ # PGO data record for BB_3
+ .uleb128 1000 # BB_3 basic block frequency (only when enabled)
+ .uleb128 0 # BB_3 successors count (only enabled with branch probabilities)
+
``SHT_LLVM_OFFLOADING`` Section (offloading data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This section stores the binary data used to perform offloading device linking
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 4dd27702786e42..7df1c82bf357f6 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -40,6 +40,7 @@
#include "llvm/CodeGen/GCMetadataPrinter.h"
#include "llvm/CodeGen/LazyMachineBlockFrequencyInfo.h"
#include "llvm/CodeGen/MachineBasicBlock.h"
+#include "llvm/CodeGen/MachineBranchProbabilityInfo.h"
#include "llvm/CodeGen/MachineConstantPool.h"
#include "llvm/CodeGen/MachineDominators.h"
#include "llvm/CodeGen/MachineFrameInfo.h"
@@ -140,6 +141,26 @@ static cl::opt<std::string> BasicBlockProfileDump(
"performed with -basic-block-sections=labels. Enabling this "
"flag during in-process ThinLTO is not supported."));
+// This is a replication of fields of object::PGOAnalysisMap::Features. It
+// should match the order of the fields so that
+// `object::PGOAnalysisMap::Features::decode(PgoAnalysisMapFeatures.getBits())`
+// succeeds.
+enum class PGOMapFeaturesEnum {
+ FuncEntryCount,
+ BBFreq,
+ BrProb,
+};
+static cl::bits<PGOMapFeaturesEnum> PgoAnalysisMapFeatures(
+ "pgo-analysis-map", cl::Hidden, cl::CommaSeparated,
+ cl::values(clEnumValN(PGOMapFeaturesEnum::FuncEntryCount,
+ "func-entry-count", "Function Entry Count"),
+ clEnumValN(PGOMapFeaturesEnum::BBFreq, "bb-freq",
+ "Basic Block Frequency"),
+ clEnumValN(PGOMapFeaturesEnum::BrProb, "br-prob",
+ "Branch Probability")),
+ cl::desc("Enable extended information within the BBAddrMap that is "
+ "extracted from PGO related analysis."));
+
const char DWARFGroupName[] = "dwarf";
const char DWARFGroupDescription[] = "DWARF Emission";
const char DbgTimerName[] = "emit";
@@ -428,6 +449,7 @@ void AsmPrinter::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<MachineOptimizationRemarkEmitterPass>();
AU.addRequired<GCModuleInfo>();
AU.addRequired<LazyMachineBlockFrequencyInfoPass>();
+ AU.addRequired<MachineBranchProbabilityInfo>();
}
bool AsmPrinter::doInitialization(Module &M) {
@@ -1379,7 +1401,8 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
uint8_t BBAddrMapVersion = OutStreamer->getContext().getBBAddrMapVersion();
OutStreamer->emitInt8(BBAddrMapVersion);
OutStreamer->AddComment("feature");
- OutStreamer->emitInt8(0);
+ auto FeaturesBits = static_cast<uint8_t>(PgoAnalysisMapFeatures.getBits());
+ OutStreamer->emitInt8(FeaturesBits);
OutStreamer->AddComment("function address");
OutStreamer->emitSymbolValue(FunctionSymbol, getPointerSize());
OutStreamer->AddComment("number of basic blocks");
@@ -1409,6 +1432,51 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
OutStreamer->emitULEB128IntValue(getBBAddrMapMetadata(MBB));
PrevMBBEndSymbol = MBB.getEndSymbol();
}
+
+ if (FeaturesBits != 0) {
+ assert(BBAddrMapVersion >= 2 &&
+ "PGOAnalysisMap only supports version 2 or later");
+
+ auto FeatEnable =
+ cantFail(object::PGOAnalysisMap::Features::decode(FeaturesBits));
+
+ if (FeatEnable.FuncEntryCount) {
+ OutStreamer->AddComment("function entry count");
+ auto MaybeEntryCount = MF.getFunction().getEntryCount();
+ OutStreamer->emitULEB128IntValue(
+ MaybeEntryCount ? MaybeEntryCount->getCount() : 0);
+ }
+ const MachineBlockFrequencyInfo *MBFI =
+ FeatEnable.BBFreq
+ ? &getAnalysis<LazyMachineBlockFrequencyInfoPass>().getBFI()
+ : nullptr;
+ const MachineBranchProbabilityInfo *MBPI =
+ FeatEnable.BrProb ? &getAnalysis<MachineBranchProbabilityInfo>()
+ : nullptr;
+
+ if (FeatEnable.BBFreq || FeatEnable.BrProb) {
+ for (const MachineBasicBlock &MBB : MF) {
+ if (FeatEnable.BBFreq) {
+ OutStreamer->AddComment("basic block frequency");
+ OutStreamer->emitULEB128IntValue(
+ MBFI->getBlockFreq(&MBB).getFrequency());
+ }
+ if (FeatEnable.BrProb) {
+ unsigned SuccCount = MBB.succ_size();
+ OutStreamer->AddComment("basic block successor count");
+ OutStreamer->emitULEB128IntValue(SuccCount);
+ for (const MachineBasicBlock *SuccMBB : MBB.successors()) {
+ OutStreamer->AddComment("successor BB ID");
+ OutStreamer->emitULEB128IntValue(SuccMBB->getBBID()->BaseID);
+ OutStreamer->AddComment("successor branch probability");
+ OutStreamer->emitULEB128IntValue(
+ MBPI->getEdgeProbability(&MBB, SuccMBB).getNumerator());
+ }
+ }
+ }
+ }
+ }
+
OutStreamer->popSection();
}
@@ -1934,8 +2002,14 @@ void AsmPrinter::emitFunctionBody() {
// Emit section containing BB address offsets and their metadata, when
// BB labels are requested for this function. Skip empty functions.
- if (MF->hasBBLabels() && HasAnyRealCode)
- emitBBAddrMapSection(*MF);
+ if (HasAnyRealCode) {
+ if (MF->hasBBLabels())
+ emitBBAddrMapSection(*MF);
+ else if (PgoAnalysisMapFeatures.getBits() != 0)
+ MF->getContext().reportWarning(
+ SMLoc(), "pgo-analysis-map is enabled for function " + MF->getName() +
+ " but it does not have labels");
+ }
// Emit sections containing instruction and function PCs.
emitPCSections(*MF);
diff --git a/llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll b/llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll
index 7b7bbb95fb4e21..42d09212e66916 100644
--- a/llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll
+++ b/llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll
@@ -1,5 +1,6 @@
;; Verify that the BB address map is not emitted for empty functions.
-; RUN: llc < %s -mtriple=x86_64 -basic-block-sections=labels | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64 -basic-block-sections=labels | FileCheck %s --check-prefixes=CHECK,BASIC
+; RUN: llc < %s -mtriple=x86_64 -basic-block-sections=labels -pgo-analysis-map=func-entry-count,bb-freq | FileCheck %s --check-prefixes=CHECK,PGO
define void @empty_func() {
entry:
@@ -19,5 +20,6 @@ entry:
; CHECK: .Lfunc_begin1:
; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text{{$}}
; CHECK-NEXT: .byte 2 # version
-; CHECK-NEXT: .byte 0 # feature
+; BASIC-NEXT: .byte 0 # feature
+; PGO-NEXT: .byte 3 # feature
; CHECK-NEXT: .quad .Lfunc_begin1 # function address
diff --git a/llvm/test/CodeGen/X86/basic-block-sections-labels-pgo-features.ll b/llvm/test/CodeGen/X86/basic-block-sections-labels-pgo-features.ll
new file mode 100644
index 00000000000000..92d3c88b4f6013
--- /dev/null
+++ b/llvm/test/CodeGen/X86/basic-block-sections-labels-pgo-features.ll
@@ -0,0 +1,127 @@
+; Check the basic block sections labels option
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels | FileCheck %s --check-prefixes=CHECK,BASIC
+
+;; Also verify this holds for all PGO features enabled
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=func-entry-count,bb-freq,br-prob | FileCheck %s --check-prefixes=CHECK,PGO-ALL,PGO-FEC,PGO-BBF,PGO-BRP
+
+;; Also verify that pgo extension only includes the enabled feature
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=func-entry-count | FileCheck %s --check-prefixes=CHECK,PGO-FEC,FEC-ONLY
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=bb-freq | FileCheck %s --check-prefixes=CHECK,PGO-BBF,BBF-ONLY
+; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=br-prob | FileCheck %s --check-prefixes=CHECK,PGO-BRP,BRP-ONLY
+
+
+define void @_Z3bazb(i1 zeroext, i1 zeroext) personality ptr @__gxx_personality_v0 !prof !0 {
+ br i1 %0, label %3, label %8, !prof !1
+
+3:
+ %4 = invoke i32 @_Z3barv()
+ to label %8 unwind label %6
+ br label %10
+
+6:
+ landingpad { ptr, i32 }
+ catch ptr null
+ br label %12
+
+8:
+ %9 = call i32 @_Z3foov()
+ br i1 %1, label %12, label %10, !prof !2
+
+10:
+ %11 = select i1 %1, ptr blockaddress(@_Z3bazb, %3), ptr blockaddress(@_Z3bazb, %12) ; <ptr> [#uses=1]
+ indirectbr ptr %11, [label %3, label %12], !prof !3
+
+12:
+ ret void
+}
+
+declare i32 @_Z3barv() #1
+
+declare i32 @_Z3foov() #1
+
+declare i32 @__gxx_personality_v0(...)
+
+!0 = !{!"function_entry_count", i64 100}
+!1 = !{!"branch_weights", i32 80, i32 20}
+!2 = !{!"branch_weights", i32 70, i32 10}
+!3 = !{!"branch_weights", i32 15, i32 5}
+
+; CHECK: .section .text._Z3bazb,"ax",@progbits{{$}}
+; CHECK-LABEL: _Z3bazb:
+; CHECK-LABEL: .Lfunc_begin0:
+; CHECK-LABEL: .LBB_END0_0:
+; CHECK-LABEL: .LBB0_1:
+; CHECK-LABEL: .LBB_END0_1:
+; CHECK-LABEL: .LBB0_2:
+; CHECK-LABEL: .LBB_END0_2:
+; CHECK-LABEL: .LBB0_3:
+; CHECK-LABEL: .LBB_END0_3:
+; CHECK-LABEL: .Lfunc_end0:
+
+; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text._Z3bazb{{$}}
+; CHECK-NEXT: .byte 2 # version
+; BASIC-NEXT: .byte 0 # feature
+; PGO-ALL-NEXT: .byte 7 # feature
+; FEC-ONLY-NEXT:.byte 1 # feature
+; BBF-ONLY-NEXT:.byte 2 # feature
+; BRP-ONLY-NEXT:.byte 4 # feature
+; CHECK-NEXT: .quad .Lfunc_begin0 # function address
+; CHECK-NEXT: .byte 6 # number of basic blocks
+; CHECK-NEXT: .byte 0 # BB id
+; CHECK-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0
+; CHECK-NEXT: .uleb128 .LBB_END0_0-.Lfunc_begin0
+; CHECK-NEXT: .byte 8
+; CHECK-NEXT: .byte 1 # BB id
+; CHECK-NEXT: .uleb128 .LBB0_1-.LBB_END0_0
+; CHECK-NEXT: .uleb128 .LBB_END0_1-.LBB0_1
+; CHECK-NEXT: .byte 8
+; CHECK-NEXT: .byte 3 # BB id
+; CHECK-NEXT: .uleb128 .LBB0_2-.LBB_END0_1
+; CHECK-NEXT: .uleb128 .LBB_END0_2-.LBB0_2
+; CHECK-NEXT: .byte 8
+; CHECK-NEXT: .byte 5 # BB id
+; CHECK-NEXT: .uleb128 .LBB0_3-.LBB_END0_2
+; CHECK-NEXT: .uleb128 .LBB_END0_3-.LBB0_3
+; CHECK-NEXT: .byte 1
+; CHECK-NEXT: .byte 4 # BB id
+; CHECK-NEXT: .uleb128 .LBB0_4-.LBB_END0_3
+; CHECK-NEXT: .uleb128 .LBB_END0_4-.LBB0_4
+; CHECK-NEXT: .byte 16
+; CHECK-NEXT: .byte 2 # BB id
+; CHECK-NEXT: .uleb128 .LBB0_5-.LBB_END0_4
+; CHECK-NEXT: .uleb128 .LBB_END0_5-.LBB0_5
+; CHECK-NEXT: .byte 4
+
+;; PGO Analysis Map
+; PGO-FEC-NEXT: .byte 100 # function entry count
+; PGO-BBF-NEXT: .ascii "\271\235\376\332\245\200\356\017" # basic block frequency
+; PGO-BRP-NEXT: .byte 2 # basic block successor count
+; PGO-BRP-NEXT: .byte 1 # successor BB ID
+; PGO-BRP-NEXT: .ascii "\346\314\231\263\006" # successor branch probability
+; PGO-BRP-NEXT: .byte 3 # successor BB ID
+; PGO-BRP-NEXT: .ascii "\232\263\346\314\001" # successor branch probability
+; PGO-BBF-NEXT: .ascii "\202\301\341\375\205\200\200\003" # basic block frequency
+; PGO-BRP-NEXT: .byte 2 # basic block successor count
+; PGO-BRP-NEXT: .byte 3 # successor BB ID
+; PGO-BRP-NEXT: .ascii "\200\360\377\377\007" # successor branch probability
+; PGO-BRP-NEXT: .byte 2 # successor BB ID
+; PGO-BRP-NEXT: .ascii "\200\020" # successor branch probability
+; PGO-BBF-NEXT: .ascii "\200\200\200\200\200\200\200 " # basic block frequency
+; PGO-BRP-NEXT: .byte 2 # basic block successor count
+; PGO-BRP-NEXT: .byte 5 # successor BB ID
+; PGO-BRP-NEXT: .ascii "\200\200\200\200\007" # successor branch probability
+; PGO-BRP-NEXT: .byte 4 # successor BB ID
+; PGO-BRP-NEXT: .ascii "\200\200\200\200\001" # successor branch probability
+; PGO-BBF-NEXT: .ascii "\271\235\376\332\245\200\356\017" # basic block frequency
+; PGO-BRP-NEXT: .byte 0 # basic block successor count
+; PGO-BBF-NEXT: .ascii "\210\214\356\257\200\200\230\002" # basic block frequency
+; PGO-BRP-NEXT: .byte 2 # basic block successor count
+; PGO-BRP-NEXT: .byte 1 # successor BB ID
+; PGO-BRP-NEXT: .ascii "\200\200\200\200\006" # successor branch probability
+; PGO-BRP-NEXT: .byte 5 # successor BB ID
+; PGO-BRP-NEXT: .ascii "\200\200\200\200\002" # successor branch probability
+; PGO-BBF-NEXT: .ascii "\235\323\243\200#" # basic block frequency
+; PGO-BRP-NEXT: .byte 1 # basic block successor count
+; PGO-BRP-NEXT: .byte 5 # successor BB ID
+; PGO-BRP-NEXT: .ascii "\200\200\200\200\b" # successor branch probability
+
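
As a reading aid for the checks above: the PGO fields are emitted as ULEB128 values, which the assembly output prints as octal-escaped .ascii strings when they span several bytes. The sketch below is a standalone decoder, not LLVM's own. The function names are hypothetical, and it assumes the branch probability denominator is 2^31, as defined in llvm/Support/BranchProbability.h.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes one ULEB128 value from Bytes starting at Pos and advances Pos
// past it. Returns false if the input is truncated or the value does not
// fit in 64 bits.
inline bool decodeULEB128(const std::vector<std::uint8_t> &Bytes,
                          std::size_t &Pos, std::uint64_t &Value) {
  Value = 0;
  unsigned Shift = 0;
  while (Pos < Bytes.size()) {
    const std::uint8_t Byte = Bytes[Pos++];
    const std::uint64_t Slice = Byte & 0x7f; // low 7 bits carry payload
    if (Shift >= 64 || ((Slice << Shift) >> Shift) != Slice)
      return false; // would overflow 64 bits
    Value |= Slice << Shift;
    if ((Byte & 0x80) == 0) // high bit clear marks the last byte
      return true;
    Shift += 7;
  }
  return false; // ran out of bytes before the terminating byte
}

// Converts a raw branch probability numerator to a fraction, assuming the
// fixed denominator 2^31.
inline double toProbability(std::uint64_t Numerator) {
  return static_cast<double>(Numerator) / static_cast<double>(1u << 31);
}
```

Under those assumptions, the string "\200\200\200\200\007" (bytes 0x80 0x80 0x80 0x80 0x07) decodes to 0x70000000, a probability of 7/8, which agrees with the branch_weights of 70 and 10 in !2. Likewise "\200\200\200\200\b" decodes to 0x80000000, a probability of 1.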