[llvm] [PGO] Ensure non-zero entry-count after `populateCounters` (PR #112029)
Michael O'Farrell via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 22 13:34:23 PDT 2024
https://github.com/mofarrell updated https://github.com/llvm/llvm-project/pull/112029
From 320a89eb3f4cd3bd16e294e402aa4b6b21aeb94e Mon Sep 17 00:00:00 2001
From: Michael O'Farrell <micpof at gmail.com>
Date: Tue, 8 Oct 2024 10:41:11 -0700
Subject: [PATCH 1/2] [PGO] Gracefully handle zero entry-count
With sampled instrumentation (#69535), profile counts can appear
corrupt. In particular, a function can have zero counts for all of its
blocks while still having non-zero counters for select instrumentation.
This is only likely to happen for colder functions, so any action that
does not crash is reasonable. A simple modification that ensures the
entry count is non-zero (required by `fixFuncEntryCount`) is to set the
counter to one.
---
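The select fix-up in this patch can be illustrated in isolation. The
following is a minimal standalone sketch, not the code in
PGOInstrumentation.cpp: `SelectWeights` and `fixupSelect` are hypothetical
names introduced here, and the sketch assumes the only inputs are the
block count and the select's true-count. It raises an impossible block
count to the true-count and derives the false-count from the result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the values attached to one select instruction.
struct SelectWeights {
  uint64_t BlockCount; // count of the enclosing block after the fix-up
  uint64_t TrueCount;  // counter recorded for the select's true arm
  uint64_t FalseCount; // derived: BlockCount - TrueCount
};

// Hypothetical helper: a block cannot execute fewer times than the select
// inside it took its true arm, so raise the block count when it is smaller.
SelectWeights fixupSelect(uint64_t BlockCount, uint64_t TrueCount) {
  uint64_t FixedBlockCount = std::max(BlockCount, TrueCount);
  uint64_t FalseCount = FixedBlockCount - TrueCount;
  // After the fix-up the two arms always add up to the block count.
  assert(TrueCount + FalseCount == FixedBlockCount);
  return {FixedBlockCount, TrueCount, FalseCount};
}
```

With a block count of 0 and a true-count of 1, this sketch yields a block
count of 1 and weights of 1 and 0, which is the situation the commit
message describes.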
.../Instrumentation/PGOInstrumentation.cpp | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
index dbe908bb5e72f3..0da981a54593a5 100644
--- a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
+++ b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
@@ -1615,6 +1615,10 @@ void PGOUseFunc::populateCounters() {
assert(BI->Count && "BB count is not valid");
}
#endif
+ // Now annotate select instructions. This may fixup impossible block counts.
+ FuncInfo.SIVisitor.annotateSelects(this, &CountPosition);
+ assert(CountPosition == ProfileCountSize);
+
uint64_t FuncEntryCount = *getBBInfo(&*F.begin()).Count;
uint64_t FuncMaxCount = FuncEntryCount;
for (auto &BB : F) {
@@ -1630,10 +1634,6 @@ void PGOUseFunc::populateCounters() {
F.setEntryCount(ProfileCount(FuncEntryCount, Function::PCT_Real));
markFunctionAttributes(FuncEntryCount, FuncMaxCount);
- // Now annotate select instructions
- FuncInfo.SIVisitor.annotateSelects(this, &CountPosition);
- assert(CountPosition == ProfileCountSize);
-
LLVM_DEBUG(FuncInfo.dumpInfo("after reading profile."));
}
@@ -1742,8 +1742,13 @@ void SelectInstVisitor::annotateOneSelectInst(SelectInst &SI) {
++(*CurCtrIdx);
uint64_t TotalCount = 0;
auto BI = UseFunc->findBBInfo(SI.getParent());
- if (BI != nullptr)
+ if (BI != nullptr) {
TotalCount = *BI->Count;
+
+ // Fix the block count if it is impossible.
+ if (TotalCount < SCounts[0])
+ BI->Count = SCounts[0];
+ }
// False Count
SCounts[1] = (TotalCount > SCounts[0] ? TotalCount - SCounts[0] : 0);
uint64_t MaxCount = std::max(SCounts[0], SCounts[1]);
From bcc71abca1c66e0468e4eb9b799a4ff2062773ee Mon Sep 17 00:00:00 2001
From: Michael O'Farrell <micpof at gmail.com>
Date: Wed, 16 Oct 2024 12:25:15 -0700
Subject: [PATCH 2/2] [PGO] Test graceful handling of zero profile counts
This crashed before making fixFuncEntryCount gracefully handle zero
counts.
---
.../PGOProfile/fix_entry_count_sampled.ll | 42 +++++++++++++++++++
1 file changed, 42 insertions(+)
create mode 100644 llvm/test/Transforms/PGOProfile/fix_entry_count_sampled.ll
diff --git a/llvm/test/Transforms/PGOProfile/fix_entry_count_sampled.ll b/llvm/test/Transforms/PGOProfile/fix_entry_count_sampled.ll
new file mode 100644
index 00000000000000..91476d3a16b1cc
--- /dev/null
+++ b/llvm/test/Transforms/PGOProfile/fix_entry_count_sampled.ll
@@ -0,0 +1,42 @@
+; RUN: rm -rf %t && split-file %s %t
+
+; RUN: llvm-profdata merge %t/main.proftext -o %t/main.profdata
+; RUN: opt < %t/main.ll -passes=pgo-instr-use -pgo-test-profile-file=%t/main.profdata -S | FileCheck %s
+
+;--- main.ll
+
+; Instrumentation PGO sampling makes corrupt looking counters possible. This
+; tests one extreme case:
+; Test loading zero profile counts for all instrumented blocks while the entry
+; block is not instrumented. Additionally include a non-zero profile count for
+; a select instruction, which prevents short circuiting the PGO application.
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define i32 @test_no_entry_block_counter(i32 %n) {
+; CHECK: define i32 @test_no_entry_block_counter(i32 %n)
+; CHECK-SAME: !prof ![[ENTRY_COUNT:[0-9]*]]
+entry:
+ %cmp = icmp slt i32 42, %n
+ br i1 %cmp, label %tail1, label %tail2
+tail1:
+ %ret = select i1 true, i32 %n, i32 42
+; CHECK: %ret = select i1 true, i32 %n, i32 42
+; CHECK-SAME: !prof ![[BW_FOR_SELECT:[0-9]+]]
+ ret i32 %ret
+tail2:
+ ret i32 42
+}
+; CHECK: ![[ENTRY_COUNT]] = !{!"function_entry_count", i64 1}
+; CHECK: ![[BW_FOR_SELECT]] = !{!"branch_weights", i32 1, i32 0}
+
+;--- main.proftext
+:ir
+test_no_entry_block_counter
+431494656217155589
+3
+0
+0
+1
+