[llvm] [clang] [PGO][nfc] Add `-fdiagnostics-show-profile-count` option to show real loop count from instr-profile (PR #75021)
via cfe-commits
cfe-commits at lists.llvm.org
Sun Dec 10 19:22:38 PST 2023
llvmbot wrote:
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-llvm-ir
Author: Elvis Wang (ElvisWang123)
<details>
<summary>Changes</summary>
The existing `-fdiagnostics-show-hotness` option shows a relative loop count, which is calculated from the `function_entry_count` and `branch_frequency`. We want remarks to show the real loop iteration count collected in the instrumentation profile, so this patch adds a new option to expose it.
- Add a new metadata kind, `MD_prof_count`, which contains the runtime loop iteration count. For example:
```
loop.header:
...
br i1 %0, label %true, label %false, !prof.count !0
...
!0 = !{!"profile_count", i64 0}
```
- If the option `-fdiagnostics-show-profile-count` is set, attach the `MD_prof_count` metadata to the branch instruction at the header of each loop.
- Show the profile count in remarks, in the same way as hotness. For example:
```
remark: the cost-model indicates that interleaving is not beneficial
(ProfileCount: 20) [-Rpass-analysis=loop-vectorize]
38 | for(int i = 0; i < argc % 20; i++){
| ^
```
---
Patch is 25.56 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/75021.diff
24 Files Affected:
- (modified) clang/docs/UsersManual.rst (+27)
- (modified) clang/include/clang/Basic/DiagnosticDriverKinds.td (+3)
- (modified) clang/include/clang/Driver/Options.td (+3)
- (modified) clang/lib/CodeGen/CGStmt.cpp (+23-4)
- (modified) clang/lib/CodeGen/CodeGenAction.cpp (+4)
- (modified) clang/lib/CodeGen/CodeGenFunction.h (+1)
- (modified) clang/lib/CodeGen/CodeGenPGO.cpp (+12)
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+5)
- (modified) clang/lib/Frontend/CompilerInvocation.cpp (+6)
- (added) clang/test/Frontend/Inputs/optimization-remark-with-profile-count.proftext (+9)
- (added) clang/test/Frontend/optimization-remark-with-profile-count-new-pm.c (+41)
- (added) clang/test/Profile/Inputs/c-profile-count-metadata.proftext (+32)
- (added) clang/test/Profile/c-profile-count-metadata.c (+73)
- (modified) llvm/docs/LangRef.rst (+22)
- (modified) llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h (+7)
- (modified) llvm/include/llvm/IR/DiagnosticInfo.h (+6)
- (modified) llvm/include/llvm/IR/FixedMetadataKinds.def (+1)
- (modified) llvm/include/llvm/IR/MDBuilder.h (+3)
- (modified) llvm/include/llvm/Remarks/Remark.h (+4)
- (modified) llvm/lib/Analysis/OptimizationRemarkEmitter.cpp (+22)
- (modified) llvm/lib/IR/LLVMRemarkStreamer.cpp (+1)
- (modified) llvm/lib/IR/MDBuilder.cpp (+10)
- (modified) llvm/lib/Remarks/Remark.cpp (+2)
- (modified) llvm/lib/Remarks/YAMLRemarkSerializer.cpp (+5-2)
``````````diff
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index f1b344ef5109b5..59e2bdbd025000 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -425,6 +425,33 @@ output format of the diagnostics that it generates.
If this option is not used, all the passes are included in the optimization
record.
+.. option:: -fdiagnostics-show-profile-count
+
+ Enable profile loop count information in diagnostic line.
+
+ This option controls whether Clang prints the profile loop count associated
+ with diagnostics in the presence of profile-guided optimization information.
+ This is currently supported with optimization remarks (see
+ :ref:`Options to Emit Optimization Reports <rpass>`). The profile count information
+ allows users to focus on the hot optimization remarks that are likely to be
+ more relevant for run-time performance. The main difference between profile count
+ the hotness is the profile count is the real profile count from the runtime
+ profile and hotness is a relative number calculated by function entry count and
+ weight.
+
+ For example, in this output, the block containing the callsite of `foo` was
+ executed 3000 times according to the profile data:
+
+ ::
+
+ s.c:7:10: remark: foo inlined into bar (ProfileCount: 3000) [-Rpass-analysis=inline]
+ sum += foo(x, x - 2);
+ ^
+
+ This option is implied when
+ :ref:`-fsave-optimization-record <opt_fsave-optimization-record>` is used.
+ Otherwise, it defaults to off.
+
.. _opt_fdiagnostics-show-hotness:
.. option:: -f[no-]diagnostics-show-hotness
diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 676f1a62b49dd0..47ad1e058a1d82 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -420,6 +420,9 @@ def warn_drv_empty_joined_argument : Warning<
def warn_drv_diagnostics_hotness_requires_pgo : Warning<
"argument '%0' requires profile-guided optimization information">,
InGroup<UnusedCommandLineArgument>;
+def warn_drv_diagnostics_profile_count_requires_pgo : Warning<
+ "argument '%0' requires profile-guided optimization information">,
+ InGroup<UnusedCommandLineArgument>;
def warn_drv_diagnostics_misexpect_requires_pgo : Warning<
"argument '%0' requires profile-guided optimization information">,
InGroup<UnusedCommandLineArgument>;
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index b959fd20fe413d..78914e88350a5f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1933,6 +1933,9 @@ defm diagnostics_show_hotness : BoolFOption<"diagnostics-show-hotness",
PosFlag<SetTrue, [], [ClangOption, CC1Option],
"Enable profile hotness information in diagnostic line">,
NegFlag<SetFalse>>;
+def fdiagnostics_show_profile_count : Flag<["-"], "fdiagnostics-show-profile-count">,
+ Group<f_clang_Group>, Visibility<[ClangOption, CC1Option]>,
+ HelpText<"Show the real loop counts from the runtime profile">;
def fdiagnostics_hotness_threshold_EQ : Joined<["-"], "fdiagnostics-hotness-threshold=">,
Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
MetaVarName<"<value>">,
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index a5cb80640641bb..7cbb6e10cb1382 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -32,6 +32,7 @@
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/Intrinsics.h"
+#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/MDBuilder.h"
#include "llvm/Support/SaveAndRestore.h"
#include <optional>
@@ -923,7 +924,12 @@ void CodeGenFunction::EmitWhileStmt(const WhileStmt &S,
if (!Weights && CGM.getCodeGenOpts().OptimizationLevel)
BoolCondVal = emitCondLikelihoodViaExpectIntrinsic(
BoolCondVal, Stmt::getLikelihood(S.getBody()));
- Builder.CreateCondBr(BoolCondVal, LoopBody, ExitBlock, Weights);
+ auto *Branch =
+ Builder.CreateCondBr(BoolCondVal, LoopBody, ExitBlock, Weights);
+ // Appending the profle count metadata on the Branch instruction for the
+ // profile count
+ Branch->setMetadata(llvm::LLVMContext::MD_prof_count,
+ createProfileCount(getProfileCount(S.getBody())));
if (ExitBlock != LoopExit.getBlock()) {
EmitBlock(ExitBlock);
@@ -1014,9 +1020,13 @@ void CodeGenFunction::EmitDoStmt(const DoStmt &S,
// As long as the condition is true, iterate the loop.
if (EmitBoolCondBranch) {
uint64_t BackedgeCount = getProfileCount(S.getBody()) - ParentCount;
- Builder.CreateCondBr(
+ auto *Branch = Builder.CreateCondBr(
BoolCondVal, LoopBody, LoopExit.getBlock(),
createProfileWeightsForLoop(S.getCond(), BackedgeCount));
+ // Appending the profile count metadata on the Branch instruction for the
+ // profile count
+ Branch->setMetadata(llvm::LLVMContext::MD_prof_count,
+ createProfileCount(getProfileCount(S.getBody())));
}
LoopStack.pop();
@@ -1104,7 +1114,12 @@ void CodeGenFunction::EmitForStmt(const ForStmt &S,
BoolCondVal = emitCondLikelihoodViaExpectIntrinsic(
BoolCondVal, Stmt::getLikelihood(S.getBody()));
- Builder.CreateCondBr(BoolCondVal, ForBody, ExitBlock, Weights);
+ auto *Branch =
+ Builder.CreateCondBr(BoolCondVal, ForBody, ExitBlock, Weights);
+ // Appending the profile count metadata on the Branch instruction for the
+ // profile count
+ Branch->setMetadata(llvm::LLVMContext::MD_prof_count,
+ createProfileCount(getProfileCount(S.getBody())));
if (ExitBlock != LoopExit.getBlock()) {
EmitBlock(ExitBlock);
@@ -1188,7 +1203,11 @@ CodeGenFunction::EmitCXXForRangeStmt(const CXXForRangeStmt &S,
if (!Weights && CGM.getCodeGenOpts().OptimizationLevel)
BoolCondVal = emitCondLikelihoodViaExpectIntrinsic(
BoolCondVal, Stmt::getLikelihood(S.getBody()));
- Builder.CreateCondBr(BoolCondVal, ForBody, ExitBlock, Weights);
+ auto *Branch = Builder.CreateCondBr(BoolCondVal, ForBody, ExitBlock, Weights);
+ // Appending the profile count metadata on the Branch instruction for the
+ // profile count
+ Branch->setMetadata(llvm::LLVMContext::MD_prof_count,
+ createProfileCount(getProfileCount(S.getBody())));
if (ExitBlock != LoopExit.getBlock()) {
EmitBlock(ExitBlock);
diff --git a/clang/lib/CodeGen/CodeGenAction.cpp b/clang/lib/CodeGen/CodeGenAction.cpp
index bb6b1a3bc228cf..fbbd5bf25898e4 100644
--- a/clang/lib/CodeGen/CodeGenAction.cpp
+++ b/clang/lib/CodeGen/CodeGenAction.cpp
@@ -716,6 +716,10 @@ void BackendConsumer::EmitOptimizationMessage(
if (D.getHotness())
MsgStream << " (hotness: " << *D.getHotness() << ")";
+ if (D.getProfileCount()) {
+ MsgStream << " (ProfileCount: " << *D.getProfileCount() << ")";
+ }
+
Diags.Report(Loc, DiagID)
<< AddFlagValue(D.getPassName())
<< MsgStream.str();
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 618e78809db408..12cb0645f2f576 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -1527,6 +1527,7 @@ class CodeGenFunction : public CodeGenTypeCache {
llvm::MDNode *createProfileWeights(ArrayRef<uint64_t> Weights) const;
llvm::MDNode *createProfileWeightsForLoop(const Stmt *Cond,
uint64_t LoopCount) const;
+ llvm::MDNode *createProfileCount(uint64_t Count) const;
public:
/// Increment the profiler's counter for the given statement by \p StepV.
diff --git a/clang/lib/CodeGen/CodeGenPGO.cpp b/clang/lib/CodeGen/CodeGenPGO.cpp
index 81bf8ea696b164..8752f8e8269b88 100644
--- a/clang/lib/CodeGen/CodeGenPGO.cpp
+++ b/clang/lib/CodeGen/CodeGenPGO.cpp
@@ -23,6 +23,11 @@
#include "llvm/Support/MD5.h"
#include <optional>
+static llvm::cl::opt<bool> ClEnableProfileCountMetadata(
+ "enable-profile-count-metadata",
+ llvm::cl::desc("Appending real executation count of loops from runtime"),
+ llvm::cl::Hidden, llvm::cl::init(false));
+
static llvm::cl::opt<bool>
EnableValueProfiling("enable-value-profiling",
llvm::cl::desc("Enable value profiling"),
@@ -1122,3 +1127,10 @@ CodeGenFunction::createProfileWeightsForLoop(const Stmt *Cond,
return createProfileWeights(LoopCount,
std::max(*CondCount, LoopCount) - LoopCount);
}
+
+llvm::MDNode *CodeGenFunction::createProfileCount(uint64_t Count) const {
+ if (!PGO.haveRegionCounts() || !ClEnableProfileCountMetadata)
+ return nullptr;
+ llvm::MDBuilder MDHelper(CGM.getLLVMContext());
+ return MDHelper.createProfileCount(Count);
+}
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index eb26bfade47b7a..0dfa40cf92f288 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -4065,6 +4065,11 @@ static void RenderDiagnosticsOptions(const Driver &D, const ArgList &Args,
Args.addOptOutFlag(CmdArgs, options::OPT_fspell_checking,
options::OPT_fno_spell_checking);
+
+ // Show iteration counts of loops by runtime profile.
+ if (Args.hasArg(options::OPT_fdiagnostics_show_profile_count))
+ CmdArgs.append({"-fdiagnostics-show-profile-count", "-mllvm",
+ "-enable-profile-count-metadata"});
}
DwarfFissionKind tools::getDebugFissionKind(const Driver &D,
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index b33bdad2ad81ba..11568eb2bdbd4d 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -2091,6 +2091,12 @@ bool CompilerInvocation::ParseCodeGenArgs(CodeGenOptions &Opts, ArgList &Args,
bool UsingProfile =
UsingSampleProfile || !Opts.ProfileInstrumentUsePath.empty();
+ if (Args.hasArg(options::OPT_fdiagnostics_show_profile_count) &&
+ !UsingProfile && IK.getLanguage() != Language::LLVM_IR) {
+ Diags.Report(diag::warn_drv_diagnostics_profile_count_requires_pgo)
+ << "-fdiagnostics-show-profile-count";
+ }
+
if (Opts.DiagnosticsWithHotness && !UsingProfile &&
// An IR file will contain PGO as metadata
IK.getLanguage() != Language::LLVM_IR)
diff --git a/clang/test/Frontend/Inputs/optimization-remark-with-profile-count.proftext b/clang/test/Frontend/Inputs/optimization-remark-with-profile-count.proftext
new file mode 100644
index 00000000000000..1885f75326578a
--- /dev/null
+++ b/clang/test/Frontend/Inputs/optimization-remark-with-profile-count.proftext
@@ -0,0 +1,9 @@
+main
+# Func Hash:
+1160280
+# Num Counters:
+2
+# Counter Values:
+1
+20
+
diff --git a/clang/test/Frontend/optimization-remark-with-profile-count-new-pm.c b/clang/test/Frontend/optimization-remark-with-profile-count-new-pm.c
new file mode 100644
index 00000000000000..0cfe95a1a90d4c
--- /dev/null
+++ b/clang/test/Frontend/optimization-remark-with-profile-count-new-pm.c
@@ -0,0 +1,41 @@
+// Testing the remark output of the `-fdiagnostics-show-profile-count`.
+
+// Generate instrumentation and sampling profile data.
+// RUN: llvm-profdata merge \
+// RUN: %S/Inputs/optimization-remark-with-profile-count.proftext \
+// RUN: -o %t.profdata
+//
+// RUN: %clang -fprofile-instr-use=%t.profdata \
+// RUN: -O2 -Rpass=loop-vectorize -Rpass-analysis=loop-vectorize \
+// RUN: -Rpass-missed=loop-vecotrize \
+// RUN: -fdiagnostics-show-profile-count \
+// RUN: 2>&1 %s\
+// RUN: | FileCheck -check-prefix=SHOW_PROFILE_COUNT %s
+// RUN: %clang -fprofile-instr-use=%t.profdata \
+// RUN: -O2 -Rpass=loop-vectorize -Rpass-analysis=loop-vectorize \
+// RUN: -Rpass-missed=loop-vecotrize \
+// RUN: -fdiagnostics-show-profile-count -fdiagnostics-show-hotness \
+// RUN: 2>&1 %s\
+// RUN: | FileCheck -check-prefix=SHOW_PROFILE_COUNT_AND_HOTNESS %s
+// RUN: %clang \
+// RUN: -O2 -Rpass=loop-vectorize -Rpass-analysis=loop-vectorize \
+// RUN: -Rpass-missed=loop-vecotrize \
+// RUN: -fdiagnostics-show-profile-count \
+// RUN: 2>&1 %s\
+// RUN: | FileCheck -check-prefix=NO_PGO %s
+
+int sum = 0;
+int x[20] = {0, 112, 32, 11, 99, 88, 99, 88,34, 342, 85,99, 43, 75, 71, 871, 84, 65, 37, 98};
+
+// SHOW_PROFILE_COUNT_AND_HOTNESS: hotness: {{[0-9]+}}
+// SHOW_PROFILE_COUNT_AND_HOTNESS: ProfileCount: {{[0-9]+}}
+// SHOW_PROFILE_COUNT: ProfileCount: {{[0-9]+}}
+// NO_PGO: argument '-fdiagnostics-show-profile-count' requires profile-guided optimization information
+int main(int argc, const char *argv[]) {
+#pragma clang loop vectorize(enable)
+ for(int i = 0; i < argc % 20; i++){
+ sum += x[i];
+ sum += argc;
+ }
+ return sum;
+}
diff --git a/clang/test/Profile/Inputs/c-profile-count-metadata.proftext b/clang/test/Profile/Inputs/c-profile-count-metadata.proftext
new file mode 100644
index 00000000000000..d880663fed32d9
--- /dev/null
+++ b/clang/test/Profile/Inputs/c-profile-count-metadata.proftext
@@ -0,0 +1,32 @@
+never_called
+6820425066224770721
+9
+0
+0
+0
+0
+0
+0
+0
+0
+0
+
+main
+24
+1
+1
+
+dead_code
+5254464978620792806
+10
+1
+0
+0
+0
+0
+0
+0
+0
+0
+0
+
diff --git a/clang/test/Profile/c-profile-count-metadata.c b/clang/test/Profile/c-profile-count-metadata.c
new file mode 100644
index 00000000000000..99089cd5da3600
--- /dev/null
+++ b/clang/test/Profile/c-profile-count-metadata.c
@@ -0,0 +1,73 @@
+// Copy from c-unprofiled-blocks.c but testing the `MD_prof_count`, which will
+// will generate the MD node no matter the code is dead or not.
+
+// RUN: llvm-profdata merge %S/Inputs/c-profile-count-metadata.proftext -o %t.profdata
+// RUN: %clang_cc1 -mllvm -enable-profile-count-metadata -triple x86_64-apple-macosx10.9 \
+// RUN: -main-file-name c-profile-count-metadata.c %s -o - \
+// RUN: -emit-llvm -fprofile-instrument-use-path=%t.profdata | FileCheck -check-prefix=PGOUSE %s
+
+// PGOUSE-LABEL: @never_called(i32 noundef %i)
+int never_called(int i) {
+ // PGOUSE: br i1 %{{[^,]*}}, label %{{[^,]*}}, label %{{[^,]*}}{{$}}
+ if (i) {}
+
+ // PGOUSE: br i1 %{{[^,]*}}, label %{{[^,]*}}, label %{{[^,]*}}, !prof.count !{{[0-9]+}}{{$}}
+ for (i = 0; i < 100; ++i) {
+ }
+
+ // PGOUSE: br i1 %{{[^,]*}}, label %{{[^,]*}}, label %{{[^,]*}}, !prof.count !{{[0-9]+}}{{$}}
+ while (--i) {}
+
+ // PGOUSE: br i1 %{{[^,]*}}, label %{{[^,]*}}, label %{{[^,]*}}, !llvm.loop [[LOOP1:!.*]], !prof.count !{{[0-9]+}}
+ do {} while (i++ < 75);
+
+ // PGOUSE: switch {{.*}} [
+ // PGOUSE-NEXT: i32 12
+ // PGOUSE-NEXT: i32 82
+ // PGOUSE-NEXT: ]{{$}}
+ switch (i) {
+ case 12: return 3;
+ case 82: return 0;
+ default: return 89;
+ }
+}
+
+// PGOUSE-LABEL: @dead_code(i32 noundef %i)
+int dead_code(int i) {
+ // PGOUSE: br {{.*}}, !prof !{{[0-9]+}}
+ if (i) {
+ // This branch is never reached.
+
+ // PGOUSE: br i1 %{{[^,]*}}, label %{{[^,]*}}, label %{{[^,]*}}{{$}}
+ if (!i) {}
+
+ // PGOUSE: br i1 %{{[^,]*}}, label %{{[^,]*}}, label %{{[^,]*}}, !prof.count !{{[0-9]+}}{{$}}
+ for (i = 0; i < 100; ++i) {
+ }
+
+ // PGOUSE: br i1 %{{[^,]*}}, label %{{[^,]*}}, label %{{[^,]*}}, !prof.count !{{[0-9]+}}{{$}}
+ while (--i) {}
+
+ // PGOUSE: br i1 %{{[^,]*}}, label %{{[^,]*}}, label %{{[^,]*}}, !llvm.loop [[LOOP2:!.*]], !prof.count !{{[0-9]+}}
+ do {} while (i++ < 75);
+
+ // PGOUSE: switch {{.*}} [
+ // PGOUSE-NEXT: i32 12
+ // PGOUSE-NEXT: i32 82
+ // PGOUSE-NEXT: ]{{$}}
+ switch (i) {
+ case 12: return 3;
+ case 82: return 0;
+ default: return 89;
+ }
+ }
+ return 2;
+}
+
+// PGOUSE-LABEL: @main(i32 noundef %argc, ptr noundef %argv)
+int main(int argc, const char *argv[]) {
+ dead_code(0);
+ return 0;
+}
+
+// PGOUSE: !{{[0-9]+}} = !{!"profile_count", i64 0}
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index adda52b33c789b..7bf7f0728f1879 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -7649,6 +7649,28 @@ allocation itself) to the outermost callsite context required for uniquely
identifying the described profile behavior (note this may not be the top of
the profiled call stack).
+.. _md_prof_count:
+
+'``prof.count``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``prof.count`` metadata is used to record runtime profile data on loop
+executation times. The difference between ``prof`` metadata is this metadata record
+the real execution time in the runtime profile.
+
+Example:
+
+.. code-block:: text
+
+ for.header:
+ ...
+ br i1 %1, label %for.body, label %for.exit, !prof.count !0
+ !0 = !{!"profile_count", !64 100}
+
+Each of the ``prof.count`` metadata contains two element, the first element is the
+string indicate this metadata is for profile_count, and the second element is the
+executation times current loop runs at the runtime.
+
.. _md_callsite:
'``callsite``' Metadata
diff --git a/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h b/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h
index 8aaeaf29910293..e7daf72502877d 100644
--- a/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h
+++ b/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h
@@ -118,6 +118,13 @@ class OptimizationRemarkEmitter {
/// available.
std::optional<uint64_t> computeHotness(const Value *V);
+ /// Compute profile count from IR value (currently assumed to be a block) if
+ /// PGO is available.
+ std::optional<uint64_t> getProfileCount(const Value *V);
+
+ /// Similar but use value from \p OptDiag and update profile count there.
+ void getProfileCount(DiagnosticInfoIROptimization &OptDiag);
+
/// Similar but use value from \p OptDiag and update hotness there.
void computeHotness(DiagnosticInfoIROptimization &OptDiag);
diff --git a/llvm/include/llvm/IR/DiagnosticInfo.h b/llvm/include/llvm/IR/DiagnosticInfo.h
index 628445fe9fb2cc..125f9302b75edc 100644
--- a/llvm/include/llvm/IR/DiagnosticInfo.h
+++ b/llvm/include/llvm/IR/DiagnosticInfo.h
@@ -481,6 +481,8 @@ class DiagnosticInfoOptimizationBase : public DiagnosticInfoWithLocationBase {
std::string getMsg() const;
std::optional<uint64_t> getHotness() const { return Hotness; }
void setHotness(std::optional<uint64_t> H) { Hotness = H; }
+ std::optional<uint64_t> getProfileCount() const { return ProfileCount; }
+ void setProfileCount(std::optional<uint64_t> Count) { ProfileCount = Count; }
bool isVerbose() const { return IsVerbose; }
@@ -523,6 +525,10 @@ class DiagnosticInfoOptimizationBase : public DiagnosticInfoWithLocationBase {
/// corresponding code was executed in a profile instrumentation run.
std::optional<uint64_t> Hotness;
+ /// If profile information is available, this is the REAL number of times the
+ /// corresponding code was executed in a profile instrumentation run.
+ std::optional<uint64_t> ProfileCount;
+
/// Arguments collected via the streaming interface.
SmallVector<Argument, 4> Args;
diff --git a/llvm/include/llvm/IR/FixedMetadataKinds.def b/llvm/include/llvm/IR/FixedMetadataKinds.def
index b375d0f0912060..20abd5f19c8e3a 100644
--- a/llvm/include/llvm/IR/FixedMetadataKinds.def
+++ b/llvm/include/llvm/IR/FixedMetadataKinds.def
@@ -51,3 +51,4 @@ LLVM_FIXED_MD_KIND(MD_kcfi_type, "kcfi_type", 36)
LLVM_FIXED_MD_KIND(MD_pcsections, "pcsections", 37)
LLVM_FIXED_MD_KIND(MD_DIAssignID, "DIAssignID", 38)
LLVM_FIXED_MD_KIND(MD_coro_outside_frame, "coro.outside.frame", 39)
+LLVM_FIXED_MD_KIND(MD_prof_count, "prof.count", 40)
diff --git a/llvm/include/llvm/IR/MDBuilder.h b/llvm/include/llvm/IR/MDBuilder.h
index 39165453de16b0..adfc733baa67b1 100644
--- a/llvm/include/llvm/IR/MDBuilder.h
+++ b/llvm/include/llvm/IR/MDBuilder.h
@@ -67,6 +67,9 @@ class MDBuilder {
/// Return metadata specifying that a branch or switch is unpredictable.
MDNode *createUnpredictable();
+ /// Create the `profile_count` metadata at the branch instrucion
+ MDNode *createProfileCount(uint64_t Count);
+
/// Return metadata containing the entry \p Count for a function, a boolean
/// \Synthetic indicating whether the counts were synthetized, and the
/// GUIDs stored in \p Imports that need to be imported for sample PGO, to
diff --git a/llvm/include/llvm/Remarks/Remark.h b/llvm/include/llvm/Remarks/Remark.h
in...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/75021