[clang] [llvm] Update documentation and release notes for llvm-profgen COFF support (PR #84864)
Tim Creech via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 21 08:46:59 PDT 2024
https://github.com/tcreech-intel updated https://github.com/llvm/llvm-project/pull/84864
>From 4dc108d0d290ee5fd6a73c029c051fdb2215d00a Mon Sep 17 00:00:00 2001
From: Tim Creech <timothy.m.creech at intel.com>
Date: Mon, 11 Mar 2024 22:35:59 -0400
Subject: [PATCH 1/5] Update documentation and release notes for llvm-profgen
COFF support
This change:
- Updates the existing Clang User's Manual section on SPGO so that it
describes how to use llvm-profgen to perform SPGO on Windows. This is
new functionality implemented in #83972.
- Fixes a minor typo in the existing llvm-profgen invocation example.
- Adds an LLVM release note on this new functionality in llvm-profgen.
---
clang/docs/UsersManual.rst | 47 +++++++++++++++++++++++++++++++-------
llvm/docs/ReleaseNotes.rst | 5 ++++
2 files changed, 44 insertions(+), 8 deletions(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 7391e4cf3a9aeb..9cf313c3727125 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2410,20 +2410,35 @@ usual build cycle when using sample profilers for optimization:
1. Build the code with source line table information. You can use all the
usual build flags that you always build your application with. The only
- requirement is that you add ``-gline-tables-only`` or ``-g`` to the
- command line. This is important for the profiler to be able to map
- instructions back to source line locations.
+ requirement is that DWARF debug info including source line information is
+ generated. This DWARF information is important for the profiler to be able
+ to map instructions back to source line locations.
+
+ On Linux, ``-g`` or just ``-gline-tables-only`` is sufficient:
.. code-block:: console
$ clang++ -O2 -gline-tables-only code.cc -o code
+ It is also possible to include DWARF in Windows binaries:
+
+ .. code-block:: console
+
+ $ clang-cl -O2 -gdwarf -gline-tables-only coff-profile.cpp -fuse-ld=lld -link -debug:dwarf
+
2. Run the executable under a sampling profiler. The specific profiler
you use does not really matter, as long as its output can be converted
- into the format that the LLVM optimizer understands. Currently, there
- exists a conversion tool for the Linux Perf profiler
- (https://perf.wiki.kernel.org/), so these examples assume that you
- are using Linux Perf to profile your code.
+ into the format that the LLVM optimizer understands.
+
+ Two such profilers are the Linux Perf profiler
+ (https://perf.wiki.kernel.org/) and Intel's Sampling Enabling Product (SEP),
+ available as part of `Intel VTune
+ <https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html>`_.
+
+ The LLVM tool ``llvm-profgen`` can convert output of either Perf or SEP. An
+ external tool, AutoFDO, also supports Linux Perf output.
+
+ When using Perf:
.. code-block:: console
@@ -2434,6 +2449,15 @@ usual build cycle when using sample profilers for optimization:
it provides better call information, which improves the accuracy of
the profile data.
+ When using SEP:
+
+ .. code-block:: console
+
+ $ sep -start -ec BR_INST_RETIRED.NEAR_TAKEN:precise=yes:pdir -lbr no_filter:usr -perf-script ip,brstack -app ./code
+
+ This produces a ``perf.data.script`` output which can be used with
+ ``llvm-profgen``'s ``--perfscript`` input option.
+
3. Convert the collected profile data to LLVM's sample profile format.
This is currently supported via the AutoFDO converter ``create_llvm_prof``.
It is available at https://github.com/google/autofdo. Once built and
@@ -2454,7 +2478,14 @@ usual build cycle when using sample profilers for optimization:
.. code-block:: console
- $ llvm-profgen --binary=./code --output=code.prof--perfdata=perf.data
+ $ llvm-profgen --binary=./code --output=code.prof --perfdata=perf.data
+
+ When using SEP the output is in the textual format corresponding to
+ `llvm-profgen --perfscript`. For example:
+
+ .. code-block:: console
+
+ $ llvm-profgen --binary=./code --output=code.prof --perfscript=perf.data.script
4. Build the code again using the collected profile. This step feeds
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index b34a5f31c5eb0a..c2bbc647bc18e6 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -157,6 +157,11 @@ Changes to the LLVM tools
``--set-symbols-visibility`` options for ELF input to change the
visibility of symbols.
+* llvm-profgen now supports COFF+DWARF binaries. This enables Sample-based PGO
+ on Windows using Intel VTune's SEP. For details on usage, see the `end-user
+ documentation for SPGO
+ <https://clang.llvm.org/docs/UsersManual.html#using-sampling-profilers>`_.
+
Changes to LLDB
---------------------------------
>From 53f4c5dc84d71fd4efa5384818ecfc3401a0e7f6 Mon Sep 17 00:00:00 2001
From: Tim Creech <timothy.m.creech at intel.com>
Date: Tue, 12 Mar 2024 09:14:27 -0400
Subject: [PATCH 2/5] fixup: improve sep usage example as suggested by Haohai
---
clang/docs/UsersManual.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 9cf313c3727125..46d7687a101323 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2453,9 +2453,9 @@ usual build cycle when using sample profilers for optimization:
.. code-block:: console
- $ sep -start -ec BR_INST_RETIRED.NEAR_TAKEN:precise=yes:pdir -lbr no_filter:usr -perf-script ip,brstack -app ./code
+ $ sep -start -out code.tb7 -ec BR_INST_RETIRED.NEAR_TAKEN:precise=yes:pdir -lbr no_filter:usr -perf-script brstack -app ./code
- This produces a ``perf.data.script`` output which can be used with
+ This produces a ``code.perf.data.script`` output which can be used with
``llvm-profgen``'s ``--perfscript`` input option.
3. Convert the collected profile data to LLVM's sample profile format.
>From a5e879ce5016fee9cf3109bb9fc7785c396ac509 Mon Sep 17 00:00:00 2001
From: Tim Creech <timothy.m.creech at intel.com>
Date: Tue, 12 Mar 2024 09:35:15 -0400
Subject: [PATCH 3/5] fixup: add suggested clarifications
---
clang/docs/UsersManual.rst | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 46d7687a101323..b5a063fa9ac3c2 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2420,7 +2420,9 @@ usual build cycle when using sample profilers for optimization:
$ clang++ -O2 -gline-tables-only code.cc -o code
- It is also possible to include DWARF in Windows binaries:
+ While MSVC-style targets default to CodeView debug information, DWARF debug
+ information is required to generate source-level LLVM profiles. Use
+ ``-gdwarf`` to include DWARF debug information:
.. code-block:: console
@@ -2434,9 +2436,11 @@ usual build cycle when using sample profilers for optimization:
(https://perf.wiki.kernel.org/) and Intel's Sampling Enabling Product (SEP),
available as part of `Intel VTune
<https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html>`_.
+ While Perf is Linux-specific, SEP can be used on Linux, Windows, and FreeBSD.
The LLVM tool ``llvm-profgen`` can convert output of either Perf or SEP. An
- external tool, AutoFDO, also supports Linux Perf output.
+ external project, `AutoFDO <https://github.com/google/autofdo>`_, also
+ provides a ``create_llvm_prof`` tool which supports Linux Perf output.
When using Perf:
@@ -2458,11 +2462,10 @@ usual build cycle when using sample profilers for optimization:
This produces a ``code.perf.data.script`` output which can be used with
``llvm-profgen``'s ``--perfscript`` input option.
-3. Convert the collected profile data to LLVM's sample profile format.
- This is currently supported via the AutoFDO converter ``create_llvm_prof``.
- It is available at https://github.com/google/autofdo. Once built and
- installed, you can convert the ``perf.data`` file to LLVM using
- the command:
+3. Convert the collected profile data to LLVM's sample profile format. This is
+ currently supported via the `AutoFDO <https://github.com/google/autofdo>`_
+ converter ``create_llvm_prof``. Once built and installed, you can convert
+ the ``perf.data`` file to LLVM using the command:
.. code-block:: console
>From 712688ee0081c7bf0fced6b5bcd59c09bbad6e29 Mon Sep 17 00:00:00 2001
From: Tim Creech <timothy.m.creech at intel.com>
Date: Tue, 12 Mar 2024 09:37:27 -0400
Subject: [PATCH 4/5] fixup: fix a rst syntax issue
---
clang/docs/UsersManual.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index b5a063fa9ac3c2..b83e6deac75e3f 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2484,7 +2484,7 @@ usual build cycle when using sample profilers for optimization:
$ llvm-profgen --binary=./code --output=code.prof --perfdata=perf.data
When using SEP the output is in the textual format corresponding to
- `llvm-profgen --perfscript`. For example:
+ ``llvm-profgen --perfscript``. For example:
.. code-block:: console
>From a562c41d3da37782b66a58a202dce9586ccb9a66 Mon Sep 17 00:00:00 2001
From: Tim Creech <timothy.m.creech at intel.com>
Date: Tue, 12 Mar 2024 11:02:35 -0400
Subject: [PATCH 5/5] fixup: fix filename
---
clang/docs/UsersManual.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index b83e6deac75e3f..d872d6d4d7c558 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2488,7 +2488,7 @@ usual build cycle when using sample profilers for optimization:
.. code-block:: console
- $ llvm-profgen --binary=./code --output=code.prof --perfscript=perf.data.script
+ $ llvm-profgen --binary=./code --output=code.prof --perfscript=code.perf.data.script
4. Build the code again using the collected profile. This step feeds
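
For reference, the two step-3 invocations documented in this series differ only in the input option. A minimal dry-run sketch of that choice follows. The helper name profgen_cmd and the rule of keying on a ".script" suffix are assumptions made for illustration and are not part of the patch; the llvm-profgen options themselves are the ones shown in the hunks above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): pick the llvm-profgen input
# option from the profile's file name.
# Assumption: SEP's textual output ends in ".script" (for example
# code.perf.data.script); anything else is treated as binary Linux Perf data.
profgen_cmd() {
    binary=$1
    profile=$2
    case "$profile" in
        *.script) input="--perfscript=$profile" ;;
        *)        input="--perfdata=$profile" ;;
    esac
    # Print the command rather than running it, so the sketch does not
    # need LLVM installed.
    echo "llvm-profgen --binary=$binary --output=code.prof $input"
}

profgen_cmd ./code perf.data
profgen_cmd ./code code.perf.data.script
```

The first call prints the Perf form of the command and the second prints the SEP form. To actually run the conversion, the printed line could be passed to the shell once the binary and profile exist.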