[all-commits] [llvm/llvm-project] b1ca2a: [PGO] Sampled instrumentation in PGO to speed up i...
xur-llvm via All-commits
all-commits at lists.llvm.org
Mon Jul 22 09:19:39 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: b1ca2a954643d2c07d5308297d1f2b911f794ba4
https://github.com/llvm/llvm-project/commit/b1ca2a954643d2c07d5308297d1f2b911f794ba4
Author: xur-llvm <59886942+xur-llvm at users.noreply.github.com>
Date: 2024-07-22 (Mon, 22 Jul 2024)
Changed paths:
M llvm/include/llvm/ProfileData/InstrProfData.inc
M llvm/include/llvm/Transforms/Instrumentation.h
M llvm/include/llvm/Transforms/Instrumentation/PGOInstrumentation.h
M llvm/lib/Passes/PassBuilderPipelines.cpp
M llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
M llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
A llvm/test/Transforms/PGOProfile/Inputs/cspgo_bar_sample.ll
A llvm/test/Transforms/PGOProfile/counter_promo_sampling.ll
A llvm/test/Transforms/PGOProfile/cspgo_sample.ll
A llvm/test/Transforms/PGOProfile/instrprof_burst_sampling_fast.ll
A llvm/test/Transforms/PGOProfile/instrprof_burst_sampling_full.ll
A llvm/test/Transforms/PGOProfile/instrprof_burst_sampling_full_intsize.ll
A llvm/test/Transforms/PGOProfile/instrprof_simple_sampling.ll
Log Message:
-----------
[PGO] Sampled instrumentation in PGO to speed up instrumentation binary (#69535)
In comparison to non-instrumented binaries, PGO instrumentation binaries
can be significantly slower. For highly threaded programs, this slowdown
can reach 10x due to data races or false sharing within counters.
This patch incorporates sampling into the PGO instrumentation process to
enhance the speed of instrumentation binaries. The fundamental concept
is similar to the one proposed in https://reviews.llvm.org/D63949.
Three sampling modes are introduced:
1. Simple Sampling: When '-sampled-instr-burst-duration' is set to 1.
2. Fast Burst Sampling: When not using simple sampling, and
'-sampled-instr-period' is set to 65535. This is the default mode of
sampling.
3. Full Burst Sampling: When neither simple nor fast burst sampling is
used.
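The mode selection above can be sketched as follows. This is only an
illustrative model, not the code from the patch: the enum, the function
names, and the counter logic in recordedUpdates are hypothetical. It
assumes that burst sampling records the first "burst duration" visits in
every "period" visits to an instrumented site. Only the two option
values (1 and 65535) and the three mode names come from the log message.

```cpp
#include <cstdint>

// Hypothetical names, used for illustration only.
enum class SamplingMode { Simple, FastBurst, FullBurst };

// Mirrors the three cases listed in the log message.
constexpr SamplingMode selectSamplingMode(std::uint32_t burstDuration,
                                          std::uint32_t period) {
  if (burstDuration == 1)
    return SamplingMode::Simple;    // burst duration set to 1
  if (period == 65535)
    return SamplingMode::FastBurst; // period set to 65535 (the default mode)
  return SamplingMode::FullBurst;   // neither of the above
}

// Assumed model of burst sampling: out of every `period` visits, only the
// first `burstDuration` visits update the profile counter. Returns how
// many of `executions` visits are recorded. Assumes period > 0.
constexpr std::uint64_t recordedUpdates(std::uint64_t executions,
                                        std::uint32_t burstDuration,
                                        std::uint32_t period) {
  std::uint64_t recorded = 0;
  std::uint32_t sample = 0; // position inside the current period
  for (std::uint64_t i = 0; i < executions; ++i) {
    if (sample < burstDuration)
      ++recorded;           // inside the burst window: count this visit
    if (++sample >= period)
      sample = 0;           // start the next period
  }
  return recorded;
}
```

Under this model, a burst duration of 1 with a period of 100 records 10
of 1000 visits, which is why sampling reduces contention on the shared
counters in threaded programs.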
Using this sampled instrumentation significantly improves the binary's
execution speed. Measurements show up to 5x speedup with default
settings. Fast burst sampling now results in only around 20% to 30%
slowdown (compared to 8x to 10x slowdown without sampling).
Our tests show that profile quality remains good with sampling,
with edge counts typically showing more than 90% overlap.
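The log message does not say how overlap is measured. One common
definition, assumed here purely for illustration, normalizes each
profile's edge counts to fractions of its own total and sums the
per-edge minimum, so identical distributions score 1.0 and disjoint
ones score 0.0. The function name edgeOverlap is hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed overlap metric: sum over edges of min(p_i, q_i), where p_i and
// q_i are each profile's edge counts divided by that profile's total.
template <std::size_t N>
constexpr double edgeOverlap(const std::array<std::uint64_t, N>& full,
                             const std::array<std::uint64_t, N>& sampled) {
  std::uint64_t fullTotal = 0;
  std::uint64_t sampledTotal = 0;
  for (std::size_t i = 0; i < N; ++i) {
    fullTotal += full[i];
    sampledTotal += sampled[i];
  }
  if (fullTotal == 0 || sampledTotal == 0)
    return 0.0; // an empty profile shares nothing with the other one

  double overlap = 0.0;
  for (std::size_t i = 0; i < N; ++i) {
    const double p = static_cast<double>(full[i]) / static_cast<double>(fullTotal);
    const double q = static_cast<double>(sampled[i]) / static_cast<double>(sampledTotal);
    overlap += std::min(p, q);
  }
  return overlap;
}
```

Because both profiles are normalized first, a sampled profile with much
smaller absolute counts can still reach a high overlap as long as the
relative edge frequencies are preserved.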
For applications whose behavior changes due to binary speed,
sampled instrumentation can also improve the resulting PGO
performance. Some applications have shown up to a ~2% improvement.
A potential drawback of this patch is the increased binary size
and compilation time. The sampling method in this patch does
not improve the speed of instrumented binaries for single-threaded
programs.