[PATCH] D139603: [llvm-profdata] Add option to cap profile output size
Snehasish Kumar via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 20 16:30:50 PST 2022
snehasish added a comment.
In D139603#4000123 <https://reviews.llvm.org/D139603#4000123>, @huangjd wrote:
> The biggest challenge to compute the number of functions accurately is the compression in extbinary, because the compressed size is non-linear to the original size. Since profile samples and function names are written to different sections (and in CS profile the names are split into two sections and samples can also be split into two sections), there is no way to predict ahead the offset between them. Based on use cases, the current heuristic is under estimating how many functions to prune (and the last iteration typically converges to pruning 1 function) so it's unlikely to remove too many functions. (Note: Also tried using cubic equation for heuristic but that will remove too many functions, so the optimal heuristic is between O(n^2) and O(n^3))
I think it makes sense to start with straight-forward implementation and look for opportunities to refine it further in the future.
================
Comment at: llvm/include/llvm/ProfileData/SampleProfWriter.h:74
+ /// from llvm-profdata command line arguments. Ignore transient states (those
+ /// always being set by write() before use).
+ virtual void reset(std::unique_ptr<raw_ostream> &OS) {
----------------
It looks like derived classes must always call this function. Can you add a comment here for the future? Maybe something like `// This function must always be called by the overridden implementation`?
================
Comment at: llvm/lib/ProfileData/SampleProfWriter.cpp:55
+
+ template <typename value_type> void pwrite(value_type Val, size_t Offset) {
+ std::string StringBuf;
----------------
I think this should be ValueType based on the guidance here:
https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
================
Comment at: llvm/test/tools/llvm-profdata/output-size-limit.test:5
+RUN: llvm-profdata merge --sample --text --output-size-limit=212 %p/Inputs/sample-profile.proftext | FileCheck %s --check-prefix=TEST_TEXT1
+TEST_TEXT1-DAG: main:184019:0
+TEST_TEXT1-DAG: 4: 534
----------------
I don't think we should use *-DAG here since that means the test will pass if the lines are reordered. However, if we reorder the symbol line with its contents the text format will be incorrect.
I think the -DAG directive is useful if there is non-determinism in the text format where profile information for a whole symbol may appear before another symbol. In this case we should separate out the CHECKs like the example in [1]. Though we should fix non-deterministic output if this is the case.
[1] https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-dag-directive
================
Comment at: llvm/tools/llvm-profdata/llvm-profdata.cpp:981
+/// this class can be extended to satisfy such need.
+class FunctionPruningStrategy {
+ std::vector<NameFunctionSamples> Functions;
----------------
How about moving this to SampleProfileWriter so that we can use this API in other tooling where we don't invoke llvm-profdata?
================
Comment at: llvm/tools/llvm-profdata/llvm-profdata.cpp:1028
+ }
+ errs() << "Profile originally has " << OriginalFunctionCount
+ << " functions, reduced to " << ProfileMap.size() << " in "
----------------
Perhaps wrap this in DEBUG(), I'm not sure whether it's useful to have this message all the time.
================
Comment at: llvm/tools/llvm-profdata/llvm-profdata.cpp:1154
+ File.write(StringBuffer.data(), StringBuffer.size());
+ if ((EC = File.error()))
+ exitWithErrorCode(std::move(EC));
----------------
An if statement with initialization would make the intent clearer here:
```
if (EC = File.error(); EC) {
}
```
================
Comment at: llvm/tools/llvm-profdata/llvm-profdata.cpp:1301
+ cl::desc("Trim cold functions until profile size is below specified "
+ "limit in bytes. This uses a heursitic algorithm and functions "
+ "may be excessively trimmed"));
----------------
nit: Just heuristic instead of "heuristic algorithm"?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D139603/new/
https://reviews.llvm.org/D139603
More information about the llvm-commits
mailing list