[all-commits] [llvm/llvm-project] 856a6a: [CSSPGO][llvm-profgen] Trim and merge context befo...
ictwanglei via All-commits
all-commits at lists.llvm.org
Wed Aug 11 16:03:04 PDT 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 856a6a504165d6f46e9b29b463c19776db034794
https://github.com/llvm/llvm-project/commit/856a6a504165d6f46e9b29b463c19776db034794
Author: wlei <wlei at fb.com>
Date: 2021-08-11 (Wed, 11 Aug 2021)
Changed paths:
M llvm/test/tools/llvm-profgen/merge-cold-profile.test
M llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
M llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
M llvm/tools/llvm-profgen/PerfReader.cpp
M llvm/tools/llvm-profgen/ProfileGenerator.cpp
M llvm/tools/llvm-profgen/ProfileGenerator.h
M llvm/tools/llvm-profgen/ProfiledBinary.cpp
Log Message:
-----------
[CSSPGO][llvm-profgen] Trim and merge context beforehand to reduce memory usage
Currently we use a centralized string map (StringMap<FunctionSamples> ProfileMap) to store the profile while populating samples, which can become a memory bottleneck. In one extreme case I saw thousands of samples whose context stack depth was >= 100, and memory consumption can exceed 100GB.
Since the context here is used for inlining, we can assume that not so many inlinees will stay inlined under the same root function. This change therefore caps the context stack depth and merges the samples to reduce peak memory. The cap is applied after recursion compression.
The default depth limit is -1, meaning no limit; in the future we can tune it to a smaller value.
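To illustrate the idea, here is a minimal standalone sketch of capping a context and merging samples. It is not the code from this commit: Context, ProfileSketch, trimContext, contextKey and addSample are hypothetical names, a std::map of counts stands in for StringMap<FunctionSamples>, and the choice to keep the frames nearest the leaf is an assumption made for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a calling context: outermost caller first,
// leaf function last.
using Context = std::vector<std::string>;

// Hypothetical stand-in for the profile map: context key -> sample count.
using ProfileSketch = std::map<std::string, std::uint64_t>;

// Keep only the MaxDepth frames nearest the leaf (an assumption of this
// sketch). A negative MaxDepth means no depth limit.
Context trimContext(const Context &Ctx, int MaxDepth) {
  if (MaxDepth < 0 || Ctx.size() <= static_cast<std::size_t>(MaxDepth))
    return Ctx;
  return Context(Ctx.end() - MaxDepth, Ctx.end());
}

// Join the frames into a single string key.
std::string contextKey(const Context &Ctx) {
  std::string Key;
  for (std::size_t I = 0; I < Ctx.size(); ++I) {
    if (I != 0)
      Key += " @ ";
    Key += Ctx[I];
  }
  return Key;
}

// Trim before inserting, so contexts that differ only in the dropped
// frames share one entry and their counts are merged.
void addSample(ProfileSketch &Profile, const Context &Ctx,
               std::uint64_t Count, int MaxDepth) {
  Profile[contextKey(trimContext(Ctx, MaxDepth))] += Count;
}
```

With a limit of 2, samples for {main, foo, bar, baz} and {start, qux, bar, baz} land in the same entry, so the map holds one key instead of two. With a limit of -1 every context is stored in full.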
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D107800