[all-commits] [llvm/llvm-project] 856a6a: [CSSPGO][llvm-profgen] Trim and merge context befo...

Wed Aug 11 16:03:04 PDT 2021

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 856a6a504165d6f46e9b29b463c19776db034794
      https://github.com/llvm/llvm-project/commit/856a6a504165d6f46e9b29b463c19776db034794
  Author: wlei <wlei at fb.com>
  Date:   2021-08-11 (Wed, 11 Aug 2021)

  Changed paths:
    M llvm/test/tools/llvm-profgen/merge-cold-profile.test
    M llvm/test/tools/llvm-profgen/recursion-compression-noprobe.test
    M llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
    M llvm/tools/llvm-profgen/PerfReader.cpp
    M llvm/tools/llvm-profgen/ProfileGenerator.cpp
    M llvm/tools/llvm-profgen/ProfileGenerator.h
    M llvm/tools/llvm-profgen/ProfiledBinary.cpp

  Log Message:
  -----------
  [CSSPGO][llvm-profgen] Trim and merge context beforehand to reduce memory usage

Currently we use a centralized string map (StringMap<FunctionSamples> ProfileMap) to store the profile while populating the samples, which can become a memory usage bottleneck. In one extreme case, I saw thousands of samples whose context stack depth was >= 100, and memory consumption can exceed 100GB.

Since the context is used for inlining here, we can assume that not many inlinees will stay inlined into the same root function. This change therefore caps the context stack depth and merges the samples to reduce peak memory. This is done after recursion compression.

The default value is -1, meaning no depth limit. In the future we can tune it to a smaller value.
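
As a rough illustration of the idea (not the code in this commit), trimming each context before it is inserted lets samples whose contexts share the same trimmed key collapse into one map entry. The sketch below assumes the frames nearest the leaf are the ones kept; the names ContextStack, trimContext, makeKey and addSample, the " @ " key separator, and the std::map standing in for the StringMap are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the llvm-profgen types.
using ContextStack = std::vector<std::string>;      // root caller first, leaf last
using ProfileMap = std::map<std::string, uint64_t>; // context key -> sample count

// Keep at most MaxDepth frames by dropping the outermost callers.
// A negative MaxDepth means no depth limit.
void trimContext(ContextStack &Frames, int MaxDepth) {
  if (MaxDepth < 0 || Frames.size() <= static_cast<size_t>(MaxDepth))
    return;
  Frames.erase(Frames.begin(), Frames.end() - MaxDepth);
  assert(Frames.size() == static_cast<size_t>(MaxDepth));
}

// Join the frames into a single string key (assumed format).
std::string makeKey(const ContextStack &Frames) {
  std::string Key;
  for (size_t I = 0; I < Frames.size(); ++I) {
    if (I != 0)
      Key += " @ ";
    Key += Frames[I];
  }
  return Key;
}

// Trim first, then accumulate: contexts that become identical after
// trimming share one entry instead of each occupying their own.
void addSample(ProfileMap &Profiles, ContextStack Frames, uint64_t Count,
               int MaxDepth) {
  trimContext(Frames, MaxDepth);
  Profiles[makeKey(Frames)] += Count;
}
```

With a limit of 2, samples for main → a → b → c and main → x → b → c would both land under the key "b @ c", while a limit of -1 keeps them as two separate entries.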

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D107800