[all-commits] [llvm/llvm-project] 30b023: [CSSPGO][llvm-profgen] Context-sensitive global pr...

WenleiHe via All-commits all-commits at lists.llvm.org
Mon Mar 29 09:54:24 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 30b023233696f044427d6c3ae6c0e290e3ef1aa0
      https://github.com/llvm/llvm-project/commit/30b023233696f044427d6c3ae6c0e290e3ef1aa0
  Author: Wenlei He <wenlei at fb.com>
  Date:   2021-03-29 (Mon, 29 Mar 2021)

  Changed paths:
    A llvm/include/llvm/Transforms/IPO/ProfiledCallGraph.h
    M llvm/include/llvm/Transforms/IPO/SampleContextTracker.h
    M llvm/lib/Transforms/IPO/SampleContextTracker.cpp
    M llvm/lib/Transforms/IPO/SampleProfile.cpp
    A llvm/test/tools/llvm-profgen/cs-preinline.test
    M llvm/tools/llvm-profgen/CMakeLists.txt
    A llvm/tools/llvm-profgen/CSPreInliner.cpp
    A llvm/tools/llvm-profgen/CSPreInliner.h
    M llvm/tools/llvm-profgen/ProfileGenerator.cpp
    M llvm/tools/llvm-profgen/ProfileGenerator.h

  Log Message:
  -----------
  [CSSPGO][llvm-profgen] Context-sensitive global pre-inliner

This change sets up a framework in llvm-profgen to estimate inline decisions and adjust the context-sensitive profile accordingly. We call it a global pre-inliner in llvm-profgen.

It will serve two purposes:
  1) Since the context profile for a context that is not inlined will be merged into the base profile, if we estimate that a context will not be inlined, we can merge its context profile in the output to reduce profile size.
  2) For ThinLTO, when a context involving functions from different modules is not inlined, we can't merge function profiles across modules, leading to suboptimal post-inline count quality. By estimating some inline decisions, we would be able to adjust/merge context profiles beforehand as a mitigation.
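
As a rough illustration of the merging described in 1), here is a minimal sketch. It is not the code in this commit: the Context and ContextProfiles types, the mergeNotInlinedContexts name, and the single sample count per context are hypothetical simplifications (real CSSPGO profiles carry per-line and per-callsite samples). It requires C++17.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model: a context is the call chain leading to a
// function (outermost caller first, leaf function last), and each context
// profile is reduced to one total sample count.
using Context = std::vector<std::string>;
using ContextProfiles = std::map<Context, uint64_t>;

// Fold every context that is estimated to be NOT inlined into the base
// (single-element) profile of its leaf function. Contexts estimated to be
// inlined are kept as they are.
ContextProfiles mergeNotInlinedContexts(
    const ContextProfiles &Profiles,
    const std::function<bool(const Context &, uint64_t)> &EstimateInlined) {
  ContextProfiles Merged;
  for (const auto &[Ctx, Samples] : Profiles) {
    if (Ctx.empty())
      continue; // Nothing to attribute the samples to.
    bool IsBase = Ctx.size() == 1;
    if (IsBase || !EstimateInlined(Ctx, Samples))
      Merged[Context{Ctx.back()}] += Samples; // Merge into the base profile.
    else
      Merged[Ctx] += Samples; // Keep the context-sensitive profile.
  }
  return Merged;
}
```

Every context folded this way removes one entry from the output, which is where the profile size saving comes from.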

The compiler's inline heuristic uses inline cost, which is not available in llvm-profgen. But since inline cost is closely related to size, we can estimate it from the function size recorded in debug info. Because the size we have in llvm-profgen is the final size, it could also be more accurate than the inline cost estimation in the compiler.

This change only has the framework, with a few TODOs left for follow-up patches for a complete implementation:
  1) We need to retrieve the size of each function/inlinee from debug info for inlining estimation. Currently we use the number of samples in a profile as a placeholder for the size estimate.
  2) Currently the thresholds are the values used by the sample loader inliner. But they need to be tuned, since the size here is fully optimized machine code size, rather than an inline cost based on not yet fully optimized IR.
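
To make the placeholder estimate concrete, the following sketch shows one way a size-versus-threshold decision could look. It is an assumption-laden illustration, not the logic in CSPreInliner.cpp: InlineCandidate, Thresholds, shouldInline, and all field names are hypothetical, and the threshold values are supplied by the caller rather than taken from the sample loader.

```cpp
#include <cstdint>

// Hypothetical inline candidate. SizeEstimate stands in for callee size; per
// the TODO above, it would currently be the number of samples in the callee
// profile rather than a size read from debug info.
struct InlineCandidate {
  uint64_t CallsiteSamples; // Samples attributed to the call site.
  uint64_t SizeEstimate;    // Placeholder size of the callee.
};

// Hypothetical thresholds. The values are illustrative inputs, not the ones
// used by the sample loader inliner.
struct Thresholds {
  uint64_t HotCallsiteSamples; // Call sites at or above this count are hot.
  uint64_t HotSizeLimit;       // Size limit applied to hot call sites.
  uint64_t ColdSizeLimit;      // Size limit applied to all other call sites.
};

// Estimate the inline decision: hot call sites tolerate a larger callee.
bool shouldInline(const InlineCandidate &C, const Thresholds &T) {
  uint64_t Limit = C.CallsiteSamples >= T.HotCallsiteSamples ? T.HotSizeLimit
                                                             : T.ColdSizeLimit;
  return C.SizeEstimate <= Limit;
}
```

Keeping the size estimate as a plain field means the placeholder can later be swapped for a debug-info-based size without changing the decision function, although the limits would still need retuning as noted in 2).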

Differential Revision: https://reviews.llvm.org/D99146



