[all-commits] [llvm/llvm-project] 5d7950: [CSSPGO][llvm-profgen] Missing frame inference.

Hongtao Yu via All-commits all-commits at lists.llvm.org
Fri Dec 16 08:45:14 PST 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 5d7950a403bec25e52d4d0ba300b009877073c05
  Author: Hongtao Yu <hoy at fb.com>
  Date:   2022-12-16 (Fri, 16 Dec 2022)

  Changed paths:
    A llvm/test/tools/llvm-profgen/Inputs/noinline-tailcall-probe.perfbin
    A llvm/test/tools/llvm-profgen/Inputs/noinline-tailcall-probe.perfscript
    A llvm/test/tools/llvm-profgen/cs-tailcall.test
    M llvm/tools/llvm-profgen/CMakeLists.txt
    A llvm/tools/llvm-profgen/MissingFrameInferrer.cpp
    A llvm/tools/llvm-profgen/MissingFrameInferrer.h
    M llvm/tools/llvm-profgen/ProfileGenerator.cpp
    M llvm/tools/llvm-profgen/ProfileGenerator.h
    M llvm/tools/llvm-profgen/ProfiledBinary.cpp
    M llvm/tools/llvm-profgen/ProfiledBinary.h

  Log Message:
  [CSSPGO][llvm-profgen] Missing frame inference.

This change introduces a missing frame inferrer that recovers frames missing from sampled call stacks. It currently handles only frames lost to compiler tail call elimination (TCE), but could be extended to support other scenarios such as frame pointer omission. When a tail-called function is sampled, the caller frame is missing from the call chain because the caller's frame is reused for the callee. While TCE benefits performance and reduces the risk of stack overflow, the workaround in this change tries to recover as many of the missing frames as possible.

The idea is to build a dynamic call graph consisting only of tail call edges constructed from LBR samples, and then to DFS-search that graph for a unique path between a given source frame and target frame. The unique path is used to fill in the missing frames between the source and target. Note that only a unique path counts: multiple paths are treated as unreachable, since we don't want to overcount any particular possible path.
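
As a rough illustration of the unique-path search described above, here is a minimal Python sketch. It is not the code in MissingFrameInferrer.cpp: the function name find_unique_path, the dict-of-lists graph, and the use of function names as nodes are all assumptions made for the example, and the real tool may handle cycles and search limits differently.

```python
from typing import Dict, List, Optional, Set


def find_unique_path(
    tail_calls: Dict[str, List[str]], source: str, target: str
) -> Optional[List[str]]:
    """Return the only simple path from source to target, or None.

    tail_calls maps a function to the functions it was seen tail-calling
    (hypothetical representation of the tail call graph).
    None is returned both when no path exists and when more than one
    path exists, mirroring the "multiple paths are unreachable" rule.
    """
    found: List[List[str]] = []

    def dfs(node: str, path: List[str], on_path: Set[str]) -> None:
        if len(found) > 1:
            return  # a second path was already seen; stop searching
        if node == target:
            found.append(list(path))
            return
        for callee in tail_calls.get(node, []):
            if callee in on_path:
                continue  # skip cycles; only simple paths are considered
            path.append(callee)
            on_path.add(callee)
            dfs(callee, path, on_path)
            path.pop()
            on_path.remove(callee)

    dfs(source, [source], {source})
    return found[0] if len(found) == 1 else None


# Hypothetical tail call edges for illustration only.
edges = {
    "foo": ["bar"],
    "bar": ["baz"],
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
}

unique = find_unique_path(edges, "foo", "baz")    # one path: foo -> bar -> baz
ambiguous = find_unique_path(edges, "a", "d")     # two paths, so treated as unreachable
unreachable = find_unique_path(edges, "foo", "d") # no path at all
```

In this sketch the frames to fill in are the interior nodes of the returned path (here, "bar" for the foo-to-baz query). The search stops as soon as a second path is found, since at that point the result is already known to be ambiguous.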

A switch --infer-missing-frame is introduced and is on by default.

Some testing results:
- 0.4% perf win according to three internal benchmarks.
- About 2/3 of the missing tail call frames can be recovered, according to an internal benchmark.
- 10% more profile generation time.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D139367
