[all-commits] [llvm/llvm-project] 0a0800: A post-processing for BFI inference

WenleiHe via All-commits all-commits at lists.llvm.org
Fri Jun 11 21:52:32 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 0a0800c4d10c250ffb152b5f059d6f9a19ed8efe
      https://github.com/llvm/llvm-project/commit/0a0800c4d10c250ffb152b5f059d6f9a19ed8efe
  Author: spupyrev <spupyrev at fb.com>
  Date:   2021-06-11 (Fri, 11 Jun 2021)

  Changed paths:
    M llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
    M llvm/lib/Analysis/BlockFrequencyInfoImpl.cpp
    A llvm/test/Transforms/SampleProfile/Inputs/profile-correlation-irreducible-loops.prof
    A llvm/test/Transforms/SampleProfile/profile-correlation-irreducible-loops.ll

  Log Message:
  -----------
  A post-processing for BFI inference

The current implementation for computing relative block frequencies does
not correctly handle control-flow graphs containing irreducible loops. This
results in suboptimally generated binaries, whose performance can be up to
5% worse than optimal.

To resolve the problem, we apply a post-processing step that iteratively
updates block frequencies based on the frequencies of their predecessors.
This corresponds to finding the stationary point of the Markov chain by
an iterative method, also known as "PageRank computation". The algorithm
takes at most O(|E| * IterativeBFIMaxIterations) steps but typically
converges faster.
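
As a rough illustration of the iterative update described above, here is a
minimal standalone sketch. It is not the code from this commit: the names
`Edge` and `inferFrequencies`, the default iteration cap and the precision
threshold are all hypothetical, and block 0 is assumed to be the function
entry receiving one unit of flow. Each sweep visits every edge once, so the
total cost is bounded by O(|E| * MaxIterations).

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CFG edge: control moves From -> To with probability Prob.
struct Edge {
  std::size_t From;
  std::size_t To;
  double Prob;
};

// Sketch of the fixed-point iteration: the frequency of a block is the sum of
// its predecessors' frequencies weighted by the branch probabilities.
// Assumes block 0 is the entry and that every loop has a nonzero exit
// probability, otherwise the iteration does not converge.
std::vector<double> inferFrequencies(std::size_t NumBlocks,
                                     const std::vector<Edge> &Edges,
                                     std::size_t MaxIterations = 1000,
                                     double Precision = 1e-12) {
  std::vector<double> Freq(NumBlocks, 0.0);
  if (NumBlocks == 0)
    return Freq;
  Freq[0] = 1.0;

  for (std::size_t Iter = 0; Iter < MaxIterations; ++Iter) {
    std::vector<double> Next(NumBlocks, 0.0);
    Next[0] = 1.0; // one unit of flow enters through the entry block

    // One sweep over all edges: push each block's frequency to its successors.
    for (const Edge &E : Edges)
      Next[E.To] += Freq[E.From] * E.Prob;

    // Track the largest per-block change to detect convergence.
    double MaxChange = 0.0;
    for (std::size_t I = 0; I < NumBlocks; ++I)
      MaxChange = std::max(MaxChange, std::fabs(Next[I] - Freq[I]));

    Freq = std::move(Next);
    if (MaxChange < Precision)
      break; // typically reached well before MaxIterations
  }
  return Freq;
}
```

For example, take an irreducible loop with two entries: block 0 branches to
blocks 1 and 2 with probability 0.5 each, block 1 jumps to block 2, and block
2 goes back to block 1 with probability 0.8 or exits to block 3 with
probability 0.2. Solving the flow equations by hand gives frequencies 1, 4.5,
5 and 1 for blocks 0 to 3, and the sketch converges to the same values.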

The step is enabled by passing the option `use-iterative-bfi-inference`
and is applied only to functions containing profile data and irreducible loops.

Tested on SPEC06/17, where it helps to get correct profile counts for one of
the binaries (403.gcc). In prod binaries, we've seen speedups of 2%-5%
for binaries containing functions with hot irreducible loops.

Reviewed By: hoy, wenlei, davidxl

Differential Revision: https://reviews.llvm.org/D103289



