[all-commits] [llvm/llvm-project] 8df759: [CodeGen] Fine tune MachineFunctionSplitPass (MFS)...

Mon Jul 10 16:02:36 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 8df75969ae7023b6ee527f9f9e0c8aaca02eee86
      https://github.com/llvm/llvm-project/commit/8df75969ae7023b6ee527f9f9e0c8aaca02eee86
  Author: Han Shen <shenhan at google.com>
  Date:   2023-07-10 (Mon, 10 Jul 2023)

  Changed paths:
    M llvm/lib/CodeGen/MachineFunctionSplitter.cpp
    M llvm/test/CodeGen/X86/machine-function-splitter.ll

  Log Message:
  -----------
  [CodeGen] Fine tune MachineFunctionSplitPass (MFS) for FSAFDO.

The original MFS work D85368 shows good performance improvement with
Instrumented FDO (iFDO). However, with AutoFDO or Flow-Sensitive AutoFDO
(FSAFDO) it does not show a performance gain. This is mainly because
those profiles are less accurate than the iFDO profile.

Over the past few months, we have been working to improve FSAFDO
profile quality, for example in D145171. With these improvements, MFS
now shows performance gains with FSAFDO profiles.

That being said, two minor changes need to be made:
1) An FS-AutoFDO profile generation pass needs to be added right before
   the MFS pass, and an FSAFDO profile load pass is needed when
   FS-AutoFDO is enabled and the MFS flag is present.
2) MFS only applies to hot functions, because we believe (and
   experiments also show) that FS-AutoFDO is more accurate for
   functions that have plenty of samples than for those with no or
   very few samples.
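
The hot-function restriction can be pictured with a small standalone
sketch. This is not the code from MachineFunctionSplitter.cpp: the
types FunctionInfo and ProfileKind, the helper names, and the fixed
HotCountThreshold are all hypothetical stand-ins, and the sketch
assumes that instrumented profiles keep their earlier behavior while
sample-based profiles are gated on hotness.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
enum class ProfileKind { None, Instrumented, Sample };

struct FunctionInfo {
  ProfileKind Kind;          // kind of profile that annotated the function
  std::uint64_t EntryCount;  // entry count taken from that profile
};

// Assumed fixed cutoff. A real compiler would derive hotness from a
// profile summary rather than from a hard-coded constant.
constexpr std::uint64_t HotCountThreshold = 1000;

// A function counts as hot when its entry count reaches the cutoff.
constexpr bool isHot(const FunctionInfo &F) {
  return F.EntryCount >= HotCountThreshold;
}

// Decide whether a function is a candidate for hot/cold splitting:
//  - no profile: never split,
//  - instrumented profile: split (assumed unchanged behavior),
//  - sample profile: split only if the function is hot.
constexpr bool isSplitCandidate(const FunctionInfo &F) {
  return F.Kind == ProfileKind::Instrumented ||
         (F.Kind == ProfileKind::Sample && isHot(F));
}
```

Under these assumptions, a sampled function with only a handful of
samples is left unsplit, since its block-level counts are the least
trustworthy, while a heavily sampled one remains eligible.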

With this improvement, we see a 1.2% performance improvement on the
clang benchmark, a 0.9% QPS improvement on our internal search
benchmark, and a 3%-5% improvement on an internal storage benchmark.

This is the first of the two patches that enable the improvement.

Reviewed By: wenlei, snehasish, xur

Differential Revision: https://reviews.llvm.org/D152399