[all-commits] [llvm/llvm-project] 418958: [InstCombine] Only perform one iteration

Mon Jul 31 01:57:08 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 41895843b5915bb78e9d02aa711fa10f7174db43
      https://github.com/llvm/llvm-project/commit/41895843b5915bb78e9d02aa711fa10f7174db43
  Author: Nikita Popov <npopov at redhat.com>
  Date:   2023-07-31 (Mon, 31 Jul 2023)

  Changed paths:
    M llvm/include/llvm/Transforms/InstCombine/InstCombine.h
    M llvm/lib/Passes/PassBuilder.cpp
    M llvm/lib/Passes/PassRegistry.def
    M llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
    M llvm/test/Analysis/ValueTracking/numsignbits-from-assume.ll
    M llvm/test/Other/new-pm-print-pipeline.ll
    M llvm/test/Transforms/InstCombine/constant-fold-iteration.ll
    M llvm/test/Transforms/InstCombine/merging-multiple-stores-into-successor.ll
    M llvm/test/Transforms/InstCombine/pr55228.ll
    M llvm/test/Transforms/InstCombine/shift.ll
    M llvm/test/Transforms/PGOProfile/chr.ll
    M llvm/test/Transforms/PhaseOrdering/AArch64/matrix-extract-insert.ll

  Log Message:
  -----------
  [InstCombine] Only perform one iteration

InstCombine is a worklist-driven algorithm, which works roughly
as follows:

* All instructions are initially pushed to the worklist.
  The initial order is in RPO program order.
* All newly inserted instructions get added to the worklist.
* When an instruction is folded, its users get added back to the
  worklist.
* When the use-count of an instruction decreases, it gets added
  back to the worklist.
* And a few of other heuristics on when we should revisit
  instructions.

On top of the worklist algorithm, InstCombine layers an additional
fix-point iteration: If any fold was performed in the previous
iteration, then InstCombine will re-populate the worklist from
scratch and fold the entire function again. This continues until
a fix-point is reached.

In the vast majority of cases, InstCombine will reach a fix-point
within a single iteration: However, a second iteration is performed
to verify that this is indeed the fixpoint. We can see this in the
statistics for llvm-test-suite:

    "instcombine.NumOneIteration": 411380,
    "instcombine.NumTwoIterations": 117921,
    "instcombine.NumThreeIterations": 236,
    "instcombine.NumFourOrMoreIterations": 2,

The way to read these numbers is that in 411380 cases, InstCombine
performs no folds. In 117921 cases it performs a fold and reaches
the fix-point within one iteration (the second iteration verifies
the fixpoint). In the remaining 238 cases, more than one iteration
is needed to reach the fixpoint.

In other words, only in 0.04% of cases are additional iterations
needed to reach a fixpoint. Conversely, in 22.3% of cases InstCombine
performs a completely useless extra iteration to verify the fix point.

This patch removes the fixpoint iteration from InstCombine, and always
only perform a single iteration. This results in a major compile-time
improvement of around 4% at negligible codegen impact.

This explicitly does accept that we will not reach a fixpoint in all
cases. However, this is mitigated by two factors: First, the data
suggests that this happens very rarely in practice. Second,
InstCombine runs many times during the optimization pipeline
(8 times even without LTO), so there are many chances to recover
such cases.

In order to prevent accidental optimization regressions in the
future, this implements a verify-fixpoint option, which is enabled
by default when instcombine is specified in -passes and disabled
when InstCombinePass() is constructed from C++. This means that
test cases need to explicitly use the no-verify-fixpoint option
if they fail to reach a fixed point (for a well understand reason
we cannot / do not want to avoid).

Differential Revision: https://reviews.llvm.org/D154579