[all-commits] [llvm/llvm-project] 52a2d0: [AMDGPU] Improve PHI-breaking heuristics in CGP

Mon May 15 00:16:37 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 52a2d07bb3bb42594aab957b0da2e1e911abab59
      https://github.com/llvm/llvm-project/commit/52a2d07bb3bb42594aab957b0da2e1e911abab59
  Author: pvanhout <pierre.vanhoutryve at amd.com>
  Date:   2023-05-15 (Mon, 15 May 2023)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
    M llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-break-large-phis-heuristics.ll

  Log Message:
  -----------
  [AMDGPU] Improve PHI-breaking heuristics in CGP

D147786 made the transform more conservative by adding heuristics,
which was a good idea. However, the transform got a bit
too conservative at times.

This caused a surprise in some rocRAND benchmarks because D143731 greatly helped a few of them.
For instance, a few xorwow-uniform tests saw a 30% performance boost after that patch, which was lost when D147786 landed.

This patch is an attempt at reaching a middle ground that makes
the pass a bit more permissive. It continues in the same spirit as
D147786 but makes the following changes:
- PHI users of a PHI node are now recursively checked. When loops are encountered, we consider the PHIs non-breakable. (Considering them breakable had a very negative effect in one app I tested.)
- `shufflevector` is now considered interesting, provided that it satisfies a few trivial checks.
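
As a rough illustration of the recursive check with loop handling described in the first item, here is a minimal standalone sketch. It does not use LLVM's classes and is not the code in AMDGPUCodeGenPrepare.cpp: `ToyPhi`, `ToyBreakability`, and `canBreak` are hypothetical names, and the rule that a PHI is breakable when it or one of its PHI users has an interesting incoming value is an assumption made for the example. The only part taken from the description above is that a PHI reached again through a loop is treated as non-breakable.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical toy model of a PHI node (not llvm::PHINode).
struct ToyPhi {
  bool HasInterestingIncoming;       // assumed outcome of the per-value checks
  std::vector<std::size_t> PhiUsers; // indices of the PHIs that use this PHI
};

class ToyBreakability {
public:
  explicit ToyBreakability(std::vector<ToyPhi> Nodes)
      : Phis(std::move(Nodes)) {}

  // Returns true if the PHI at index Id is considered breakable under the
  // assumed rule: it has an interesting incoming value, or one of its PHI
  // users is itself breakable.
  bool canBreak(std::size_t Id) {
    // Record a pessimistic "false" before recursing. If a loop leads back
    // to this PHI, the recursive call finds this entry and treats the PHI
    // as non-breakable instead of recursing forever.
    auto Res = Cache.insert(std::make_pair(Id, false));
    if (!Res.second)
      return Res.first->second;

    bool Result = Phis[Id].HasInterestingIncoming;
    for (std::size_t User : Phis[Id].PhiUsers) {
      if (Result)
        break;
      Result = canBreak(User);
    }

    // Look the entry up again: recursive inserts may have rehashed the map
    // and invalidated the iterator obtained above.
    Cache[Id] = Result;
    return Result;
  }

private:
  std::vector<ToyPhi> Phis;
  std::unordered_map<std::size_t, bool> Cache;
};
```

In this sketch, two PHIs that only use each other and have no interesting incoming values both come out as non-breakable, while a PHI whose PHI user has an interesting incoming value comes out as breakable. Seeding the cache with "false" is the conservative choice: a PHI is only reported breakable once some check positively says so.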

Reviewed By: arsenm, #amdgpu, jmmartinez

Differential Revision: https://reviews.llvm.org/D150266