[all-commits] [llvm/llvm-project] 52a2d0: [AMDGPU] Improve PHI-breaking heuristics in CGP
Pierre van Houtryve via All-commits
all-commits at lists.llvm.org
Mon May 15 00:16:37 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 52a2d07bb3bb42594aab957b0da2e1e911abab59
https://github.com/llvm/llvm-project/commit/52a2d07bb3bb42594aab957b0da2e1e911abab59
Author: pvanhout <pierre.vanhoutryve at amd.com>
Date: 2023-05-15 (Mon, 15 May 2023)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
M llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-break-large-phis-heuristics.ll
Log Message:
-----------
[AMDGPU] Improve PHI-breaking heuristics in CGP
D147786 made the transform more conservative by adding heuristics,
which was a good idea. However, the transform got a bit
too conservative at times.
This caused a surprise in some rocRAND benchmarks because D143731 had greatly helped a few of them.
For instance, a few xorwow-uniform tests saw a +30% boost in performance after that change, which was lost when D147786 landed.
This patch is an attempt at reaching a middle ground that makes
the pass a bit more permissive. It continues in the same spirit as
D147786 but makes the following changes:
- PHI users of a PHI node are now recursively checked. When loops are encountered, we consider the PHIs non-breakable. (Considering them breakable had a very negative effect in one app I tested.)
- `shufflevector` is now considered interesting, provided that it satisfies a few trivial checks.
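
The recursive check on PHI users described in the first item can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in AMDGPUCodeGenPrepare.cpp: `ToyPhi`, `HasInterestingIncoming`, `PhiUsers`, and `canBreak` are hypothetical names, and the per-PHI heuristic is reduced to a single boolean. It shows one way a loop among PHIs can end up treated as non-breakable: each PHI is recorded as "not breakable" before its PHI users are visited, so reaching it again through a cycle returns that pessimistic answer.

```cpp
#include <unordered_map>
#include <vector>

// Toy stand-in for a PHI node (hypothetical, not an LLVM type).
struct ToyPhi {
  bool HasInterestingIncoming; // assumed summary of the incoming-value heuristic
  std::vector<int> PhiUsers;   // indices of the PHIs that use this PHI
};

// Returns true if the PHI at index Id and every PHI reachable through its
// PHI users look breakable in this toy model.
bool canBreak(const std::vector<ToyPhi> &Phis, int Id,
              std::unordered_map<int, bool> &Cache) {
  // The first visit records "false". If recursion reaches this PHI again
  // before a conclusion is reached (a loop), the cached "false" is returned.
  auto Res = Cache.insert({Id, false});
  if (!Res.second)
    return Res.first->second;

  if (!Phis[Id].HasInterestingIncoming)
    return false;

  // Every PHI user must itself be breakable.
  for (int User : Phis[Id].PhiUsers)
    if (!canBreak(Phis, User, Cache))
      return false;

  // Look the entry up again: the recursion may have rehashed the map.
  Cache[Id] = true;
  return true;
}
```

In this sketch a straight chain of PHIs with interesting inputs is reported breakable, while two PHIs that use each other are not.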
Reviewed By: arsenm, #amdgpu, jmmartinez
Differential Revision: https://reviews.llvm.org/D150266