[PATCH] D156414: [AMDGPU] Break Large PHIs: Take whole PHI chains into account
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 27 03:16:01 PDT 2023
Pierre-vh created this revision.
Pierre-vh added a reviewer: arsenm.
Herald added subscribers: foad, kerbowa, hiraditya, tpr, dstuttard, yaxunl, jvesely, kzhuravl.
Herald added a project: All.
Pierre-vh requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.
Previous heuristics had a big flaw: they only looked at single PHI at a time, and didn't take into account the whole "chain".
The concept of "chain" is important because if we only break a chain partially, we risk forcing regalloc to reserve twice as many registers for that vector.
We also risk adding a lot of copies that shouldn't be there and can inhibit backend optimizations.
The solution I found is to consider the whole "PHI chain" when looking at PHI.
That is, we recursively look at the PHI's incoming value & users for other PHIs, then make a decision about the chain as a whole.
The currrent threshold requires that at least `ceil(chain size * (2/3))` PHIs have at least one interesting incoming value.
In simple terms, two-thirds (rounded up) of the PHIs should be breakable.
This seems to work well. A lower threshold such as 50% is too aggressive because chains can often have 7 or 9 PHIs, and breaking 3+ or 4+ PHIs in those case often causes performance issue.
Fixes SWDEV-409648, SWDEV-398393, SWDEV-413487
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D156414
Files:
llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-break-large-phis-heuristics.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D156414.544683.patch
Type: text/x-patch
Size: 52247 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230727/bdea5c02/attachment.bin>
More information about the llvm-commits
mailing list