[PATCH] D111646: [AMDGPU] Enable load clustering in the post-RA scheduler
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 12 07:49:35 PDT 2021
foad created this revision.
foad added reviewers: arsenm, rampitec, Joe_Nash.
Herald added subscribers: kerbowa, asbirlea, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl.
foad requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.
This has a couple of benefits:
1. It can sometimes fix clusters that got broken apart when the register allocator inserted a copy.
2. Post-RA scheduling does not have to worry about increasing register pressure, which in some cases gives it more freedom to reorder instructions.
Testing on a collection of 10,000 graphics shaders compiled for gfx1010 showed:
- The average length of each run of one or more load instructions increased by about 1%.
- The number of runs of two or more load instructions increased by about 4%.
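For readers who want to reproduce this kind of measurement, the two statistics above could be computed roughly as follows. This is only a sketch: the names `LoadRunStats` and `countLoadRuns`, and the input encoding (one boolean per instruction, true for a load), are hypothetical and are not part of the patch or of the tooling used for the numbers quoted here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Summary of load "runs" in one linear instruction sequence.
// All names here are illustrative assumptions, not LLVM APIs.
struct LoadRunStats {
  std::size_t numRuns = 0;     // runs of one or more consecutive loads
  std::size_t numClusters = 0; // runs of two or more consecutive loads
  std::size_t totalLoads = 0;  // loads summed over all runs

  // Average length of a run; 0 when the sequence contains no loads.
  double averageRunLength() const {
    return numRuns == 0 ? 0.0
                        : static_cast<double>(totalLoads) / numRuns;
  }
};

// isLoad[i] is true when instruction i is a load (assumed encoding).
LoadRunStats countLoadRuns(const std::vector<bool> &isLoad) {
  LoadRunStats stats;
  std::size_t current = 0; // length of the run currently being scanned

  // Record the current run, if any, and start a new one.
  auto closeRun = [&stats, &current]() {
    if (current == 0)
      return;
    ++stats.numRuns;
    stats.totalLoads += current;
    if (current >= 2)
      ++stats.numClusters;
    // Invariant: every recorded run contributes at least one load.
    assert(stats.totalLoads >= stats.numRuns);
    current = 0;
  };

  for (bool load : isLoad) {
    if (load)
      ++current;
    else
      closeRun(); // a non-load instruction ends the run
  }
  closeRun(); // the sequence may end in the middle of a run
  return stats;
}
```

Running this over the scheduled instruction stream before and after the change, and comparing `averageRunLength()` and `numClusters`, would give the two percentages in the same form as reported above.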
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D111646
Files:
llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll
llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
llvm/test/CodeGen/AMDGPU/idiv-licm.ll
llvm/test/CodeGen/AMDGPU/promote-constOffset-to-imm.ll
llvm/test/CodeGen/AMDGPU/sdiv64.ll
llvm/test/CodeGen/AMDGPU/srem64.ll
llvm/test/CodeGen/AMDGPU/udiv64.ll
llvm/test/CodeGen/AMDGPU/urem64.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D111646.379036.patch
Type: text/x-patch
Size: 8377 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211012/7eb745aa/attachment.bin>