[PATCH] D143762: [AMDGPU] Enable whole wave register copy
Christudasan Devadasan via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 6 08:46:20 PDT 2023
cdevadas added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:1356
+
+ addPass(&SILowerWWMCopiesID);
return true;
----------------
arsenm wrote:
> cdevadas wrote:
> > I'm still not convinced why this is needed in the -O0 flow?
> > By now, VGPR allocation is done in the -O0 flow, and we no longer have any virtual registers. This pass acts on virtual registers to see if WWM copies need exec manipulation.
> It's conceptually needed, and it's only an implementation detail of the current regalloc fast that these copies aren't introduced. Plus, I think in general we should have other WWM copies for general WWM support in the future.
ok
================
Comment at: llvm/lib/Target/AMDGPU/SILowerWWMCopies.cpp:111
+
+ if (!MFI->hasVRegFlags())
+ return false;
----------------
Also, do an early return if `MRI.getNumVirtRegs()` is zero. This avoids iterating over the whole function when there are no virtual registers at all, which covers the -O0 path where physical registers are already assigned.
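A minimal sketch of the guard suggested here, not the actual patch. `MockMFI`, `MockMRI`, and `shouldScanFunction` are hypothetical stand-ins so the snippet compiles without LLVM; only `hasVRegFlags()` and `getNumVirtRegs()` are names taken from the diff and the comment above. In the real pass the two checks would presumably sit together at the point shown in the diff context.

```cpp
// Hypothetical stand-in for MachineRegisterInfo (assumption, not LLVM code).
struct MockMRI {
  unsigned NumVirtRegs = 0;
  unsigned getNumVirtRegs() const { return NumVirtRegs; }
};

// Hypothetical stand-in for the machine function info queried in the diff.
struct MockMFI {
  bool VRegFlags = false;
  bool hasVRegFlags() const { return VRegFlags; }
};

// Returns true only when the pass would need to walk the function.
bool shouldScanFunction(const MockMFI &MFI, const MockMRI &MRI) {
  // Existing check from the diff: nothing to do without vreg flags.
  if (!MFI.hasVRegFlags())
    return false;

  // Suggested extra check: with no virtual registers left (for example
  // after register allocation at -O0), skip the scan entirely.
  if (MRI.getNumVirtRegs() == 0)
    return false;

  return true;
}
```

Both conditions are cheap queries, so checking them up front costs nothing compared with iterating over every instruction.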
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D143762/new/
https://reviews.llvm.org/D143762