[PATCH] D143762: [AMDGPU] Enable whole wave register copy

Christudasan Devadasan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jul 6 08:46:20 PDT 2023


cdevadas added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:1356
+
+  addPass(&SILowerWWMCopiesID);
   return true;
----------------
arsenm wrote:
> cdevadas wrote:
> > I'm still not convinced why this is needed in the -O0 flow?
> > By now, the VGPR allocation is done in the -O0 flow, and we no longer have any virtual registers. This pass acts on virtual registers to see whether WWM copies need exec manipulation.
> It's conceptually needed, and it's an implementation detail of the current regalloc fast that these aren't introduced. Plus, I think in general we should have other WWM copies for general WWM support in the future.
ok


================
Comment at: llvm/lib/Target/AMDGPU/SILowerWWMCopies.cpp:111
+
+  if (!MFI->hasVRegFlags())
+    return false;
----------------
Also, return early if `MRI.getNumVirtRegs()` is zero. This avoids iterating over the whole function when there are no virtual registers at all, which covers the -O0 path, where physical registers are already assigned.
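
Roughly what I have in mind, as a self-contained sketch rather than the actual patch. `MockMRI`, `MockMFI`, `MockInstr`, `MockFunction` and `lowerWWMCopiesSketch` are hypothetical stand-ins and not the LLVM classes; only the method names `getNumVirtRegs()` and `hasVRegFlags()` mirror what is used in this review. The `InstrsVisited` counter exists only to show that the scan is skipped.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MachineRegisterInfo (assumption, not the LLVM type).
struct MockMRI {
  unsigned NumVirtRegs = 0;
  unsigned getNumVirtRegs() const { return NumVirtRegs; }
};

// Hypothetical stand-in for the function info queried by the pass.
struct MockMFI {
  bool AnyVRegFlags = false;
  bool hasVRegFlags() const { return AnyVRegFlags; }
};

// Hypothetical instruction: we only track whether it is a WWM copy.
struct MockInstr {
  bool IsWWMCopy;
};

struct MockFunction {
  MockMRI MRI;
  MockMFI MFI;
  std::vector<MockInstr> Instrs;
};

// Returns true if the function would be modified. InstrsVisited reports how
// many instructions the scan touched.
bool lowerWWMCopiesSketch(const MockFunction &MF, std::size_t &InstrsVisited) {
  InstrsVisited = 0;

  // Existing check from the patch: nothing to do without vreg flags.
  if (!MF.MFI.hasVRegFlags())
    return false;

  // Suggested check: no virtual registers left (e.g. after -O0 allocation),
  // so skip the walk over the function entirely.
  if (MF.MRI.getNumVirtRegs() == 0)
    return false;

  bool Changed = false;
  for (const MockInstr &MI : MF.Instrs) {
    ++InstrsVisited;
    if (MI.IsWWMCopy)
      Changed = true; // the real pass would adjust exec around the copy here
  }
  return Changed;
}
```

With zero virtual registers the function returns before the loop, so the cost on the -O0 path is two cheap queries.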


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143762/new/

https://reviews.llvm.org/D143762


