[PATCH] D143762: [AMDGPU] Enable whole wave register copy
Yashwant Singh via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 23 07:52:58 PDT 2023
yassingh added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SILowerPredicatedCopies.cpp:159
+ LLVM_DEBUG(dbgs() << "Updated " << MI << " to use COPY opcode");
+ MI.setDesc(TII->get(AMDGPU::COPY));
+ Changed |= true;
----------------
arsenm wrote:
> arsenm wrote:
> > arsenm wrote:
> > > yassingh wrote:
> > > > arsenm wrote:
> > > > > Do these need to gain an implicit exec use?
> > > > Do you mean we should add the implicit exec? In that case, SIFixVGPRCopies will take care?
> > > This isn't something that can be taken care of later. SIFixVGPRCopies is a horribly broken hack, the less we depend on it the better
> > Actually these split copies might have been the original problem which caused it to be added. Maybe we have a way to drop it now?
> That was the reasoning given in D28874 when it was added. How about as a next step we make sure all the VGPR splits end up with exec reads
Will look into it, don't have sufficient expertise to answer this right now. Adding @AMDGPU.
================
Comment at: llvm/lib/Target/AMDGPU/SILowerPredicatedCopies.cpp:159
+ LLVM_DEBUG(dbgs() << "Updated " << MI << " to use COPY opcode");
+ MI.setDesc(TII->get(AMDGPU::COPY));
+ Changed |= true;
----------------
yassingh wrote:
> arsenm wrote:
> > arsenm wrote:
> > > arsenm wrote:
> > > > yassingh wrote:
> > > > > arsenm wrote:
> > > > > > Do these need to gain an implicit exec use?
> > > > > Do you mean we should add the implicit exec? In that case, SIFixVGPRCopies will take care?
> > > > This isn't something that can be taken care of later. SIFixVGPRCopies is a horribly broken hack, the less we depend on it the better
> > > Actually these split copies might have been the original problem which caused it to be added. Maybe we have a way to drop it now?
> > That was the reasoning given in D28874 when it was added. How about as a next step we make sure all the VGPR splits end up with exec reads
> Will look into it, don't have sufficient expertise to answer this right now. Adding @AMDGPU.
I can try and see what comes. I can see 2 ways to go about it, add the exec operand here itself or, in tablegen definition of PRED_COPY. What do you suggest?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D143762/new/
https://reviews.llvm.org/D143762
More information about the llvm-commits
mailing list