[PATCH] D136663: Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 11 03:18:33 PST 2022
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp:904
- MISucc.eraseFromParent();
+ MISucc.substituteRegister(CarryIn->getReg(), AMDGPU::VCC, 0, *TRI);
}
----------------
yassingh wrote:
> foad wrote:
> > Doesn't this need to be VCC_LO for wave32? Please add a test for that. You can use SIRegisterInfo::getVCC() to get the appropriate reg for the wave size.
> Updated to TRI->getVCC(). However I am not able to add test for VCC_LO. Tried compiling for gfx1010, wavefrontsize=32 but SIInstrInfo::canShrink returns false for V_ADD_CO_U32_e64 hence the pass does not attempt converting it to sdwa form.
>
> Responsible condition in SIInstrInfo::canShrink() => // if (!hasVALU32BitEncoding(MI.getOpcode())) return false;//
Oh you're right, GFX10/11 V_ADD_CO_U32 does not have an e32 or sdwa form. Sorry.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136663/new/
https://reviews.llvm.org/D136663
More information about the llvm-commits
mailing list