[PATCH] D136663: Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form
Yashwant Singh via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 10 21:59:40 PST 2022
yassingh added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp:904
- MISucc.eraseFromParent();
+ MISucc.substituteRegister(CarryIn->getReg(), AMDGPU::VCC, 0, *TRI);
}
----------------
foad wrote:
> Doesn't this need to be VCC_LO for wave32? Please add a test for that. You can use SIRegisterInfo::getVCC() to get the appropriate reg for the wave size.
Updated to TRI->getVCC(). However, I am not able to add a test for VCC_LO. I tried compiling for gfx1010 with wavefrontsize=32, but SIInstrInfo::canShrink returns false for V_ADD_CO_U32_e64, so the pass does not attempt to convert it to SDWA form.
The responsible condition in SIInstrInfo::canShrink() is: if (!hasVALU32BitEncoding(MI.getOpcode())) return false;
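For reference, here is a minimal standalone sketch of the wave-size-dependent register choice discussed above. It is not the LLVM code: the Reg enum, the free function getVCC(bool), and carryMaskBits() are hypothetical stand-ins, and the only behaviour assumed is that wave32 uses VCC_LO and wave64 uses VCC.

```cpp
#include <cassert>

// Hypothetical stand-ins for the AMDGPU register enumerators (not LLVM types).
enum class Reg { VCC, VCC_LO };

// Assumed behaviour of the wave-size query: wave32 keeps the carry mask in
// the 32-bit VCC_LO, wave64 keeps it in the full 64-bit VCC.
constexpr Reg getVCC(bool IsWave32) {
  return IsWave32 ? Reg::VCC_LO : Reg::VCC;
}

// Hypothetical helper: width in bits of the carry mask held in the register.
constexpr int carryMaskBits(Reg R) {
  return R == Reg::VCC_LO ? 32 : 64;
}

// Compile-time checks that the mask width matches the wave size.
static_assert(carryMaskBits(getVCC(true)) == 32, "wave32 uses a 32-bit mask");
static_assert(carryMaskBits(getVCC(false)) == 64, "wave64 uses a 64-bit mask");

// Runtime sanity check a caller could invoke once.
inline void checkCarryRegs() {
  assert(getVCC(true) == Reg::VCC_LO);
  assert(getVCC(false) == Reg::VCC);
}
```

Hard-coding VCC in the substitution corresponds to always taking the wave64 branch here, which is why the review asks for the wave-size-aware query instead.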
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136663/new/
https://reviews.llvm.org/D136663