[PATCH] D136663: Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form

Yashwant Singh via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 10 21:59:40 PST 2022


yassingh added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp:904
 
-  MISucc.eraseFromParent();
+  MISucc.substituteRegister(CarryIn->getReg(), AMDGPU::VCC, 0, *TRI);
 }
----------------
foad wrote:
> Doesn't this need to be VCC_LO for wave32? Please add a test for that. You can use SIRegisterInfo::getVCC() to get the appropriate reg for the wave size.
Updated to use TRI->getVCC(). However, I have not been able to add a test for VCC_LO. I tried compiling for gfx1010 with wavefrontsize=32, but SIInstrInfo::canShrink() returns false for V_ADD_CO_U32_e64, so the pass does not attempt to convert it to SDWA form.

The responsible condition in SIInstrInfo::canShrink() is:

  if (!hasVALU32BitEncoding(MI.getOpcode())) return false;
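
To illustrate the wave-size point raised above, here is a minimal standalone sketch. It does not use LLVM headers: MockReg, MockRegisterInfo, hardCodedCarryReg and waveAwareCarryReg are hypothetical names made up for this example, and the mock only models the assumption that getVCC() yields VCC_LO in wave32 mode and VCC in wave64 mode.

```cpp
#include <cstdint>

// Hypothetical stand-ins for AMDGPU::VCC and AMDGPU::VCC_LO; the real
// values come from the AMDGPU register definitions.
enum class MockReg : std::uint8_t { VCC, VCC_LO };

// Mock of the register-info object. Only the wave-size-dependent choice
// of the carry register is modelled here.
struct MockRegisterInfo {
  bool IsWave32;

  // Assumed behaviour of SIRegisterInfo::getVCC(): the 32-bit low half
  // of VCC in wave32 mode, the full 64-bit VCC in wave64 mode.
  constexpr MockReg getVCC() const {
    return IsWave32 ? MockReg::VCC_LO : MockReg::VCC;
  }
};

// Earlier revision of the patch: VCC is hard-coded, so the wave size
// is ignored.
constexpr MockReg hardCodedCarryReg(const MockRegisterInfo &) {
  return MockReg::VCC;
}

// Updated revision: the register is taken from the register info, so it
// follows the wave size.
constexpr MockReg waveAwareCarryReg(const MockRegisterInfo &TRI) {
  return TRI.getVCC();
}
```

In this sketch the two helpers agree for wave64 and differ only for wave32, which is the case the requested test would have to cover.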


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136663/new/

https://reviews.llvm.org/D136663


