[PATCH] D136663: Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form

Fri Nov 11 03:18:33 PST 2022

foad added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp:904

-  MISucc.eraseFromParent();
+  MISucc.substituteRegister(CarryIn->getReg(), AMDGPU::VCC, 0, *TRI);
 }
----------------
yassingh wrote:
> foad wrote:
> > Doesn't this need to be VCC_LO for wave32? Please add a test for that. You can use SIRegisterInfo::getVCC() to get the appropriate reg for the wave size.
> Updated to TRI->getVCC(). However, I am not able to add a test for VCC_LO. I tried compiling for gfx1010 with wavefrontsize=32, but SIInstrInfo::canShrink returns false for V_ADD_CO_U32_e64, so the pass does not attempt to convert it to SDWA form.
> 
> The responsible condition in SIInstrInfo::canShrink() is: if (!hasVALU32BitEncoding(MI.getOpcode())) return false;
Oh, you're right: GFX10/11 V_ADD_CO_U32 does not have an e32 or SDWA form. Sorry.
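
For reference, the two points discussed above can be sketched as a small standalone model. This is only an illustration, not the actual LLVM code: Reg, SubtargetModel, getVCCFor and canShrinkModel are hypothetical stand-ins for AMDGPU::VCC / AMDGPU::VCC_LO, the subtarget, SIRegisterInfo::getVCC() and the quoted check in SIInstrInfo::canShrink().

```cpp
#include <cassert>

// Hypothetical stand-in for the AMDGPU::VCC / AMDGPU::VCC_LO registers.
enum class Reg { VCC, VCC_LO };

// Hypothetical stand-in for the subtarget's wave-size query.
struct SubtargetModel {
  bool IsWave32;
};

// Models the idea behind SIRegisterInfo::getVCC(): wave32 uses the 32-bit
// VCC_LO as its carry/condition register, wave64 uses the full 64-bit VCC.
Reg getVCCFor(const SubtargetModel &ST) {
  return ST.IsWave32 ? Reg::VCC_LO : Reg::VCC;
}

// Models only the bail-out quoted from SIInstrInfo::canShrink(): an opcode
// with no 32-bit VALU encoding is rejected. Assumption: every other check
// in the real function passes.
bool canShrinkModel(bool HasVALU32BitEncoding) {
  if (!HasVALU32BitEncoding)
    return false;
  return true;
}
```

In this model, a wave32 target whose V_ADD_CO_U32 has no 32-bit encoding is rejected by canShrinkModel before the carry register is ever substituted, which matches why the VCC_LO path could not be exercised by a test.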

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136663/new/

https://reviews.llvm.org/D136663