[PATCH] D136663: Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form

Thu Nov 10 21:59:40 PST 2022

yassingh added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp:904

-  MISucc.eraseFromParent();
+  MISucc.substituteRegister(CarryIn->getReg(), AMDGPU::VCC, 0, *TRI);
 }
----------------
foad wrote:
> Doesn't this need to be VCC_LO for wave32? Please add a test for that. You can use SIRegisterInfo::getVCC() to get the appropriate reg for the wave size.
Updated to TRI->getVCC(). However, I have not been able to add a test for VCC_LO. I tried compiling for gfx1010 with wavefrontsize=32, but SIInstrInfo::canShrink returns false for V_ADD_CO_U32_e64, so the pass never attempts to convert it to SDWA form.

The condition responsible in SIInstrInfo::canShrink() is:

  if (!hasVALU32BitEncoding(MI.getOpcode()))
    return false;
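
For readers following the wave-size point above, here is a minimal standalone sketch of the selection that TRI->getVCC() is being relied on for. It is not the LLVM code: `Reg`, `MockRegisterInfo`, and `regName` are hypothetical names invented for illustration, and the only assumption carried over from the review is that wave32 uses VCC_LO while wave64 uses the full VCC.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the AMDGPU register enumerators; these are
// NOT the real LLVM definitions, just labels for the two carry registers.
enum class Reg { VCC, VCC_LO };

// Hypothetical model of register info that knows the wavefront size.
struct MockRegisterInfo {
  bool IsWave32;

  // Sketch of the idea behind getVCC(): pick the 32-bit carry register
  // for wave32 and the 64-bit one for wave64, instead of hard-coding VCC.
  Reg getVCC() const { return IsWave32 ? Reg::VCC_LO : Reg::VCC; }
};

// Helper for printing or logging the chosen register (illustrative only).
inline std::string regName(Reg R) {
  return R == Reg::VCC_LO ? "VCC_LO" : "VCC";
}
```

With this model, a wave32 target substitutes VCC_LO for the carry-in operand and a wave64 target substitutes VCC, which is the behaviour the hard-coded AMDGPU::VCC in the earlier revision could not provide.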

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136663/new/

https://reviews.llvm.org/D136663