[PATCH] D136663: Handling ADD|SUB U64 decomposed Pseudos not getting lowered to SDWA form
Yashwant Singh via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 26 09:58:58 PDT 2022
yassingh added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/v_add_u64_pseudo_sdwa.ll:9-10
+; GFX9-NEXT: v_add_co_u32_sdwa v0, vcc, v1, v0 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_0
+; GFX9-NEXT: v_mov_b32_e32 v1, 0
+; GFX9-NEXT: v_addc_co_u32_e32 v1, vcc, 0, v1, vcc
+; GFX9-NEXT: global_store_dwordx2 v[0:1], v[0:1], off
----------------
foad wrote:
> This is silly. You have added a mov instruction to shrink the addc instruction for no reason, because it is not actually converted to sdwa form.
Yes, you are right, but the pass currently attempts the SDWA conversion only when both instructions (V_ADD_CO and V_ADDC_CO) can be shrunk to their SDWA forms, so I went this way.
Should I explore shrinking only the v_add_co instruction, which would avoid inserting that mov instruction?
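For reference, the arithmetic of the checked sequence above can be modeled with a minimal Python sketch. It is only an illustration of the add/add-with-carry split and of the BYTE_0 operand select. All function names are hypothetical and do not correspond to LLVM or ISA identifiers, and nothing here reflects how the pass itself is implemented.

```python
# Toy model of a 64-bit add split into two 32-bit adds linked by a carry.
# Function names are illustrative assumptions, not LLVM or ISA identifiers.

MASK32 = 0xFFFFFFFF


def add_co_u32(a, b):
    """Low half: 32-bit add that also returns the carry-out bit."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32


def addc_co_u32(a, b, carry_in):
    """High half: 32-bit add that consumes the carry from the low half."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def byte0(x):
    """Models the src1_sel:BYTE_0 operand select: keep only bits 7:0."""
    return x & 0xFF


def add_u64_with_byte_operand(v1, v0):
    """Mirror the checked sequence: lo = v1 + BYTE_0(v0), hi = 0 + 0 + carry."""
    lo, carry = add_co_u32(v1, byte0(v0))
    hi, _ = addc_co_u32(0, 0, carry)
    return (hi << 32) | lo
```

In this model only the low half reads a sub-dword operand. The high half just adds the carry to zero, which is the half that gains nothing from the extra mov in the test output.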
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136663/new/
https://reviews.llvm.org/D136663