[PATCH] D114252: [AMDGPU] Only select VOP3 forms of VOP2 instructions

Fri Nov 19 08:56:22 PST 2021

foad added reviewers: arsenm, rampitec, alex-t.
foad added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/ctpop16.ll:776
 ; VI-NEXT:    v_lshlrev_b32_e32 v0, 5, v0
-; VI-NEXT:    v_mov_b32_e32 v8, 0xffff
 ; VI-NEXT:    s_mov_b32 s7, 0xf000
----------------
Small win here: the v_mov_b32_e32 that materialized 0xffff in v8 is no longer emitted.

================
Comment at: llvm/test/CodeGen/AMDGPU/flat-scratch.ll:505
+; GFX9-NEXT:    v_mov_b32_e32 v3, 15
+; GFX9-NEXT:    v_and_b32_e32 v0, 15, v0
 ; GFX9-NEXT:    scratch_store_dword v2, v3, off
----------------
The constant 15 has been folded into the v_and_b32_e32 as an immediate operand here, which I think is good, even though it didn't save any instructions or registers.

================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wqm.demote.ll:164
 ; SI-NEXT:    v_or_b32_e32 v0, v0, v1
-; SI-NEXT:    v_and_b32_e32 v1, 1, v0
 ; SI-NEXT:    v_and_b32_e32 v0, 1, v0
----------------
Small win here: the redundant v_and_b32_e32 writing v1 is gone.

================
Comment at: llvm/test/CodeGen/AMDGPU/sdwa-peephole.ll:576
 ;
-; NOSDWA: s_mov_b32 [[CONST:s[0-9]+]], 0x10000
-; NOSDWA: v_or_b32_e32 v{{[0-9]+}}, s0,
+; NOSDWA: v_or_b32_e32 v{{[0-9]+}}, 0x10000,
 ; SDWA: v_or_b32_e32 v{{[0-9]+}}, 0x10000,
----------------
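More folding here: in the NOSDWA case, 0x10000 is now folded into the v_or_b32_e32 as a literal instead of being materialized with a separate s_mov_b32, which matches the SDWA output.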

================
Comment at: llvm/test/CodeGen/AMDGPU/ssubsat.ll:616
 ; GFX6-NEXT:    v_sub_i32_e64 v8, s[4:5], v3, v11
-; GFX6-NEXT:    v_bfrev_b32_e32 v16, 1
 ; GFX6-NEXT:    v_cmp_lt_i32_e32 vcc, 0, v11
----------------
Small win here: the v_bfrev_b32_e32 that materialized a constant in v16 is no longer emitted at this point.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114252/new/

https://reviews.llvm.org/D114252