[PATCH] D114252: [AMDGPU] Only select VOP3 forms of VOP2 instructions
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 19 08:56:22 PST 2021
foad added reviewers: arsenm, rampitec, alex-t.
foad added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/ctpop16.ll:776
; VI-NEXT: v_lshlrev_b32_e32 v0, 5, v0
-; VI-NEXT: v_mov_b32_e32 v8, 0xffff
; VI-NEXT: s_mov_b32 s7, 0xf000
----------------
Small win here: the v_mov_b32 materializing 0xffff is no longer needed.
================
Comment at: llvm/test/CodeGen/AMDGPU/flat-scratch.ll:505
+; GFX9-NEXT: v_mov_b32_e32 v3, 15
+; GFX9-NEXT: v_and_b32_e32 v0, 15, v0
; GFX9-NEXT: scratch_store_dword v2, v3, off
----------------
The 15 has been folded into the v_and_b32 here, which I think is good, even though it doesn't save any instructions or registers.
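For context, a rough sketch of why 15 can appear directly as an operand of v_and_b32_e32: on AMDGPU, integers in the range -16..64 are inline constants that are encoded in the source operand itself, whereas values such as 0xffff or 0x10000 need a separate 32-bit literal. The helper name `is_inline_int` is hypothetical and only covers the integer case; it ignores the inline floating-point constants and is not the check the backend actually uses.

```python
def is_inline_int(value: int) -> bool:
    """Return True if a 32-bit value falls in the integer inline-constant range.

    Illustrative only: covers the -16..64 integer range and nothing else.
    """
    # Reinterpret the low 32 bits as a signed integer.
    value &= 0xFFFFFFFF
    if value >= 0x80000000:
        value -= 0x100000000
    return -16 <= value <= 64


# Immediates that show up in the test diffs in this review.
for imm in (15, 1, 0xFFFF, 0x10000):
    kind = "inline constant" if is_inline_int(imm) else "32-bit literal"
    print(f"{imm:#x}: {kind}")
```

Under this sketch, 15 and 1 count as inline constants, while 0xffff and 0x10000 are literals.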
================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wqm.demote.ll:164
; SI-NEXT: v_or_b32_e32 v0, v0, v1
-; SI-NEXT: v_and_b32_e32 v1, 1, v0
; SI-NEXT: v_and_b32_e32 v0, 1, v0
----------------
Small win here: the duplicate v_and_b32 is gone.
================
Comment at: llvm/test/CodeGen/AMDGPU/sdwa-peephole.ll:576
;
-; NOSDWA: s_mov_b32 [[CONST:s[0-9]+]], 0x10000
-; NOSDWA: v_or_b32_e32 v{{[0-9]+}}, s0,
+; NOSDWA: v_or_b32_e32 v{{[0-9]+}}, 0x10000,
; SDWA: v_or_b32_e32 v{{[0-9]+}}, 0x10000,
----------------
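More folding here: in the NOSDWA case, 0x10000 is now a literal operand of the v_or_b32 instead of going through an s_mov_b32.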
================
Comment at: llvm/test/CodeGen/AMDGPU/ssubsat.ll:616
; GFX6-NEXT: v_sub_i32_e64 v8, s[4:5], v3, v11
-; GFX6-NEXT: v_bfrev_b32_e32 v16, 1
; GFX6-NEXT: v_cmp_lt_i32_e32 vcc, 0, v11
----------------
Small win here: the v_bfrev_b32 is no longer needed.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114252/new/
https://reviews.llvm.org/D114252