[PATCH] D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3.
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 14 02:44:39 PST 2022
sebastian-ne added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/wqm.ll:385-386
; GFX10-W32-NEXT: v_mbcnt_hi_u32_b32 v0, -1, v0
-; GFX10-W32-NEXT: v_cmp_gt_u32_e32 vcc_lo, 16, v0
+; GFX10-W32-NEXT: v_cmpx_gt_u32_e64 16, v0
; GFX10-W32-NEXT: v_mov_b32_e32 v0, 0
; GFX10-W32-NEXT: s_cbranch_execz .LBB9_2
----------------
I think the v_cmpx instruction should be inserted at the place of the s_and_saveexec, not at the place of the v_cmp. Otherwise this v_mov gets executed with the wrong exec mask.
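To illustrate the concern, here is a rough Python model of a wave32 exec mask. It is only a sketch, not the backend's behaviour verbatim: the helper names are made up, exec is assumed to be all ones on entry, and the s_and_saveexec is assumed to sit right after the v_mov (it is not visible in the quoted check lines).

```python
WAVE_SIZE = 32
FULL_MASK = (1 << WAVE_SIZE) - 1  # assumption: all lanes active on entry


def compare_gt_16(v0, exec_mask):
    """Per-lane result of `16 > v0[lane]`; inactive lanes yield 0."""
    mask = 0
    for lane in range(WAVE_SIZE):
        if (exec_mask >> lane) & 1 and 16 > v0[lane]:
            mask |= 1 << lane
    return mask


def v_mov(dst, value, exec_mask):
    """Write `value` only to lanes that are enabled in exec."""
    return [value if (exec_mask >> lane) & 1 else dst[lane]
            for lane in range(WAVE_SIZE)]


def original_sequence(v0):
    """v_cmp -> v_mov -> s_and_saveexec (the code before the patch)."""
    exec_mask = FULL_MASK
    vcc = compare_gt_16(v0, exec_mask)  # v_cmp_gt_u32 vcc_lo, 16, v0
    v0 = v_mov(v0, 0, exec_mask)        # v_mov_b32 v0, 0 under the full mask
    exec_mask &= vcc                    # s_and_saveexec_b32 (saved copy omitted)
    return v0, exec_mask


def cmpx_at_v_cmp_position(v0):
    """v_cmpx placed where the v_cmp was, as in the updated check lines."""
    exec_mask = FULL_MASK
    exec_mask = compare_gt_16(v0, exec_mask)  # v_cmpx_gt_u32 16, v0 writes exec
    v0 = v_mov(v0, 0, exec_mask)              # v_mov_b32 now sees the narrowed mask
    return v0, exec_mask


if __name__ == "__main__":
    lane_ids = list(range(WAVE_SIZE))  # stand-in for the v_mbcnt result
    print(original_sequence(lane_ids))
    print(cmpx_at_v_cmp_position(lane_ids))
```

In this model both variants end with the same exec mask, but with the early v_cmpx the lanes that fail the comparison keep their old v0 instead of receiving 0, which is the difference the v_mov makes visible.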
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D119696/new/
https://reviews.llvm.org/D119696