[PATCH] D80754: AMDGPU/GlobalISel: cmp/select method for insert element
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu May 28 13:45:53 PDT 2020
arsenm added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll:331
+; GPRIDX-NEXT: v_mov_b32_e32 v1, s2
+; GPRIDX-NEXT: s_cmp_eq_u32 s10, 2
+; GPRIDX-NEXT: v_cndmask_b32_e32 v8, v1, v0, vcc
----------------
rampitec wrote:
> I assume this will get better when it is moved after RegBankSelect. The issue is the copies to VCC inserted by RegBankSelect. The code does not need to be like this; SelectionDAG produces the following for the same input:
>
> ```
> ; %bb.0: ; %entry
> v_mov_b32_e32 v1, s0
> v_cmp_eq_u32_e64 vcc, s8, 0
> v_cndmask_b32_e32 v8, v1, v0, vcc
> v_mov_b32_e32 v1, s1
> v_cmp_eq_u32_e64 vcc, s8, 1
> v_cndmask_b32_e32 v1, v1, v0, vcc
> v_mov_b32_e32 v2, s2
> v_cmp_eq_u32_e64 vcc, s8, 2
> v_cndmask_b32_e32 v2, v2, v0, vcc
> v_mov_b32_e32 v3, s3
> v_cmp_eq_u32_e64 vcc, s8, 3
> v_cndmask_b32_e32 v3, v3, v0, vcc
> v_mov_b32_e32 v4, s4
> v_cmp_eq_u32_e64 vcc, s8, 4
> v_cndmask_b32_e32 v4, v4, v0, vcc
> v_mov_b32_e32 v5, s5
> v_cmp_eq_u32_e64 vcc, s8, 5
> v_cndmask_b32_e32 v5, v5, v0, vcc
> v_mov_b32_e32 v6, s6
> v_cmp_eq_u32_e64 vcc, s8, 6
> v_cndmask_b32_e32 v6, v6, v0, vcc
> v_mov_b32_e32 v7, s7
> v_cmp_eq_u32_e64 vcc, s8, 7
> v_cndmask_b32_e32 v7, v7, v0, vcc
> v_mov_b32_e32 v0, v8
> ; return to shader part epilog
> ```
Boolean handling is a mess that needs cleanups, and right now we get none. I recently saw a case that used 5 instructions to materialize a constant 0.
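
For reference, the cmp/select expansion discussed here amounts to one compare and one select per lane, which is the v_cmp_eq_u32 / v_cndmask_b32 pairing visible in the SelectionDAG output quoted above. A minimal C++ sketch of those semantics follows. The helper name `insertElementCmpSelect` is hypothetical and is not an LLVM API, and 32-bit elements are assumed:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not LLVM code: models a dynamically indexed
// insertelement as one compare plus one select per lane.
template <std::size_t N>
std::array<std::uint32_t, N> insertElementCmpSelect(
    std::array<std::uint32_t, N> vec, std::uint32_t val, std::uint32_t idx) {
  for (std::size_t lane = 0; lane < N; ++lane) {
    // Compare the dynamic index against this lane (the v_cmp_eq_u32 step).
    const bool hit = (idx == lane);
    // Keep the old element unless the lane matches (the v_cndmask_b32 step).
    vec[lane] = hit ? val : vec[lane];
  }
  return vec;
}
```

In this sketch an out-of-range index simply leaves every lane unchanged, since no compare matches. That is a property of the sketch, not a statement about what the patch guarantees.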
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D80754/new/
https://reviews.llvm.org/D80754