[PATCH] D80754: AMDGPU/GlobalISel: cmp/select method for insert element

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 3 13:11:28 PDT 2020


rampitec requested review of this revision.
rampitec marked an inline comment as done.
rampitec added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll:32
 ; MOVREL:       ; %bb.0: ; %entry
-; MOVREL-NEXT:    s_mov_b32 s0, s2
-; MOVREL-NEXT:    s_mov_b32 m0, s11
-; MOVREL-NEXT:    s_mov_b32 s1, s3
-; MOVREL-NEXT:    s_mov_b32 s2, s4
-; MOVREL-NEXT:    s_mov_b32 s3, s5
-; MOVREL-NEXT:    s_mov_b32 s4, s6
-; MOVREL-NEXT:    s_mov_b32 s5, s7
-; MOVREL-NEXT:    s_mov_b32 s6, s8
-; MOVREL-NEXT:    s_mov_b32 s7, s9
-; MOVREL-NEXT:    s_movreld_b32 s0, s10
+; MOVREL-NEXT:    v_cmp_eq_u32_e64 s0, s11, 0
+; MOVREL-NEXT:    v_cmp_eq_u32_e64 s1, s11, 1
----------------
That is an incorrect selection of G_ICMP with wave32. AMDGPUInstructionSelector::isVCC() does this:

```
  if (RC) {
    const LLT Ty = MRI.getType(Reg);
    return RC->hasSuperClassEq(TRI.getBoolRC()) &&
           Ty.isValid() && Ty.getSizeInBits() == 1;
  }
```

Since SGPR_32 is used for the condition, hasSuperClassEq() returns true, and even though we have Ty == s1, the register is still considered a VCC.
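
To make the failure mode concrete, here is a minimal standalone sketch of that predicate. It does not use LLVM: `ToyRegClass`, `toyIsVCC`, and the single-parent hierarchy are hypothetical stand-ins. It also assumes that the wave32 boolean class is SReg_32 (SReg_64 for wave64) and that SGPR_32 is a subclass of SReg_32.

```cpp
// Toy stand-in for a register class with one super-class link.
// This is NOT llvm::TargetRegisterClass, only an illustration.
struct ToyRegClass {
  const char *Name;
  const ToyRegClass *Super; // nullptr when there is no super class

  // True if Other is this class or one of its ancestors.
  bool hasSuperClassEq(const ToyRegClass *Other) const {
    for (const ToyRegClass *RC = this; RC; RC = RC->Super)
      if (RC == Other)
        return true;
    return false;
  }
};

// Assumed hierarchy: SGPR_32 is a subclass of SReg_32.
static const ToyRegClass SReg_32{"SReg_32", nullptr};
static const ToyRegClass SGPR_32{"SGPR_32", &SReg_32};
static const ToyRegClass SReg_64{"SReg_64", nullptr};

// Mirrors the shape of the quoted check: the register class must be
// covered by the boolean class and the type must be exactly 1 bit wide.
// A size of 0 stands in for an invalid type.
bool toyIsVCC(const ToyRegClass &RC, const ToyRegClass &BoolRC,
              unsigned TypeSizeInBits) {
  return RC.hasSuperClassEq(&BoolRC) && TypeSizeInBits != 0 &&
         TypeSizeInBits == 1;
}
```

Under these assumptions, `toyIsVCC(SGPR_32, SReg_32, 1)` is true in the wave32 setup, while the same register checked against SReg_64 is not. That matches the observation that the problem only shows up with wave32.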


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80754/new/

https://reviews.llvm.org/D80754




