[PATCH] D80754: AMDGPU/GlobalISel: cmp/select method for insert element

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 3 14:20:47 PDT 2020


rampitec marked an inline comment as done.
rampitec added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll:32
 ; MOVREL:       ; %bb.0: ; %entry
-; MOVREL-NEXT:    s_mov_b32 s0, s2
-; MOVREL-NEXT:    s_mov_b32 m0, s11
-; MOVREL-NEXT:    s_mov_b32 s1, s3
-; MOVREL-NEXT:    s_mov_b32 s2, s4
-; MOVREL-NEXT:    s_mov_b32 s3, s5
-; MOVREL-NEXT:    s_mov_b32 s4, s6
-; MOVREL-NEXT:    s_mov_b32 s5, s7
-; MOVREL-NEXT:    s_mov_b32 s6, s8
-; MOVREL-NEXT:    s_mov_b32 s7, s9
-; MOVREL-NEXT:    s_movreld_b32 s0, s10
+; MOVREL-NEXT:    v_cmp_eq_u32_e64 s0, s11, 0
+; MOVREL-NEXT:    v_cmp_eq_u32_e64 s1, s11, 1
----------------
rampitec wrote:
> That is an incorrect selection of G_ICMP with wave32. AMDGPUInstructionSelector::isVCC() does this:
> 
> ```
>   if (RC) {
>     const LLT Ty = MRI.getType(Reg);
>     return RC->hasSuperClassEq(TRI.getBoolRC()) &&
>            Ty.isValid() && Ty.getSizeInBits() == 1;
>   }
> ```
> 
> Since SGPR_32 is used for the condition, hasSuperClassEq() returns true, and even though we have Ty == s1 it is still considered a VCC.
This can be fixed by using LLT::scalar(32) instead of LLT::scalar(1).
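
To make the failure mode concrete, here is a minimal standalone sketch of the predicate quoted above. MockRegClass, MockLLT, isVCCSketch and the tiny class hierarchy are hypothetical stand-ins, not the real LLVM types. It assumes that in wave32 TRI.getBoolRC() is SReg_32 and that SGPR_32 is a sub-class of it.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for TargetRegisterClass; not the LLVM API.
struct MockRegClass {
  std::string Name;
  std::vector<std::string> SuperClasses; // names of strict super-classes

  // Models the meaning of hasSuperClassEq(): true for the class itself
  // or for any of its super-classes.
  bool hasSuperClassEq(const MockRegClass &Other) const {
    return Name == Other.Name ||
           std::find(SuperClasses.begin(), SuperClasses.end(), Other.Name) !=
               SuperClasses.end();
  }
};

// Hypothetical stand-in for LLT; a size of 0 models an invalid type.
struct MockLLT {
  unsigned SizeInBits;
  bool isValid() const { return SizeInBits != 0; }
  unsigned getSizeInBits() const { return SizeInBits; }
};

// Assumed hierarchy for illustration only.
const MockRegClass SReg_32{"SReg_32", {}};
const MockRegClass SReg_64{"SReg_64", {}};
const MockRegClass SGPR_32{"SGPR_32", {"SReg_32"}};

// Same shape as the check quoted from isVCC(): the register class must be
// compatible with the bool class and the type must be a valid 1-bit type.
bool isVCCSketch(const MockRegClass &RC, MockLLT Ty,
                 const MockRegClass &BoolRC) {
  return RC.hasSuperClassEq(BoolRC) && Ty.isValid() &&
         Ty.getSizeInBits() == 1;
}
```

Under these assumptions, isVCCSketch(SGPR_32, MockLLT{1}, SReg_32) is true, so a scalar s1 condition is classified as VCC in wave32, while MockLLT{32} makes the same call false. With SReg_64 as the bool class (the wave64 case in this model) the s1 query is also false.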


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80754/new/

https://reviews.llvm.org/D80754




