[PATCH] D74872: AMDGPU/GlobalISel: Avoid illegal vector exts for add/sub/mul

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 20 09:43:18 PST 2020


arsenm marked 3 inline comments as done.
arsenm added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:95
+; GFX9-NEXT:    s_mov_b32 s4, 0xffffffc0
+; GFX9-NEXT:    s_pack_ll_b32_b16 s4, s4, s4
+; GFX9-NEXT:    v_pk_add_u16 v0, v0, s4
----------------
foad wrote:
> Shouldn't this get constant folded to s_mov_b32 s4, 0xffc0ffc0 ? Or even folded into the v_pk_add_u16 if you can do that?
We don't have any constant folding yet, or really anything else needed to get good vector code.
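
For reference, the fold foad is asking about can be checked with a small model of the pack instruction. This is only a sketch: the helper below is a hypothetical Python stand-in for s_pack_ll_b32_b16 (low 16 bits of the first source go to the low half of the result, low 16 bits of the second source go to the high half), not anything that exists in LLVM.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def s_pack_ll_b32_b16(s0, s1):
    """Hypothetical model: result[15:0] = s0[15:0], result[31:16] = s1[15:0]."""
    return ((s0 & MASK16) | ((s1 & MASK16) << 16)) & MASK32


# s_mov_b32 s4, 0xffffffc0
s4 = 0xFFFFFFC0
# s_pack_ll_b32_b16 s4, s4, s4
packed = s_pack_ll_b32_b16(s4, s4)
print(hex(packed))  # 0xffc0ffc0
```

So the s_mov_b32 / s_pack_ll_b32_b16 pair computes the same value as a single s_mov_b32 s4, 0xffc0ffc0, which is the constant a folder would be expected to produce.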


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:170
+; GFX8-NEXT:    s_and_b32 s0, s0, s3
+; GFX8-NEXT:    s_and_b32 s2, s2, s3
+; GFX8-NEXT:    s_add_i32 s0, s0, s1
----------------
foad wrote:
> This s_and is redundant. Since we're going to add something to it and then shift it left by 16 the high order bits will be lost anyway.
We don't have any optimizations, and don't run anything after RegBankSelect yet. Eventually a combiner pass will be needed to clean up here.
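
The redundancy foad describes can be sanity-checked numerically. The sketch below assumes the pattern is "mask both operands to 16 bits, add, then shift left by 16 in a 32-bit register", which is taken from the comment rather than from the full test output; the function names are made up for illustration.

```python
import random

MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def add_then_shift_masked(a, b):
    """Mask both inputs to 16 bits (the s_and_b32 steps), add, shift left by 16."""
    a &= MASK16
    b &= MASK16
    return ((a + b) << 16) & MASK32


def add_then_shift_unmasked(a, b):
    """Same sequence without the masks, using wrapping 32-bit arithmetic."""
    total = (a + b) & MASK32
    return (total << 16) & MASK32


def masks_are_redundant(trials=10000, seed=0):
    """Compare both sequences on random 32-bit inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.getrandbits(32)
        b = rng.getrandbits(32)
        if add_then_shift_masked(a, b) != add_then_shift_unmasked(a, b):
            return False
    return True


print(masks_are_redundant())
```

The low 16 bits of a sum depend only on the low 16 bits of its operands, and the shift discards everything above them, so the two sequences agree. A combine that drops the masks would presumably rest on this kind of demanded-bits reasoning.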


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:253-256
+; GFX8-NEXT:    s_and_b32 s0, s0, s3
+; GFX8-NEXT:    s_and_b32 s1, s1, s3
+; GFX8-NEXT:    s_and_b32 s2, s2, s3
+; GFX8-NEXT:    s_and_b32 s4, s4, s3
----------------
foad wrote:
> All four of these are redundant.
Same here; we don't try to clean anything up yet.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74872/new/

https://reviews.llvm.org/D74872




