[PATCH] D74872: AMDGPU/GlobalISel: Avoid illegal vector exts for add/sub/mul

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 20 09:34:28 PST 2020


foad accepted this revision.
foad added a comment.
This revision is now accepted and ready to land.

LGTM. Some possible improvements noted inline.



================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:95
+; GFX9-NEXT:    s_mov_b32 s4, 0xffffffc0
+; GFX9-NEXT:    s_pack_ll_b32_b16 s4, s4, s4
+; GFX9-NEXT:    v_pk_add_u16 v0, v0, s4
----------------
Shouldn't this get constant folded to s_mov_b32 s4, 0xffc0ffc0? Or even folded into the v_pk_add_u16, if you can do that?
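
As a sanity check on the folded constant, here is a minimal Python sketch of the pack step. It assumes s_pack_ll_b32_b16 places the low 16 bits of its first source in the low half of the result and the low 16 bits of its second source in the high half. The helper name pack_ll_b32_b16 is made up for illustration and is not an LLVM or ISA API.

```python
MASK16 = 0xFFFF

def pack_ll_b32_b16(lo_src: int, hi_src: int) -> int:
    """Hypothetical model: pack the low 16 bits of two 32-bit values.

    lo_src supplies bits [15:0] of the result, hi_src supplies bits [31:16].
    """
    return (lo_src & MASK16) | ((hi_src & MASK16) << 16)

# s4 holds 0xffffffc0 and is packed with itself in the quoted test output.
splat = pack_ll_b32_b16(0xFFFFFFC0, 0xFFFFFFC0)
print(hex(splat))  # 0xffc0ffc0
```

Under that assumption, the s_mov_b32/s_pack_ll_b32_b16 pair computes a compile-time constant, which is why a single s_mov_b32 of 0xffc0ffc0 would do.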


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:170
+; GFX8-NEXT:    s_and_b32 s0, s0, s3
+; GFX8-NEXT:    s_and_b32 s2, s2, s3
+; GFX8-NEXT:    s_add_i32 s0, s0, s1
----------------
This s_and is redundant. Since we're going to add something to it and then shift it left by 16, the high-order bits will be lost anyway.
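
The redundancy argument can be checked with a small Python sketch. It assumes 32-bit wrapping arithmetic and that the sum is only ever consumed after a left shift by 16, as described above. The function names are hypothetical and only model the arithmetic, not the actual instruction sequence in the test.

```python
import random

MASK32 = 0xFFFFFFFF
MASK16 = 0xFFFF

def add_then_shl16(a: int, b: int) -> int:
    """32-bit add followed by a 32-bit shift left by 16, with no masking."""
    return ((a + b) << 16) & MASK32

def masked_add_then_shl16(a: int, b: int) -> int:
    """Same, but with the first operand masked to 16 bits before the add."""
    return (((a & MASK16) + b) << 16) & MASK32

# Fixed seed so the check is reproducible.
rng = random.Random(0)
samples = [(rng.getrandbits(32), rng.getrandbits(32)) for _ in range(10000)]
all_equal = all(
    add_then_shl16(a, b) == masked_add_then_shl16(a, b) for a, b in samples
)
print(all_equal)  # True
```

The two agree because masking changes only bits 16 and above of the operand, and after the shift only bits [15:0] of the sum survive. Carries in an add propagate upward only, so the high bits of either operand cannot affect those low bits.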


================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:253-256
+; GFX8-NEXT:    s_and_b32 s0, s0, s3
+; GFX8-NEXT:    s_and_b32 s1, s1, s3
+; GFX8-NEXT:    s_and_b32 s2, s2, s3
+; GFX8-NEXT:    s_and_b32 s4, s4, s3
----------------
All four of these are redundant.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74872/new/

https://reviews.llvm.org/D74872




