[PATCH] D74872: AMDGPU/GlobalISel: Avoid illegal vector exts for add/sub/mul
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 20 09:34:28 PST 2020
foad accepted this revision.
foad added a comment.
This revision is now accepted and ready to land.
LGTM. Some possible improvements noted inline.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:95
+; GFX9-NEXT: s_mov_b32 s4, 0xffffffc0
+; GFX9-NEXT: s_pack_ll_b32_b16 s4, s4, s4
+; GFX9-NEXT: v_pk_add_u16 v0, v0, s4
----------------
Shouldn't this get constant folded to s_mov_b32 s4, 0xffc0ffc0? Or even folded into the v_pk_add_u16, if you can do that?
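For reference, here is a rough sketch of why 0xffc0ffc0 is the expected folded value. It is a small Python model, not code from the patch, and it assumes s_pack_ll_b32_b16 puts the low 16 bits of its first source in the low half of the result and the low 16 bits of its second source in the high half:

```python
def s_pack_ll_b32_b16(s0: int, s1: int) -> int:
    """Illustrative model of s_pack_ll_b32_b16 (assumed semantics).

    Low 16 bits of s0 become bits [15:0] of the result,
    low 16 bits of s1 become bits [31:16].
    """
    return ((s1 & 0xFFFF) << 16) | (s0 & 0xFFFF)


# s_mov_b32 s4, 0xffffffc0
s4 = 0xFFFFFFC0
# s_pack_ll_b32_b16 s4, s4, s4
packed = s_pack_ll_b32_b16(s4, s4)

# Both inputs are compile-time constants, so the pack could be
# evaluated ahead of time and emitted as a single s_mov_b32.
assert packed == 0xFFC0FFC0
```

Under that assumption the mov plus pack pair always produces the same 32-bit constant, so a single s_mov_b32 would be enough.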
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:170
+; GFX8-NEXT: s_and_b32 s0, s0, s3
+; GFX8-NEXT: s_and_b32 s2, s2, s3
+; GFX8-NEXT: s_add_i32 s0, s0, s1
----------------
This s_and is redundant. Since we're going to add something to it and then shift it left by 16, the high-order bits will be lost anyway.
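A quick way to convince yourself of this: a minimal Python sketch, assuming 32-bit wraparound arithmetic and a mask of 0xffff (the helper names are made up for illustration and do not come from the patch):

```python
import random

MASK32 = 0xFFFFFFFF  # model 32-bit scalar registers
MASK16 = 0xFFFF      # assumed value of the s_and mask


def add_shl16_with_mask(a: int, b: int) -> int:
    """(a & 0xffff) + b, then shift left by 16, all in 32 bits."""
    masked = a & MASK16
    total = (masked + b) & MASK32
    return (total << 16) & MASK32


def add_shl16_without_mask(a: int, b: int) -> int:
    """Same sequence with the s_and dropped."""
    total = (a + b) & MASK32
    return (total << 16) & MASK32


# Only bits [15:0] of the sum survive the shift, and those depend only on
# bits [15:0] of the inputs, so the two versions always agree.
rng = random.Random(0)
for _ in range(10_000):
    a = rng.getrandbits(32)
    b = rng.getrandbits(32)
    assert add_shl16_with_mask(a, b) == add_shl16_without_mask(a, b)
```

The same argument applies to any operation whose low 16 result bits depend only on the low 16 input bits, such as sub or mul.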
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:253-256
+; GFX8-NEXT: s_and_b32 s0, s0, s3
+; GFX8-NEXT: s_and_b32 s1, s1, s3
+; GFX8-NEXT: s_and_b32 s2, s2, s3
+; GFX8-NEXT: s_and_b32 s4, s4, s3
----------------
All four of these are redundant.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D74872/new/
https://reviews.llvm.org/D74872