[PATCH] D74872: AMDGPU/GlobalISel: Avoid illegal vector exts for add/sub/mul
    Jay Foad via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Thu Feb 20 09:34:28 PST 2020
    
    
  
foad accepted this revision.
foad added a comment.
This revision is now accepted and ready to land.
LGTM. Some possible improvements noted inline.
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:95
+; GFX9-NEXT:    s_mov_b32 s4, 0xffffffc0
+; GFX9-NEXT:    s_pack_ll_b32_b16 s4, s4, s4
+; GFX9-NEXT:    v_pk_add_u16 v0, v0, s4
----------------
Shouldn't this get constant folded to s_mov_b32 s4, 0xffc0ffc0? Or even folded into the v_pk_add_u16, if you can do that?
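
For reference, a small sketch of the arithmetic behind the suggested constant. It assumes s_pack_ll_b32_b16 places the low 16 bits of its first source in the low half of the result and the low 16 bits of its second source in the high half. The Python helper name is illustrative only and is not part of the patch.

```python
def s_pack_ll_b32_b16(s0, s1):
    # Assumed semantics: result = { s1[15:0], s0[15:0] }
    return ((s1 & 0xFFFF) << 16) | (s0 & 0xFFFF)

s4 = 0xFFFFFFC0                      # value materialized by the s_mov_b32
packed = s_pack_ll_b32_b16(s4, s4)   # both sources are s4 in the test output
print(hex(packed))
```

Under that assumption the packed value is a compile-time constant, so a single s_mov_b32 of it would be enough.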
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:170
+; GFX8-NEXT:    s_and_b32 s0, s0, s3
+; GFX8-NEXT:    s_and_b32 s2, s2, s3
+; GFX8-NEXT:    s_add_i32 s0, s0, s1
----------------
This s_and is redundant. Since we're going to add something to it and then shift it left by 16, the high-order bits will be lost anyway.
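
A quick sanity check of that claim, as a sketch only. It assumes s3 holds the mask 0xffff (not visible in the quoted lines) and models s_add_i32 and the shift as wrapping 32-bit operations. The function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def add_then_shift(a, b, mask_first):
    if mask_first:
        a &= 0xFFFF                   # models s_and_b32 with an assumed 0xffff mask
    total = (a + b) & MASK32          # models s_add_i32 wrapping at 32 bits
    return (total << 16) & MASK32     # models a 32-bit left shift by 16

samples = [0, 1, 0xFFFF, 0x10000, 0x12345678, 0xFFFFFFC0, 0xFFFFFFFF]
mismatches = [(a, b) for a in samples for b in samples
              if add_then_shift(a, b, True) != add_then_shift(a, b, False)]
print(mismatches)
```

After the shift only the low 16 bits of the sum survive, and those depend only on the low 16 bits of each addend, so masking an addend first cannot change the result.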
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/add.v2i16.ll:253-256
+; GFX8-NEXT:    s_and_b32 s0, s0, s3
+; GFX8-NEXT:    s_and_b32 s1, s1, s3
+; GFX8-NEXT:    s_and_b32 s2, s2, s3
+; GFX8-NEXT:    s_and_b32 s4, s4, s3
----------------
All four of these are redundant.
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74872/new/
https://reviews.llvm.org/D74872
    
    