[PATCH] D98587: [X86] Optimize vXi8 MULHS on targets where we can't sign_extend to the next register size.

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 18 05:27:55 PDT 2021


RKSimon added inline comments.


================
Comment at: llvm/test/CodeGen/X86/vec_smulo.ll:1803
 ; SSE41-NEXT:    pslld $31, %xmm8
 ; SSE41-NEXT:    psrad $31, %xmm8
+; SSE41-NEXT:    pshufd {{.*#+}} xmm2 = xmm4[2,3,2,3]
----------------
craig.topper wrote:
> RKSimon wrote:
> > Not related - but we should be able to remove this sext-in-reg code by replacing the pmovzxbd with pmovzxsd (is that a hidden any_extend?)
> I'll check. Any idea why these tests use large result types like 16xi32 for a 16xi8 multiply?
Not sure - either we copied+pasted from somewhere else, or maybe it's a legacy from before we cleaned up the type promotion/widening. I'll see if the fold would be used anywhere else.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98587/new/

https://reviews.llvm.org/D98587


