[PATCH] D98587: [X86] Optimize vXi8 MULHS on targets where we can't sign_extend to the next register size.

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 17 09:28:40 PDT 2021


RKSimon added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:27494
+  // and use pmullw to calculate the full 16-bit product.
+  // For signed we'll use punpcklbw/punpckbw to extend the bytes to words and
+  // shift them left into the upper byte of each word. This allows us to use
----------------
punpcklbw/punpckhbw


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:27495
+  // For signed we'll use punpcklbw/punpckbw to extend the bytes to words and
+  // shift them left into the upper byte of each word. This allows us to use
+  // pmulhw to calculate the full 16-bit product. This trick means we don't
----------------
Isn't the shift just for RHS constants?


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:27515
   SDValue BLo, BHi;
   if (ISD::isBuildVectorOfConstantSDNodes(B.getNode())) {
     // If the RHS is a constant, manually unpackl/unpackh and extend.
----------------
Do we need this constant path any more? It seemed to be to avoid problems with the SSE41 path that's been removed.


================
Comment at: llvm/test/CodeGen/X86/vec_smulo.ll:1803
 ; SSE41-NEXT:    pslld $31, %xmm8
 ; SSE41-NEXT:    psrad $31, %xmm8
+; SSE41-NEXT:    pshufd {{.*#+}} xmm2 = xmm4[2,3,2,3]
----------------
Not related - but we should be able to remove this sext-in-reg code by replace the pmovzxbd with pmovzxsd (is that a hidden any_extend?)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98587/new/

https://reviews.llvm.org/D98587



More information about the llvm-commits mailing list