[PATCH] D20931: [X86] Reduce the width of multiplication when its operands are extended from i8 or i16

Tue Jun 7 23:26:50 PDT 2016

wmi added inline comments.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26506
@@ +26505,3 @@
+        bool IsNegative = CN->getAPIntValue().isNegative();
+        SbitsNum = DAG.ComputeNumSignBits(SubOp);
+        if (SbitsNum < 25)
----------------
eli.friedman wrote:
> It's probably more clear to express this in terms of the number of sign bits and the number of leading zeros (APInt::countLeadingZeros).  Actually, you could probably get rid of the ValRange enumeration altogether in favor of those two numbers.  For example, return `MULS8` if `std::min(signbits1, signbits2) > 24`.
I rewrote it in terms of those two numbers, and the code is now much shorter. Thanks for the suggestion.
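
For readers following the thread, here is a rough standalone sketch of the kind of classification being discussed. It is not the code in the patch: it works on plain 32-bit scalars instead of DAG nodes, and `MulKind`, `numSignBits`, `numLeadingZeros`, and `classify` are hypothetical names made up for illustration. The thresholds follow from the bit widths: a value sign-extended from i8 to i32 has at least 25 sign bits, and one zero-extended from i8 has at least 24 leading zeros.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical result kinds, loosely mirroring the names used in the review.
enum class MulKind { MULS8, MULU8, MULS16, MULU16, Wide };

// Number of leading zero bits of a 32-bit value (32 for zero).
unsigned numLeadingZeros(uint32_t U) {
  unsigned N = 0;
  for (uint32_t Mask = 0x80000000u; Mask != 0 && (U & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Number of high bits equal to the sign bit, counting the sign bit itself.
unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  return numLeadingZeros(V < 0 ? ~U : U);
}

// Pick the narrowest multiply that still holds both operands exactly.
MulKind classify(int32_t A, int32_t B) {
  unsigned SignBits = std::min(numSignBits(A), numSignBits(B));
  unsigned Zeros = std::min(numLeadingZeros(static_cast<uint32_t>(A)),
                            numLeadingZeros(static_cast<uint32_t>(B)));
  if (SignBits > 24)
    return MulKind::MULS8;  // both fit in a signed i8
  if (Zeros >= 24)
    return MulKind::MULU8;  // both fit in an unsigned i8
  if (SignBits > 16)
    return MulKind::MULS16; // both fit in a signed i16
  if (Zeros >= 16)
    return MulKind::MULU16; // both fit in an unsigned i16
  return MulKind::Wide;     // no narrowing possible
}
```

With this sketch, `classify(100, -100)` gives `MULS8` and `classify(200, 3)` gives `MULU8`, with no separate value-range enumeration needed.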

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26675
@@ +26674,3 @@
+      return DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Res,
+                         DAG.getIntPtrConstant(0, DL));
+    } else {
----------------
eli.friedman wrote:
> I'm not following... how do you get to <4 x i64>?  Legalization of a `<4 x i16>` multiply will widen it to an `<8 x i16>` multiply; this codepath already gets used for IR like `mul <4 x i16> %a, %b`.
Sorry, I meant to say <4 x i32>, not <4 x i64>. When <4 x i16> is legalized by promoting it to <4 x i32>, a punpcklwd instruction is used if the input is a load of <4 x i16>. That is different from widening <4 x i16> to <8 x i16>, which just fills the upper four elements with undef.
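
To make the difference in layout concrete, here is a small scalar model, only a sketch and not what the backend emits. It assumes little-endian lanes and interleaves with a zero vector, which gives a zero-extended result; the actual legalization may leave the high halves undefined. `V8i16`, `V4i32`, `punpcklwd`, and `asV4i32` are hypothetical names for illustration.

```cpp
#include <array>
#include <cstdint>

using V8i16 = std::array<uint16_t, 8>;
using V4i32 = std::array<uint32_t, 4>;

// Scalar model of PUNPCKLWD: interleave the low four 16-bit words of A and B.
V8i16 punpcklwd(const V8i16 &A, const V8i16 &B) {
  V8i16 R{};
  for (int I = 0; I < 4; ++I) {
    R[2 * I] = A[I];
    R[2 * I + 1] = B[I];
  }
  return R;
}

// Reinterpret eight 16-bit lanes as four 32-bit lanes (little-endian).
V4i32 asV4i32(const V8i16 &V) {
  V4i32 R{};
  for (int I = 0; I < 4; ++I)
    R[I] = static_cast<uint32_t>(V[2 * I]) |
           (static_cast<uint32_t>(V[2 * I + 1]) << 16);
  return R;
}
```

In the widened <8 x i16> form the four source elements sit in lanes 0 to 3 and lanes 4 to 7 carry no meaning. After the interleave, each source element occupies the low half of its own 32-bit lane, so the multiply sees <4 x i32> operands.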

Repository:
  rL LLVM

http://reviews.llvm.org/D20931