[PATCH] D20931: [X86] Reduce the width of multiplication when its operands are extended from i8 or i16

Mon Jun 6 11:09:06 PDT 2016

eli.friedman added a comment.

This is looking a lot better overall.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26537
@@ +26536,3 @@
+    }
+  }
+
----------------
It feels like you should be able to use ComputeNumSignBits/computeKnownBits here; I'm not sure how much shorter that actually ends up, though.
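
A minimal sketch of the reasoning behind this suggestion, written as standalone C++ rather than SelectionDAG code. `numSignBits` is a hypothetical stand-in that computes, for one concrete 32-bit value, the kind of count `ComputeNumSignBits` reports for a DAG node; `fitsInSignedI16` and `canShrinkMulToI16` are illustrative names and are not part of the patch.

```cpp
#include <cstdint>

// Hypothetical model of a sign-bit count on a concrete value: the number of
// leading bits equal to the sign bit, counting the sign bit itself (1..32).
inline unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Sign = U >> 31;
  unsigned N = 1; // the sign bit always matches itself
  for (int Bit = 30; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++N;
  return N;
}

// A 32-bit value is a sign-extended i16 exactly when bits 31..15 all agree,
// that is, when it has at least 17 sign bits.
inline bool fitsInSignedI16(int32_t V) { return numSignBits(V) >= 17; }

// Illustrative check (an assumption, not the patch's logic): both multiply
// operands carry no information above 16 bits.
inline bool canShrinkMulToI16(int32_t A, int32_t B) {
  return fitsInSignedI16(A) && fitsInSignedI16(B);
}
```

With a check of this shape, operands that are sign extensions and operands that are small constants go through one query instead of separate pattern matches. Whether that ends up shorter in the real combine is the open question raised above.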

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26675
@@ +26674,3 @@
+      return DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, Res,
+                         DAG.getIntPtrConstant(0, DL));
+    } else {
----------------
It's not obvious to me why you're explicitly legalizing this here; you could just generate a MUL on, for example, <4 x i16> and legalization should do the right thing from there.

================
Comment at: test/CodeGen/X86/shrink_vmul.ll:751
@@ +750,3 @@
+  %ins1 = insertelement <2 x i32> %ins0, i32 32767, i32 1
+  %tmp13 = mul nuw nsw <2 x i32> %tmp8, %ins1
+  %tmp14 = getelementptr inbounds i32, i32* %pre, i64 %index
----------------
It would probably be clearer to write this as `mul nuw nsw <2 x i32> %tmp8, <i32 -32768, i32 32767>`.

Repository:
  rL LLVM

http://reviews.llvm.org/D20931