[PATCH] D20931: [X86] Reduce the width of multiplification when its operands are extended from i8 or i16
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 2 15:13:17 PDT 2016
mkuper added inline comments.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26452
@@ +26451,3 @@
+/// plus mulhs/mulhw <N x i16>.
+static SDValue reduceVMULWidth(SDNode *N, SelectionDAG &DAG,
+ const X86Subtarget &Subtarget) {
----------------
Any chance to split this function up? Or is there no logical way to do that?
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26468
@@ +26467,3 @@
+ enum {
+ DONTCARE,
+ SIZEI8,
----------------
I'm not sure DONTCARE is a good name for this - it's not a "top" value. Maybe UNKNOWN?
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26475
@@ +26474,3 @@
+ // Find out the smaller size of N0 before it is sign/zero-extended.
+ if (N0.getOpcode() == ISD::ZERO_EXTEND ||
+ N0.getOpcode() == ISD::SIGN_EXTEND) {
----------------
ISD::ANY_EXTEND would work too, right?
(Not sure how to test this, but if both ZERO and SIGN work, ANY should definitely work.)
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26481
@@ +26480,3 @@
+ else if (N0.getOperand(0).getValueType().getVectorElementType() == MVT::i16)
+ N0Size = SIZEI16;
+ } else {
----------------
What if it's MVT::i1?
You may want an "else return SDValue()" here as well.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26486
@@ +26485,3 @@
+
+ // Find out the smaller size of N1 before it is sign/zero-extended.
+ if (N1.getOpcode() == ISD::ZERO_EXTEND ||
----------------
Same two comments as above.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26494
@@ +26493,3 @@
+ }
+
+ // Find out the size of N1 if it is a constant splat.
----------------
What happens if one of the extends is a sext and the other is a zext? It seems like if N0 is a sext and N1 is a zext we'll get IsSigned == true, and if it's the other way around, we'll get IsSigned == false, which seems oddly asymmetrical. Or is there a canonicalization earlier that guarantees the order of a sext and zext?
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26499
@@ +26498,3 @@
+ bool HasAnyUndefs;
+ auto *BV = dyn_cast<BuildVectorSDNode>(N1);
+ if (BV &&
----------------
Are you guaranteed that if one of {N0, N1} is a extend, and the other is a BuildVector, N0 is the extend? I can imagine something canonizes it that way - and if that's the case, documenting this here would probably be a good idea.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26519
@@ +26518,3 @@
+ // Shrink the operands of mul.
+ SDValue NewN0 = DAG.getNode(ISD::TRUNCATE, DL, ReducedVT, N0);
+ SDValue NewN1 = DAG.getNode(ISD::TRUNCATE, DL, ReducedVT, N1);
----------------
Why not use N0->getOperand(0), at least when the type is i16?
Although I guess we're pretty much guaranteed DAGCombine will clean this up, and this may be a bit cleaner as is. So, I'm not really sure which is better.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26523
@@ +26522,3 @@
+ // Handle the case when split is needed.
+ if (VT.getVectorNumElements() > OpsVT.getVectorNumElements()) {
+ // Split the nodes.
----------------
This is necessary because this runs before legalization, right?
What if it ran after legalization? Does that help? Or is that too late?
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26544
@@ +26543,3 @@
+ OpsVT, SubN0, SubN1);
+ SDValue ResLo = DAG.getNode(X86ISD::UNPCKL, DL, OpsVT, MulLo, MulHi);
+ ResLo = DAG.getNode(ISD::BITCAST, DL, ResVT, ResLo);
----------------
Why not use a generic shuffle here? Is the lowering further down not good enough to get the right unpacks?
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:29777
@@ -29594,1 +29776,3 @@
+ case ISD::MUL:
+ return combineMul(N, DAG, DCI, Subtarget);
case ISD::SHL:
----------------
Why the linebreak?
================
Comment at: test/CodeGen/X86/shrink_vmul.ll:1
@@ +1,2 @@
+; NOTE: Assertions have been autogenerated by update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2 | FileCheck %s
----------------
Could you add sext tests as well? I'm not sure you need the whole type x {zext, sext} matrix, but one sext test would be good.
Repository:
rL LLVM
http://reviews.llvm.org/D20931
More information about the llvm-commits
mailing list