[PATCH] [X86][SSE] Improve matching of SSE blend instructions with splatted vector inputs

Mon Dec 15 14:41:22 PST 2014

No problem - I'll look into moving this into the target-independent dag combiner.

REPOSITORY
  rL LLVM

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7366-7369
@@ +7365,6 @@
+    if (auto *BVOp = dyn_cast<BuildVectorSDNode>(V.getNode())) {
+      if (BVOp->getConstantSplatNode(&UndefElements) && UndefElements.none())
+        return true;
+      if (BVOp->getConstantFPSplatNode(&UndefElements) && UndefElements.none())
+        return true;
+    }
----------------
chandlerc wrote:
> Why are only constants safe here? Shouldn't it be any buildvector of a single scalar SDValue without undefs?
> 
> I also could have sworn there was already a predicate for this...
I only added the most likely contenders to the safe-splat detector. I can find several existing predicates that do some splat testing, mostly in target-specific code, but none that appears to do all the obvious tests. I'll look into putting a more general predicate in the ISD namespace and converting some existing use cases to it (I've been meaning to do something similar for the zeroable shuffle tests as well).
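
To make the intended semantics concrete, here is a rough standalone sketch of the "single scalar, no undefs" check. It is plain C++ rather than SelectionDAG code: the `Lane` alias and the name `isSplatWithoutUndefs` are hypothetical, `std::nullopt` stands in for an undef operand, and equal integers stand in for identical scalar SDValues. A real ISD predicate would compare build-vector operands instead.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for a BUILD_VECTOR operand list: std::nullopt models
// an undef lane, and equal ints model the same scalar value.
using Lane = std::optional<int>;

// Returns true only if every lane is defined and all lanes hold the same
// scalar, i.e. the vector is a splat with no undef elements.
bool isSplatWithoutUndefs(const std::vector<Lane> &Lanes) {
  if (Lanes.empty() || !Lanes.front().has_value())
    return false;
  for (const Lane &L : Lanes)
    if (!L.has_value() || *L != *Lanes.front())
      return false;
  return true;
}
```

Note that this does not care whether the scalar is a constant, which is the generalisation suggested above.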

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7372-7376
@@ +7371,7 @@
+    if (auto *SVNOp = dyn_cast<ShuffleVectorSDNode>(V.getNode())) {
+      ArrayRef<int> SVNMask = SVNOp->getMask();
+      for (int i = 0, Size = SVNMask.size(); i < Size; ++i)
+        if (SVNMask[i] < 0 || SVNMask[i] != SVNMask[0])
+          return false;
+      return true;
+    }
----------------
chandlerc wrote:
> If we fail to turn this kind of shuffle vector into a buildvector splat, we should fix that in the target independent dag combining, no?
> 
> The only time I've been unable to do this is when the pattern didn't emerge until *during* lowering. Do you have test cases showing that?
I'll look at adding a DAG combiner test that promotes partial splats with undefs to a full splat.
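
Roughly, the mask rewrite I have in mind looks like the sketch below. This is standalone C++ over a plain mask vector, not actual DAG combiner code: the helper name `widenPartialSplatMask` is hypothetical, and `-1` models an undef mask element as in shuffle masks.

```cpp
#include <vector>

// Hypothetical helper: -1 models an undef mask element. If every defined
// element selects the same source lane, rewrite the whole mask to a full
// splat of that lane and return true; otherwise leave the mask untouched.
bool widenPartialSplatMask(std::vector<int> &Mask) {
  int SplatIdx = -1;
  for (int M : Mask) {
    if (M < 0)
      continue; // Undef lanes place no constraint on the splat index.
    if (SplatIdx < 0)
      SplatIdx = M;
    else if (M != SplatIdx)
      return false; // Two different source lanes: not a partial splat.
  }
  if (SplatIdx < 0)
    return false; // All lanes undef: nothing to splat.
  for (int &M : Mask)
    M = SplatIdx;
  return true;
}
```

With the undef lanes filled in this way, the strict check in the patch (no negative mask elements, all equal to element 0) would accept the shuffle.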

No real-world test cases that I can recall. The only cases I can think of are ones where we are blending with zero: lowering often raises 'zeroable' lanes to definite zeros, but AFAIK we don't track that in x86 shuffle lowering. Even this could technically be raised to the target-independent DAG combiner.

http://reviews.llvm.org/D6652