[llvm] r348522 - [DagCombiner][X86] Simplify a ConcatVectors of a scalar_to_vector with undef.

Thu Dec 6 11:55:38 PST 2018

Author: adibiagio
Date: Thu Dec  6 11:55:38 2018
New Revision: 348522

URL: http://llvm.org/viewvc/llvm-project?rev=348522&view=rev
Log:
[DagCombiner][X86] Simplify a ConcatVectors of a scalar_to_vector with undef.

This patch introduces a new DAGCombiner rule to simplify concat_vectors nodes:

concat_vectors( bitcast (scalar_to_vector %A), UNDEF)
    --> bitcast (scalar_to_vector %A)

This patch only partially addresses PR39257. In particular, it is enough to fix
one of the two problematic cases mentioned in PR39257. However, it is not enough
to fix the original test case posted by Craig; that particular case would
probably require a more complicated approach (and knowledge about used bits).

Before this patch, we used to generate the following code for function PR39257
(-mtriple=x86_64 , -mattr=+avx):

vmovsd  (%rdi), %xmm0           # xmm0 = mem[0],zero
vxorps  %xmm1, %xmm1, %xmm1
vblendps        $3, %xmm0, %xmm1, %xmm0 # xmm0 = xmm0[0,1],xmm1[2,3]
vmovaps %ymm0, (%rsi)
vzeroupper
retq

Now we generate this:

vmovsd  (%rdi), %xmm0           # xmm0 = mem[0],zero
vmovaps %ymm0, (%rsi)
vzeroupper
retq

As a side note: that VZEROUPPER is completely redundant...

I guess the vzeroupper insertion pass doesn't realize that the definition of
%xmm0 from vmovsd is already zeroing the upper half of %ymm0. Note that on
%-mcpu=btver2, we don't get that vzeroupper because pass vzeroupper insertion
%pass is disabled.

Differential Revision: https://reviews.llvm.org/D55274

Modified:
    llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
    llvm/trunk/test/CodeGen/X86/combine-concatvectors.ll

Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=348522&r1=348521&r2=348522&view=diff
==============================================================================

--- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Thu Dec  6 11:55:38 2018
@@ -16518,11 +16518,19 @@ SDValue DAGCombiner::visitCONCAT_VECTORS
     SDValue In = N->getOperand(0);
     assert(In.getValueType().isVector() && "Must concat vectors");
 
-    // Transform: concat_vectors(scalar, undef) -> scalar_to_vector(sclr).
-    if (In->getOpcode() == ISD::BITCAST &&
-        !In->getOperand(0).getValueType().isVector()) {
-      SDValue Scalar = In->getOperand(0);
+    SDValue Scalar = peekThroughOneUseBitcasts(In);
 
+    // concat_vectors(scalar_to_vector(scalar), undef) ->
+    //     scalar_to_vector(scalar)
+    if (!LegalOperations && Scalar.getOpcode() == ISD::SCALAR_TO_VECTOR &&
+         Scalar.hasOneUse()) {
+      EVT SVT = Scalar.getValueType().getVectorElementType();
+      if (SVT == Scalar.getOperand(0).getValueType())
+        Scalar = Scalar.getOperand(0);
+    }
+
+    // concat_vectors(scalar, undef) -> scalar_to_vector(scalar)
+    if (!Scalar.getValueType().isVector()) {
       // If the bitcast type isn't legal, it might be a trunc of a legal type;
       // look through the trunc so we can still do the transform:
       //   concat_vectors(trunc(scalar), undef) -> scalar_to_vector(scalar)

Modified: llvm/trunk/test/CodeGen/X86/combine-concatvectors.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/combine-concatvectors.ll?rev=348522&r1=348521&r2=348522&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/combine-concatvectors.ll (original)
+++ llvm/trunk/test/CodeGen/X86/combine-concatvectors.ll Thu Dec  6 11:55:38 2018
@@ -5,8 +5,6 @@ define void @PR32957(<2 x float>* %in, <
 ; CHECK-LABEL: PR32957:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    vmovsd {{.*#+}} xmm0 = mem[0],zero
-; CHECK-NEXT:    vxorps %xmm1, %xmm1, %xmm1
-; CHECK-NEXT:    vblendps {{.*#+}} xmm0 = xmm0[0,1],xmm1[2,3]
 ; CHECK-NEXT:    vmovaps %ymm0, (%rsi)
 ; CHECK-NEXT:    vzeroupper
 ; CHECK-NEXT:    retq