[llvm-commits] Fix PR 13837: Optimization for vector creation from type conversion of another vector

Fri Sep 14 01:18:31 PDT 2012

Hi Weiming,

> When a vector is created from type conversion, it generates multiple scalar
> conversion instead of a vectorized conversion.
...
> Current LLVM first extracts each element of “in” and then does a fp-to-si for
> each element and then builds the result vector.
>
> This patch allows a vectorized fp-to-si conversion against the input vector
> directly, thus results in better performance.

> --- a/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> +++ b/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> @@ -7933,6 +7933,50 @@ SDValue DAGCombiner::visitBUILD_VECTOR(SDNode *N) {
>      AllAnyExt &= AnyExt;
>    }
>
> +  // Check if we can combine cast and build
> +  SmallVector<SDValue, 8> CastOps;

Since this vector isn't used until later, it can be moved later.

> +
> +  unsigned ElmOp;
> +  EVT CastFromType;
> +  bool HasNonConvOp = false;
> +  for (unsigned i = 0; i != NumInScalars; ++i) {
> +    SDValue In = N->getOperand(i);
> +    ElmOp = In.getOpcode();
> +    if (ElmOp == ISD::UNDEF)
> +      continue;
> +    else if (ElmOp == ISD::SINT_TO_FP || ElmOp == ISD::UINT_TO_FP ||

No need for "else" here (style).

> +      ElmOp == ISD::FP_TO_SINT || ElmOp == ISD::FP_TO_UINT) {

Why restrict yourself to just this set of casts?  Isn't this optimization worth
considering for any cast?

> +      CastFromType = In.getOperand(0).getValueType();
> +      break;
> +    }
> +    else {

No need for "else" here (style).

> +      // Op_i is neither UNDEF nor type conversion
> +      HasNonConvOp = true;
> +      break;
> +    }
> +  }

I think in the above loop you should already check that all casts are of the
same type.  You can do this by initializing ElmOp to DELETED_NODE (i.e. zero).
Do not set ElmOp if you have an undef node.  If ElmOp is non-zero and is
different to the current node's opcode then reset ElmOp to zero and break.
If ElmOp is non-zero and is equal to the current node's opcode then continue.
Otherwise, check that you have a valid cast opcode, and if so set ElmOp to it
otherwise break.

You also need to check that all cast operands have the same type.  After all,
one might be a cast of i8 to double and another a cast of i32 to double.  So
I suggest that in parallel with ElmOp you also keep track of CastFromType in
the same way, breaking out if there is a mismatch.

> +
> +  if (!HasNonConvOp) {

If you follow the above scheme then you can get rid of HasNonConvOp, and just
check that ElmOp != DELETED_NODE.

> +    // Since AllOpererndsUnfer has been checked, here ElmOp must be one of the

AllOpererndsUnfer doesn't exist.

> +    // type conversions.

This is not true in the case when every operand is undef (in which case I expect
you to crash below).  Using the scheme I described above would take care of
this.

  Now, check if the all operands are either of the same
> +    // type conversion or UNDEF.

The CastOps vector can be declared here.  You can reserve NumInScalars space
since you know already that that's the number of elements you will push in.

> +    for (unsigned i = 0; i != NumInScalars; ++i) {
> +        if (N->getOperand(i).getOpcode() == ElmOp)

Wrong indentation (style).

> +          CastOps.push_back(N->getOperand(i).getOperand(0));
> +        else if (N->getOperand(i).getOpcode() == ISD::UNDEF)

With the scheme I described above, if the opcode is not ElmOp then it is
UNDEF, so you here you can just do
            else {
              assert(N->getOperand(i).getOpcode() == ISD::UNDEF);
              CastOps.push_back(DAG.getUNDEF(CastFromType));
            }

> +          CastOps.push_back(DAG.getUNDEF(CastFromType));
> +      else
> +        break;

And this last else would be pointless.

> +    }
> +  }
> +
> +  if (CastOps.size() == NumInScalars) {

This can be moved inside the previous block.  If you follow the scheme I suggest
then in fact this test will never fail if you are in the previous if block so
can be removed.

> +    EVT VecVT = EVT::getVectorVT(*DAG.getContext(), CastFromType, NumInScalars);
> +    SDValue Vec = DAG.getNode(ISD::BUILD_VECTOR, dl,
> +        VecVT, &CastOps[0], CastOps.size());
> +    return DAG.getNode(ElmOp, dl, VT, Vec);

This may create an illegal operation.  So you should condition this on either
creating illegal operations being allowed (!LegalOperations) or on the operation
being legal (check using TLI.isOperationLegal).

> +  }
> +
>    // In order to have valid types, all of the inputs must be extended from the
>    // same source type and all of the inputs must be any or zero extend.
>    // Scalar sizes must be a power of two.

As you seem to have missed several failure modes (all operands undef, all
operands doing the same cast operation but with source types of different
size) please add some additional testcases covering these possibilities.

Ciao, Duncan.