[llvm-commits] Fix PR 13837: Optimization for vector creation from type conversion of another vector

Wed Sep 19 14:26:27 PDT 2012

Hi Duncan,

Sorry I missed your message.  I was monitoring only the messages that sent
to me :(

Very appreciate for your review and valuable and detailed feedback.

I followed all of your suggests except:
Why restrict yourself to just this set of casts?  
Currently, LLVM supports 4 type conversions: S/U INT <-> FP. And they have
corresponding vector instructions. Do you mean re-interpretation cast?
I also looked at other conversions. For example, for zero_exd/sign_exd, I
tested from char3->int3, but ARM backend generates "AND" operation and no
need vector instructions at all.

I fixed up everything else.
Please help to review them. 

Thanks,
Weiming

-----Original Message-----
From: llvm-commits-bounces at cs.uiuc.edu
[mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
Sent: Friday, September 14, 2012 1:19 AM
To: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] Fix PR 13837: Optimization for vector creation
from type conversion of another vector

Hi Weiming,

> When a vector is created from type conversion, it generates multiple 
> scalar conversion instead of a vectorized conversion.
...
> Current LLVM first extracts each element of "in" and then does a 
> fp-to-si for each element and then builds the result vector.
>
> This patch allows a vectorized fp-to-si conversion against the input 
> vector directly, thus results in better performance.

> --- a/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> +++ b/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> @@ -7933,6 +7933,50 @@ SDValue DAGCombiner::visitBUILD_VECTOR(SDNode *N) {
>      AllAnyExt &= AnyExt;
>    }
>
> +  // Check if we can combine cast and build  SmallVector<SDValue, 8> 
> + CastOps;

Since this vector isn't used until later, it can be moved later.

> +
> +  unsigned ElmOp;
> +  EVT CastFromType;
> +  bool HasNonConvOp = false;
> +  for (unsigned i = 0; i != NumInScalars; ++i) {
> +    SDValue In = N->getOperand(i);
> +    ElmOp = In.getOpcode();
> +    if (ElmOp == ISD::UNDEF)
> +      continue;
> +    else if (ElmOp == ISD::SINT_TO_FP || ElmOp == ISD::UINT_TO_FP ||

No need for "else" here (style).

> +      ElmOp == ISD::FP_TO_SINT || ElmOp == ISD::FP_TO_UINT) {

Why restrict yourself to just this set of casts?  Isn't this optimization
worth considering for any cast?

> +      CastFromType = In.getOperand(0).getValueType();
> +      break;
> +    }
> +    else {

No need for "else" here (style).

> +      // Op_i is neither UNDEF nor type conversion
> +      HasNonConvOp = true;
> +      break;
> +    }
> +  }

I think in the above loop you should already check that all casts are of the
same type.  You can do this by initializing ElmOp to DELETED_NODE (i.e.
zero).
Do not set ElmOp if you have an undef node.  If ElmOp is non-zero and is
different to the current node's opcode then reset ElmOp to zero and break.
If ElmOp is non-zero and is equal to the current node's opcode then
continue.
Otherwise, check that you have a valid cast opcode, and if so set ElmOp to
it otherwise break.

You also need to check that all cast operands have the same type.  After
all, one might be a cast of i8 to double and another a cast of i32 to
double.  So I suggest that in parallel with ElmOp you also keep track of
CastFromType in the same way, breaking out if there is a mismatch.

> +
> +  if (!HasNonConvOp) {

If you follow the above scheme then you can get rid of HasNonConvOp, and
just check that ElmOp != DELETED_NODE.

> +    // Since AllOpererndsUnfer has been checked, here ElmOp must be 
> + one of the

AllOpererndsUnfer doesn't exist.

> +    // type conversions.

This is not true in the case when every operand is undef (in which case I
expect you to crash below).  Using the scheme I described above would take
care of this.

  Now, check if the all operands are either of the same
> +    // type conversion or UNDEF.

The CastOps vector can be declared here.  You can reserve NumInScalars space
since you know already that that's the number of elements you will push in.

> +    for (unsigned i = 0; i != NumInScalars; ++i) {
> +        if (N->getOperand(i).getOpcode() == ElmOp)

Wrong indentation (style).

> +          CastOps.push_back(N->getOperand(i).getOperand(0));
> +        else if (N->getOperand(i).getOpcode() == ISD::UNDEF)

With the scheme I described above, if the opcode is not ElmOp then it is
UNDEF, so you here you can just do
            else {
              assert(N->getOperand(i).getOpcode() == ISD::UNDEF);
              CastOps.push_back(DAG.getUNDEF(CastFromType));
            }

> +          CastOps.push_back(DAG.getUNDEF(CastFromType));
> +      else
> +        break;

And this last else would be pointless.

> +    }
> +  }
> +
> +  if (CastOps.size() == NumInScalars) {

This can be moved inside the previous block.  If you follow the scheme I
suggest then in fact this test will never fail if you are in the previous if
block so can be removed.

> +    EVT VecVT = EVT::getVectorVT(*DAG.getContext(), CastFromType,
NumInScalars);
> +    SDValue Vec = DAG.getNode(ISD::BUILD_VECTOR, dl,
> +        VecVT, &CastOps[0], CastOps.size());
> +    return DAG.getNode(ElmOp, dl, VT, Vec);

This may create an illegal operation.  So you should condition this on
either creating illegal operations being allowed (!LegalOperations) or on
the operation being legal (check using TLI.isOperationLegal).

> +  }
> +
>    // In order to have valid types, all of the inputs must be extended
from the
>    // same source type and all of the inputs must be any or zero extend.
>    // Scalar sizes must be a power of two.

As you seem to have missed several failure modes (all operands undef, all
operands doing the same cast operation but with source types of different
size) please add some additional testcases covering these possibilities.

Ciao, Duncan.
_______________________________________________
llvm-commits mailing list
llvm-commits at cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-PR-13837-Combine-Cast-and-Build_Vector-when-possible.patch
Type: application/octet-stream
Size: 11098 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120919/fae90a3d/attachment.obj>