[llvm-commits] Fix PR 13837: Optimization for vector creation from type conversion of another vector

Thu Sep 20 10:47:35 PDT 2012

Hi Duncan,

I added two cases: FNEG and TRUNCATE (narrowing) and also added
corresponding test cases. (see attached patch)
For widening operations (zext, sext) like i32->i64 and float -> float64,
since there's no corresponding vector instr. on ARM, so I didn't add them.

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
The Linux Foundation

-----Original Message-----
From: Duncan Sands [mailto:duncan.sands at gmail.com] On Behalf Of Duncan Sands
Sent: Thursday, September 20, 2012 12:21 AM
To: Weiming Zhao
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] Fix PR 13837: Optimization for vector creation
from type conversion of another vector

Hi Weiming,

> Sorry I missed your message.  I was monitoring only the messages that 
> sent to me :(

oops, sorry about that.

> Very appreciate for your review and valuable and detailed feedback.
>
> I followed all of your suggests except:
> Why restrict yourself to just this set of casts?
> Currently, LLVM supports 4 type conversions: S/U INT <-> FP.

I had in mind things like zext from i32 to i64 and other integer to integer
or floating point to floating point casts.  It seems to me that this kind of
transformation could be helpful for them too.

I will comment on your new patch later.

Ciao, Duncan.

  And they have
> corresponding vector instructions. Do you mean re-interpretation cast?
> I also looked at other conversions. For example, for 
> zero_exd/sign_exd, I tested from char3->int3, but ARM backend 
> generates "AND" operation and no need vector instructions at all.
>
> I fixed up everything else.
> Please help to review them.
>
> Thanks,
> Weiming
>
>
>
>
> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu 
> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
> Sent: Friday, September 14, 2012 1:19 AM
> To: llvm-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] Fix PR 13837: Optimization for vector 
> creation from type conversion of another vector
>
> Hi Weiming,
>
>> When a vector is created from type conversion, it generates multiple 
>> scalar conversion instead of a vectorized conversion.
> ...
>> Current LLVM first extracts each element of "in" and then does a 
>> fp-to-si for each element and then builds the result vector.
>>
>> This patch allows a vectorized fp-to-si conversion against the input 
>> vector directly, thus results in better performance.
>
>> --- a/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>> +++ b/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>> @@ -7933,6 +7933,50 @@ SDValue DAGCombiner::visitBUILD_VECTOR(SDNode *N)
{
>>       AllAnyExt &= AnyExt;
>>     }
>>
>> +  // Check if we can combine cast and build  SmallVector<SDValue, 8> 
>> + CastOps;
>
> Since this vector isn't used until later, it can be moved later.
>
>> +
>> +  unsigned ElmOp;
>> +  EVT CastFromType;
>> +  bool HasNonConvOp = false;
>> +  for (unsigned i = 0; i != NumInScalars; ++i) {
>> +    SDValue In = N->getOperand(i);
>> +    ElmOp = In.getOpcode();
>> +    if (ElmOp == ISD::UNDEF)
>> +      continue;
>> +    else if (ElmOp == ISD::SINT_TO_FP || ElmOp == ISD::UINT_TO_FP ||
>
> No need for "else" here (style).
>
>> +      ElmOp == ISD::FP_TO_SINT || ElmOp == ISD::FP_TO_UINT) {
>
> Why restrict yourself to just this set of casts?  Isn't this 
> optimization worth considering for any cast?
>
>> +      CastFromType = In.getOperand(0).getValueType();
>> +      break;
>> +    }
>> +    else {
>
> No need for "else" here (style).
>
>> +      // Op_i is neither UNDEF nor type conversion
>> +      HasNonConvOp = true;
>> +      break;
>> +    }
>> +  }
>
> I think in the above loop you should already check that all casts are 
> of the same type.  You can do this by initializing ElmOp to DELETED_NODE
(i.e.
> zero).
> Do not set ElmOp if you have an undef node.  If ElmOp is non-zero and 
> is different to the current node's opcode then reset ElmOp to zero and
break.
> If ElmOp is non-zero and is equal to the current node's opcode then 
> continue.
> Otherwise, check that you have a valid cast opcode, and if so set 
> ElmOp to it otherwise break.
>
> You also need to check that all cast operands have the same type.  
> After all, one might be a cast of i8 to double and another a cast of 
> i32 to double.  So I suggest that in parallel with ElmOp you also keep 
> track of CastFromType in the same way, breaking out if there is a
mismatch.
>
>> +
>> +  if (!HasNonConvOp) {
>
> If you follow the above scheme then you can get rid of HasNonConvOp, 
> and just check that ElmOp != DELETED_NODE.
>
>> +    // Since AllOpererndsUnfer has been checked, here ElmOp must be 
>> + one of the
>
> AllOpererndsUnfer doesn't exist.
>
>> +    // type conversions.
>
> This is not true in the case when every operand is undef (in which 
> case I expect you to crash below).  Using the scheme I described above 
> would take care of this.
>
>    Now, check if the all operands are either of the same
>> +    // type conversion or UNDEF.
>
> The CastOps vector can be declared here.  You can reserve NumInScalars 
> space since you know already that that's the number of elements you will
push in.
>
>> +    for (unsigned i = 0; i != NumInScalars; ++i) {
>> +        if (N->getOperand(i).getOpcode() == ElmOp)
>
> Wrong indentation (style).
>
>> +          CastOps.push_back(N->getOperand(i).getOperand(0));
>> +        else if (N->getOperand(i).getOpcode() == ISD::UNDEF)
>
> With the scheme I described above, if the opcode is not ElmOp then it 
> is UNDEF, so you here you can just do
>              else {
>                assert(N->getOperand(i).getOpcode() == ISD::UNDEF);
>                CastOps.push_back(DAG.getUNDEF(CastFromType));
>              }
>
>> +          CastOps.push_back(DAG.getUNDEF(CastFromType));
>> +      else
>> +        break;
>
> And this last else would be pointless.
>
>> +    }
>> +  }
>> +
>> +  if (CastOps.size() == NumInScalars) {
>
> This can be moved inside the previous block.  If you follow the scheme 
> I suggest then in fact this test will never fail if you are in the 
> previous if block so can be removed.
>
>> +    EVT VecVT = EVT::getVectorVT(*DAG.getContext(), CastFromType,
> NumInScalars);
>> +    SDValue Vec = DAG.getNode(ISD::BUILD_VECTOR, dl,
>> +        VecVT, &CastOps[0], CastOps.size());
>> +    return DAG.getNode(ElmOp, dl, VT, Vec);
>
> This may create an illegal operation.  So you should condition this on 
> either creating illegal operations being allowed (!LegalOperations) or 
> on the operation being legal (check using TLI.isOperationLegal).
>
>> +  }
>> +
>>     // In order to have valid types, all of the inputs must be 
>> extended
> from the
>>     // same source type and all of the inputs must be any or zero extend.
>>     // Scalar sizes must be a power of two.
>
> As you seem to have missed several failure modes (all operands undef, 
> all operands doing the same cast operation but with source types of 
> different
> size) please add some additional testcases covering these possibilities.
>
> Ciao, Duncan.
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-PR-13837-Combine-Cast-and-Build_Vector-when-possible.patch
Type: application/octet-stream
Size: 12944 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120920/f3d884e2/attachment.obj>