[llvm-commits] Patch: Widening in LegalizeType

Mon Ping Wang wangmp at apple.com
Fri Nov 21 01:30:09 PST 2008


Hi Duncan,

On Nov 20, 2008, at 1:06 AM, Duncan Sands wrote:

> Hi Mon Ping,
>
>> @@ -233,6 +233,11 @@
>>                        JoinIntegers(Lo, Hi));
>>     return DAG.getNode(ISD::BIT_CONVERT, OutVT, InOp);
>>   }
>> +  case WidenVector:
>> +    if (OutVT.bitsEq(TLI.getWidenVectorType(InVT)))
>
> You should be able to use NInVT rather than TLI.getWidenVectorType.
> That is because TLI.getTypeToTransformTo should return the same thing
> as TLI.getWidenVectorType (I think TLI.getWidenVectorType should be
> eliminated; I intend to say more about this in a later email).

Given that for any given type, it should be clear if that type is  
legal or should be promoted, expanded, or widened, it make sense to  
have only one function for this.  The only difference in  
getWidenVectorType is that the function is virtual to allow a client  
to decide that for certain vectors, though a legal type exist, that it  
should not widen to that type because it might be cheaper to scalarize  
it.  There is no reason that getTypeToTransform can't do the same  
thing if certain target wants to override the default.  Today, without  
any firm data, the difference between X86 and the default is so  
slight, I'm inclined to have one function and add the virtual one when  
we see a need.

>
>
>> +    case ISD::SCALAR_TO_VECTOR: Res = ExpandIntOp_SCALAR_TO_VECTOR 
>> (N); break;
> ...
>> +SDValue DAGTypeLegalizer::ExpandIntOp_SCALAR_TO_VECTOR(SDNode *N) {
>> +  // Create a vector sized/aligned stack slot, store the value to  
>> element #0,
>> +  // then load the whole vector back out.
>> +  return CreateStackStoreLoad(N->getOperand(0), N->getValueType(0));
>> +}
>
> It seems a pity to just give up and use a stack store load.  Maybe  
> there
> are some tricks...  How about using BUILD_VECTOR on the expanded  
> pieces?
>

You are right.  We should use BUILD_VECTOR like we do in widening.

>> +  case ISD::AND:
>> +  case ISD::OR:
>> +  case ISD::XOR:
>> +  case ISD::ADD:
>> +  case ISD::SUB:
>> +  case ISD::FPOW:
>> +  case ISD::FPOWI:
>> +  case ISD::MUL:
>> +  case ISD::MULHS:
>> +  case ISD::MULHU:
>> +  case ISD::FADD:
>> +  case ISD::FSUB:
>> +  case ISD::FMUL:
>> +  case ISD::SDIV:
>> +  case ISD::SREM:
>> +  case ISD::FDIV:
>> +  case ISD::FREM:
>> +  case ISD::FCOPYSIGN:
>> +  case ISD::UDIV:
>> +  case ISD::UREM:
>> +  case ISD::BSWAP:             Res = WidenVecRes_Binary(N); break;
>
> Can you please put these in alphabetical order.
>

Sure.

>> +SDValue DAGTypeLegalizer::WidenVecRes_Binary(SDNode *N) {
>> +  // Binary op widening
>
> Missing full stop at end of comment.
>
>> +  MVT WidenVT = TLI.getWidenVectorType(N->getValueType(0));
>> +  SDValue InOp1 = GetWidenedVector(N->getOperand(0));
>> +  SDValue InOp2 = GetWidenedVector(N->getOperand(1));
>> +  assert(InOp1.getValueType() == WidenVT && InOp2.getValueType()  
>> == WidenVT);
>> +  return DAG.getNode(N->getOpcode(), WidenVT, InOp1, InOp2);
>
> Please eliminate this assert.  If you want to do this kind of
> checking, best to do it systematically in SetWidenedVector.
> WidenVT can be obtained as the type of InOp1.
>

I'll remove them.

>> +SDValue DAGTypeLegalizer::WidenVecRes_Convert(SDNode *N) {
>
> This routine is identical to WidenVecRes_Unary (except for
> the asserts).  These conversions are just unary operations,
> so please use WidenVecRes_Unary.  See below for some more
> comments.
>
>>

It was originally similar to bit convert where it checks if the  
incoming vectors will be widened to the same number of as the  
resulting widen vector type and if so, use the widen results;  
otherwise do the convert and then widen.  I simplified it to see when  
it will fails but it hasn't yet but I'm sure its because I haven't  
written some vector converts between types of different sizes which  
needs to part of the basic tests for the checkin.  I'm planning to  
change how it operates to handle the more complicated case.

>> +SDValue DAGTypeLegalizer::WidenVecRes_Shift(SDNode *N) {
>> +  // Unary op widening
>> +  MVT WidenVT = TLI.getWidenVectorType(N->getValueType(0));
>> +  SDValue InOp = GetWidenedVector(N->getOperand(0));
>> +  assert(InOp.getValueType() == WidenVT);
>
> I think you should drop this assertion.  It can only fail if
> the original node was invalid (SelectionDAG.cpp should check
> that the two types are the same - if it doesn't then that
> should be fixed).  That also means that the creation of the
> new node will assert if the types are different.
> You can get WidenVT as the type of InOp (more efficient).
>

I agree.

>> +SDValue DAGTypeLegalizer::WidenVecRes_Unary(SDNode *N) {
>> +  // Unary op widening
>> +  MVT WidenVT = TLI.getWidenVectorType(N->getValueType(0));
>> +  SDValue InOp = GetWidenedVector(N->getOperand(0));
>> +  assert(InOp.getValueType() == WidenVT);
>
> I think you should allow for the case when the operand and
> result have different vector types.  Like FP_TO_UINT.  So
> you should have a more general check here. i.e. I think you
> should fold WidenVecRes_Convert into this routine.
>
> By the way, notice a logical consequence: for this routine
> to always work, you more or less have to require **that whether
> a vector is widened or not, and to what width it is widened
> depends only on the number of elements and not the element
> type**.  So I think it is important to decide whether you
> want to require this property or not.  If not, then the logic
> for this routine needs to be made more complicated.  If you
> do want to require this property (I think this would be wise)
> then probably it should be built in fundamentally, in places
> like TargetLowering.  I mean that the underlying routines
> should only be passed the number of vector elements, and not
> the element type - that way the above property is necessarily
> satisfied.
>

When there is a legal type to widen to, we should always widen to that  
type so we can do the associated operation on the wider legal type.   
Given that different types might widen different  (e.g., v4i32 and  
v16i8 are legal types on X86), we must be able to handle the case when  
the two different types might widen to different lengths.

I think that I want to keep two separate classes of routines.  The  
function whose operands and result type must match can have the simple  
logic like in unary and binary.  For routines like conversions, we  
will have more complicated logic where we check if the input and  
output are going to have the same number of elements and if not, we  
have to do the right thing.


>> +SDValue DAGTypeLegalizer::WidenVecRes_BIT_CONVERT(SDNode *N) {
>> +  SDValue InOp = N->getOperand(0);
>> +  MVT InVT = InOp.getValueType();
>> +  MVT VT = N->getValueType(0);
>> +  MVT WidenVT = TLI.getWidenVectorType(VT);
>
> Please use TLI.getTypeToTransformTo.  This keeps everything
> uniform with other legalization methods.
>

I'll will do so and also change LegalizeDAG to use this function as  
well so I can get rid of it and we can start thinking of removing  
widening from LegalizeDAG.

>> +  case WidenVector:
>> +    if (WidenVT.bitsEq(TLI.getWidenVectorType(InVT)))
>
> Likewise (I won't mention this again).
>
>> +  // Otherwise, do the convert and then widen.
>> +  SmallVector<SDValue, 16> Ops;
>> +  if (WidenVT.getSizeInBits() % VT.getSizeInBits()) {
>
> Isn't this test inverted?

  I looked in my build that I was testing and there is a "== 0"

>
>
>> +    unsigned NumConcat = WidenVT.getSizeInBits() / VT.getSizeInBits 
>> ();
>> +    SDValue UndefVal = DAG.getNode(ISD::UNDEF, VT);
>> +    Ops.push_back(SDValue(N,0));
>
> ^ Isn't this going to lead to an infinite type legalization loop?
> You can just reuse the node in a different context, you have to
> change it, you have to make progress towards legalization of N.
>

I looked into some test cases where I thought I would hit that case  
for BIT_CONVERT but I don't see it on X86, which make sense after I  
thought about it. When we widen to a legal type, what happens is that  
both sides are widen to the same vector register so the end result is  
that both sides have the same size.  In cases were we extend a vector  
to a power of 2, both vectors are extended to the next power of 2 and  
then also end up to be the same size so we never hit this case.

>> +    SDValue CvtOp = SDValue(N,0);
>
> Likewise.  I guess I must be confused since you surely tested this,
> but I don't see how this can possibly work.  Anyway, if I'm not
> confused then I think you should do the following:
>
> case (a).  The input is a scalar (i.e. not a vector).  Try to use
> scalar_to_vector to build a vector with the same bitwidth as WidenVT
> out of InOp.  Return a bitconvert of this vector to the WidenVT.
> This may not always be possible.  If not, you may have to go to the
> stack, see case (b) below.
>
> case (b).  The input is a vector.  Here is where an "extend
> by undef" operation for vectors would be extremely useful
> (analogue of scalar_to_vector) :)  Lacking that, I suggest
> you create a stack temporary of the size of WidenVT, store
> InOp to it, and load it out as WidenVT.  That said, there
> are probably a few cases you can handle more optimally.

Today, we extend by undef through either concatenate undefs to the  
input or do extract and build vector.  I would do that way for now to  
avoid doing the stack  temporary.  My concern with the stack temporary  
is that we would generate the store and load, eventually split the  
vector down to scalar operations, and never clean up the store and  
load (unless some phase after legalization will do store load  
elimination).

>> +  SmallVector<SDValue, 16> NewOps(N->op_begin(), N->op_end());
>
> Since you know the final size, you can reserve space in the
> vector, to avoid reallocations.

>
>
>> +SDValue DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS(SDNode *N) {
>> +  // We currently support concatenating a multiple of the incoming  
>> vector. We
>> +  // could easily support widening an arbitrary amount using  
>> extract/build but
>> +  // I want to see a use case before we do so.
>
> I'm not sure I understand this comment.  I guess what you are trying
> to say is that v6i32 = CONCAT(v3i32, v3i32), widens to v8i32, is not
> supported yet?

Yes, I'm stating that we don't support those cases where the result is  
not a multiple of the input.

>
>
>> +  SmallVector<SDValue, 16> Ops;
>
> You can reserve space.
>
>> +  Ops.push_back(SDValue(N,0));
>
> Shouldn't you be pushing back the operands of N?!

>> +  for (unsigned i = 1; i != NumConcat; ++i) {
>> +    Ops.push_back(UndefVal);
>> +  }
>
> Shouldn't you be pushing back NumConcat - N->getNumOperands() - 1
> copies of undef?  Also, the braces {} are redundant.
>

I'll fix the above problems.  I'm in the process of writing a more  
white box tests to capture some of the corner cases that I  don't see  
when running the test.


>> +SDValue DAGTypeLegalizer::WidenVecRes_CONVERT_RNDSAT(SDNode *N) {

> I didn't review this function.
>
> At this point I got tired and went to bed!  I will take a look at
> the rest some other time.
>

The patch is very long and I appreciate the time and effort for the  
review.

Thanks,
   -- Mon Ping


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20081121/cb1674b6/attachment.html>


More information about the llvm-commits mailing list