[llvm-commits] [PATCH] extend fix to PR12312 to support 256-bit vector

Michael Liao michael.liao at intel.com
Mon Sep 10 09:44:44 PDT 2012


On Mon, 2012-09-10 at 12:39 +0200, Duncan Sands wrote:
> Hi Michael,
> 
> > Fix to PR12312 only supports 128-bit vector. This patch extends the
> > support to 256-bit vector as well. The patch includes 2 major changes:
> >
> > + one change to, when bitcasting 256-bit vector (or wider) to 256-bit
> > integer, allow type legalization to extract elements instead of storing
> > whole vector followed by loading each element.
> >    Suppose a N-bit vector into a N-bit integer. Type legalization will
> > try cast that N-bit vector into <2 x (N/2)> vector first and extract the
> > lo/hi part. When N = 128, <2 x (N/2>) is legal. But, when N =256, <2 x
> > i128> is illegal (on x86) and type legalization falls back to store/load
> > approach.
> >    When <2 x (N/2)> is found illegal, this patch tries to check <M x
> > ExtVT>, where ExtVT is the type which N-bit integer is extended onto,
> > e.g. on x86, 256-bit integer will be extended to 64-bit integers and we
> > will try <4 x i64>. If <M x ExtVT> is legal, we will extract elements
> > from it and build N/2-bit lo/hi parts.
> 
> ...
> 
> > +
> > +    // If <NOutVT x 2> is not a legal type, try <ExtOutVT x (2*f)>, where
> > +    // f = bits(NOutVT)/bits(ExtOutVT).
> 
> How about unifying the following code and the code just above (which does
> the f == 1 case)?  For example, by looping over the two values for NVT,
> and if one is legal then executing the code you added, which can handle
> both cases (maybe with a little tweaking), right?  In fact it can be made
> even more general:
> 
> Observe that it isn't very relevant that OutVT is illegal with operand Expand.
> What matters is that the vector type NVT built from it (<2 x i128>, <4 x i64>
> etc) is legal.  So rather than applying getTypeToExpandTo/getTypeToTransform to
> to OutVT, how about ignoring OutVT and starting from the original vector value
> type (<1 x i256>) and systematically double the number of vector elements while
> halving the element size, looking for a legal vector type: <2 x i128>,
> <4 x i64>, <8 x i32>, <16 x i16>, <32 x i8>.  When (if) you find that one of
> these is legal, then apply the transform to it.

Good suggestion, I will submit this change separately. It would be nice
to check whether EXTRACT_VECTOR_ELT on that type is legal as well.
Otherwise, the target may still fall back to store/load approach
eventually.

Yous
- Michael

> 
> > +    EVT ExtOutVT = TLI.getTypeToExpandTo(*DAG.getContext(), OutVT);
> > +    unsigned NumElems = NOutVT.getSizeInBits() / ExtOutVT.getSizeInBits();
> > +
> > +    assert(isPowerOf2_32(NumElems) &&
> > +           "Type expanded to is larger than type transformed to.");
> > +
> > +    NumElems <<= 1;
> > +    NVT = EVT::getVectorVT(*DAG.getContext(), ExtOutVT, NumElems);
> > +
> > +    if (isTypeLegal(NVT)) {
> > +      SDValue CastInOp = DAG.getNode(ISD::BITCAST, dl, NVT, InOp);
> > +
> > +      SmallVector<SDValue, 64> Vals;
> > +      for (unsigned i = 0; i < NumElems; ++i)
> > +        Vals.push_back(DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, ExtOutVT,
> > +                                   CastInOp, DAG.getIntPtrConstant(i)));
> > +
> > +      // Build Lo, Hi pair by pairing extracted elements.
> > +      unsigned Slot = 0;
> > +      for (unsigned e = Vals.size(); e - Slot > 2; Slot += 2, e += 1) {
> > +        // Each iteration will BUILD_PAIR two nodes and append the result until
> > +        // there are only two nodes left, i.e. Lo and Hi.
> > +        SDValue LHS = Vals[Slot];
> > +        SDValue RHS = Vals[Slot + 1];
> > +        Vals.push_back(DAG.getNode(ISD::BUILD_PAIR, dl,
> > +                                   EVT::getIntegerVT(
> > +                                     *DAG.getContext(),
> > +                                     LHS.getValueType().getSizeInBits() << 1),
> > +                                   LHS, RHS));
> > +      }
> > +      Lo = Vals[Slot++];
> > +      Hi = Vals[Slot++];
> > +
> > +      if (TLI.isBigEndian())
> > +        std::swap(Lo, Hi);
> 
> On a big endian machine don't you need to reverse everything you built, not just
> the last two?
> 
> Ciao, Duncan.
> 
> > +
> > +      return;
> > +    }
> >    }
> 
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits





More information about the llvm-commits mailing list