[llvm] r208271 - Lower certain build_vectors to insertps instructions

Filipe Cabecinhas me at filcab.net
Sun May 11 02:47:51 PDT 2014


Hi Nadav,

A missing check triggered a bug when we had registers bigger than
128 bits. It's now fixed, and the bug report is marked as fixed.

Thanks,

  Filipe
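
For readers following the quoted commit summary below, here is a rough
sketch of how an insertps immediate encodes the cases it describes. It
is a scalar model based on the documented imm8 layout (bits 7:6 select
the source element, bits 5:4 the destination lane, bits 3:0 the zero
mask). The names makeInsertPSImm, emulateInsertPS and Vec4 are
hypothetical and are not part of the commit or of LLVM.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4 x float vector type used only for this illustration.
using Vec4 = std::array<float, 4>;

// Hypothetical helper: pack the three insertps immediate fields.
//   bits 7:6 = source element, bits 5:4 = destination lane,
//   bits 3:0 = zero mask (bit i set => lane i is zeroed).
constexpr std::uint8_t makeInsertPSImm(unsigned SrcIdx, unsigned DstIdx,
                                       unsigned ZeroMask) {
  return static_cast<std::uint8_t>(((SrcIdx & 3u) << 6) |
                                   ((DstIdx & 3u) << 4) |
                                   (ZeroMask & 0xFu));
}

// Scalar model of insertps with two register operands: copy one element
// of Src into one lane of Dst, then apply the zero mask to the result.
inline Vec4 emulateInsertPS(Vec4 Dst, const Vec4 &Src, std::uint8_t Imm) {
  Dst[(Imm >> 4) & 3u] = Src[(Imm >> 6) & 3u];
  for (unsigned I = 0; I < 4; ++I)
    if (Imm & (1u << I))
      Dst[I] = 0.0f;
  return Dst;
}
```

Under these assumptions, building <V[0], V[3], 0, 0> from a single
vector V (one element moved, two lanes zeroed) would be
emulateInsertPS(V, V, makeInsertPSImm(3, 1, 0xC)), where the immediate
is 0xDC.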

On Saturday, May 10, 2014, Nadav Rotem <nrotem at apple.com> wrote:

> Filipe,
>
> It looks like this commit caused:  "Bug 19694 - Mesa llvmpipe lp_test_conv
> regression."
>
> Do you mind taking a look?
>
> Thanks,
> Nadav
>
> On May 7, 2014, at 5:25 PM, Filipe Cabecinhas <me at filcab.net> wrote:
>
> > Author: filcab
> > Date: Wed May  7 19:25:16 2014
> > New Revision: 208271
> >
> > URL: http://llvm.org/viewvc/llvm-project?rev=208271&view=rev
> > Log:
> > Lower certain build_vectors to insertps instructions
> >
> > Summary:
> > Vectors built with zeros and elements in the same order as another
> > (source) vector are optimized to be built using a single insertps
> > instruction.
> > Also optimize when we move one element in a vector to a different place
> > in that vector while zeroing out some of the other elements.
> >
> > Further optimizations are possible, described in TODO comments.
> > I will be implementing at least some of them in the near future.
> >
> > Added some tests for different cases where this optimization triggers.
> >
> > Reviewers: nadav, delena, craig.topper
> >
> > Subscribers: llvm-commits
> >
> > Differential Revision: http://reviews.llvm.org/D3521
> >
> > Modified:
> >    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> >    llvm/trunk/test/CodeGen/X86/sse41.ll
> >
> > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=208271&r1=208270&r2=208271&view=diff
> > ==============================================================================
> > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
> > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed May  7 19:25:16 2014
> > @@ -5437,6 +5437,74 @@ static SDValue LowerBuildVectorv8i16(SDV
> >   return V;
> > }
> >
> > +/// LowerBuildVectorv4x32 - Custom lower build_vector of v4i32 or v4f32.
> > +static SDValue LowerBuildVectorv4x32(SDValue Op, unsigned NumElems,
> > +                                     unsigned NonZeros, unsigned NumNonZero,
> > +                                     unsigned NumZero, SelectionDAG &DAG,
> > +                                     const X86Subtarget *Subtarget,
> > +                                     const TargetLowering &TLI) {
> > +  // We know there's at least one non-zero element
> > +  unsigned FirstNonZeroIdx = 0;
> > +  SDValue FirstNonZero = Op->getOperand(FirstNonZeroIdx);
> > +  while (FirstNonZero.getOpcode() == ISD::UNDEF ||
> > +         X86::isZeroNode(FirstNonZero)) {
> > +    ++FirstNonZeroIdx;
> > +    FirstNonZero = Op->getOperand(FirstNonZeroIdx);
> > +  }
> > +
> > +  if (FirstNonZero.getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
> > +      !isa<ConstantSDNode>(FirstNonZero.getOperand(1)))
> > +    return SDValue();
> > +
> > +  SDValue V = FirstNonZero.getOperand(0);
> > +  unsigned FirstNonZeroDst = cast<ConstantSDNode>(FirstNonZero.getOperand(1))->getZExtValue();
> > +  unsigned CorrectIdx = FirstNonZeroDst == FirstNonZeroIdx;
> > +  unsigned IncorrectIdx = CorrectIdx ? -1U : FirstNonZeroIdx;
> > +  unsigned IncorrectDst = CorrectIdx ? -1U : FirstNonZeroDst;
> > +
> > +  for (unsigned Idx = FirstNonZeroIdx + 1; Idx < NumElems; ++Idx) {
> > +    SDValue Elem = Op.getOperand(Idx);
> > +    if (Elem.getOpcode() == ISD::UNDEF || X86::isZeroNode(Elem))
> > +      continue;
> > +
> > +    // TODO: What else can be here? Deal with it.
> > +    if (Elem.getOpcode() != ISD::EXTRACT_VECTOR_ELT)
> > +      return SDValue();
> > +
> > +    // TODO: Some optimizations are still possible here
> > +    // ex: Getting one element from a vector, and the rest from