[llvm] r208271 - Lower certain build_vectors to insertps instructions

Nadav Rotem nrotem at apple.com
Sun May 11 04:44:19 PDT 2014


Excellent. Thank you Filipe. 


> On May 11, 2014, at 2:47, Filipe Cabecinhas <me at filcab.net> wrote:
> 
> Hi Nadav,
> 
> A missing check triggered a bug when we had registers bigger than 128 bits. It's fixed, and the bug is marked as fixed.
> 
> Thanks,
> 
>   Filipe
> 
>> On Saturday, May 10, 2014, Nadav Rotem <nrotem at apple.com> wrote:
>> Filipe,
>> 
>> It looks like this commit caused:  "Bug 19694 - Mesa llvmpipe lp_test_conv regression."
>> 
>> Do you mind taking a look?
>> 
>> Thanks,
>> Nadav
>> 
>> On May 7, 2014, at 5:25 PM, Filipe Cabecinhas <me at filcab.net> wrote:
>> 
>> > Author: filcab
>> > Date: Wed May  7 19:25:16 2014
>> > New Revision: 208271
>> >
>> > URL: http://llvm.org/viewvc/llvm-project?rev=208271&view=rev
>> > Log:
>> > Lower certain build_vectors to insertps instructions
>> >
>> > Summary:
>> > Vectors built with zeros and elements in the same order as another
>> > (source) vector are optimized to be built using a single insertps
>> > instruction.
>> > Also optimize when we move one element in a vector to a different place
>> > in that vector while zeroing out some of the other elements.
>> >
>> > Further optimizations are possible, described in TODO comments.
>> > I will be implementing at least some of them in the near future.
>> >
>> > Added some tests for different cases where this optimization triggers.
>> >
>> > Reviewers: nadav, delena, craig.topper
>> >
>> > Subscribers: llvm-commits
>> >
>> > Differential Revision: http://reviews.llvm.org/D3521
>> >
>> > Modified:
>> >    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>> >    llvm/trunk/test/CodeGen/X86/sse41.ll
>> >
>> > Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
>> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=208271&r1=208270&r2=208271&view=diff
>> > ==============================================================================
>> > --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
>> > +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed May  7 19:25:16 2014
>> > @@ -5437,6 +5437,74 @@ static SDValue LowerBuildVectorv8i16(SDV
>> >   return V;
>> > }
>> >
>> > +/// LowerBuildVectorv4x32 - Custom lower build_vector of v4i32 or v4f32.
>> > +static SDValue LowerBuildVectorv4x32(SDValue Op, unsigned NumElems,
>> > +                                     unsigned NonZeros, unsigned NumNonZero,
>> > +                                     unsigned NumZero, SelectionDAG &DAG,
>> > +                                     const X86Subtarget *Subtarget,
>> > +                                     const TargetLowering &TLI) {
>> > +  // We know there's at least one non-zero element
>> > +  unsigned FirstNonZeroIdx = 0;
>> > +  SDValue FirstNonZero = Op->getOperand(FirstNonZeroIdx);
>> > +  while (FirstNonZero.getOpcode() == ISD::UNDEF ||
>> > +         X86::isZeroNode(FirstNonZero)) {
>> > +    ++FirstNonZeroIdx;
>> > +    FirstNonZero = Op->getOperand(FirstNonZeroIdx);
>> > +  }
>> > +
>> > +  if (FirstNonZero.getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
>> > +      !isa<ConstantSDNode>(FirstNonZero.getOperand(1)))
>> > +    return SDValue();
>> > +
>> > +  SDValue V = FirstNonZero.getOperand(0);
>> > +  unsigned FirstNonZeroDst = cast<ConstantSDNode>(FirstNonZero.getOperand(1))->getZExtValue();
>> > +  unsigned CorrectIdx = FirstNonZeroDst == FirstNonZeroIdx;
>> > +  unsigned IncorrectIdx = CorrectIdx ? -1U : FirstNonZeroIdx;
>> > +  unsigned IncorrectDst = CorrectIdx ? -1U : FirstNonZeroDst;
>> > +
>> > +  for (unsigned Idx = FirstNonZeroIdx + 1; Idx < NumElems; ++Idx) {
>> > +    SDValue Elem = Op.getOperand(Idx);
>> > +    if (Elem.getOpcode() == ISD::UNDEF || X86::isZeroNode(Elem))
>> > +      continue;
>> > +
>> > +    // TODO: What else can be here? Deal with it.
>> > +    if (Elem.getOpcode() != ISD::EXTRACT_VECTOR_ELT)
>> > +      return SDValue();
>> > +
>> > +    // TODO: Some optimizations are still possible here
>> > +    // ex: Getting one element from a vector, and the rest from
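
The two cases in the commit summary can be illustrated with a scalar model of the SSE4.1 insertps instruction (register form). This is only a sketch, not code from the patch: Vec4 and insertps_model are hypothetical names chosen for illustration. The immediate layout assumed here (bits 7:6 select the source lane, bits 5:4 select the destination lane, bits 3:0 are the zero mask) follows the instruction's documented encoding.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4 x float vector type, used only for this illustration.
using Vec4 = std::array<float, 4>;

// Scalar model of INSERTPS dst, src, imm8 (register form).
// This is not LLVM code; it only mirrors what the instruction computes.
Vec4 insertps_model(Vec4 dst, const Vec4 &src, std::uint8_t imm) {
  unsigned srcIdx = (imm >> 6) & 3; // bits 7:6: lane read from src
  unsigned dstIdx = (imm >> 4) & 3; // bits 5:4: lane written in dst
  unsigned zmask = imm & 0xF;       // bits 3:0: lanes forced to zero

  // Copy the selected source lane into the selected destination lane.
  dst[dstIdx] = src[srcIdx];

  // Apply the zero mask after the insertion.
  for (unsigned i = 0; i < 4; ++i)
    if (zmask & (1u << i))
      dst[i] = 0.0f;
  return dst;
}
```

With v = {1, 2, 3, 4} passed as both operands, an immediate of 0x0A keeps lanes 0 and 2 in place and zeroes lanes 1 and 3, giving {1, 0, 3, 0}, which corresponds to the first case in the summary. An immediate of 0x6A copies lane 1 into lane 2 and zeroes lanes 1 and 3, giving {1, 0, 2, 0}, which corresponds to the second case.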