[llvm-commits] [PATCH] Fix instruction selection for vduplane from v4f32 to v2f32

Thu Oct 11 03:08:58 PDT 2012

Ping.
Is this OK to commit?

Thanks,
Silviu

> -----Original Message-----
> From: Silviu Baranga [mailto:silbar01 at arm.com]
> Sent: 04 October 2012 12:08
> To: 'llvm-commits at cs.uiuc.edu'
> Cc: 'David Peixotto'
> Subject: RE: [llvm-commits] [PATCH] Fix instruction selection for
> vduplane from v4f32 to v2f32
> 
> Hi all,
> 
> It would be good to get another review for this. David
> started this thread so I don't know how OK it is to have
> him as the reviewer.
> 
> Thanks,
> Silviu
> 
> > -----Original Message-----
> > From: Silviu Baranga [mailto:silbar01 at arm.com]
> > Sent: 02 October 2012 16:29
> > To: 'David Peixotto'
> > Cc: llvm-commits at cs.uiuc.edu
> > Subject: RE: [llvm-commits] [PATCH] Fix instruction selection for
> > vduplane from v4f32 to v2f32
> >
> > Hi David,
> >
> > Thanks for the comments. The register coalescer should remove
> > the unnecessary copy instructions. We just need to make sure
> > that the index at which we are inserting is right for this
> > purpose.
> >
> > I've attached the updated patch.
> >
> > Cheers,
> > Silviu
> >
> > > -----Original Message-----
> > > From: David Peixotto [mailto:dpeixott at codeaurora.org]
> > > Sent: 01 October 2012 18:06
> > > To: Silviu Baranga; James Molloy
> > > Cc: llvm-commits at cs.uiuc.edu
> > > Subject: RE: [llvm-commits] [PATCH] Fix instruction selection for
> > > vduplane from v4f32 to v2f32
> > >
> > > Hi Silviu,
> > >
> > > Thanks for the patch!
> > >
> > > It looks pretty good to me. I have a question about how we ensure
> the
> > > copy gets eliminated (e.g. the INSERT_ELEMENT into the undef
> vector).
> > > Does this rely on a downstream dag combine optimization, or is it
> > done
> > > in instruction selection? Can you add a note to your comment saying
> > > where you expect this node will be eliminated?
> > >
> > > I have a few minor inline comments below. Other than that it looks
> > good
> > > to me.
> > >
> > >
> > > Index: test/CodeGen/ARM/vdup.ll
> > > ===================================================================
> > > --- test/CodeGen/ARM/vdup.ll	(revision 164943)
> > > +++ test/CodeGen/ARM/vdup.ll	(working copy)
> > > @@ -295,3 +295,39 @@
> > >    %4 = insertelement <4 x i32> %3, i32 255, i32 3
> > >    ret <4 x i32> %4
> > >  }
> > > +
> > > +define <2 x float> @check_f32(<4 x float> %v) nounwind {
> > > +;CHECK: check_f32
> > >
> > > Should have a : after the check_f32 to ensure that we have found
> the
> > > function symbol. I learned this tip after I made the first patch.
> > Sorry
> > > I missed it the first time around.
> > >
> > > +;CHECK: vdup.32 {{.*}}, d{{..}}[1]
> > > +  %x = extractelement <4 x float> %v, i32 3
> > > +  %1 = insertelement  <2 x float> undef, float %x, i32 0
> > > +  %2 = insertelement  <2 x float> %1, float %x, i32 1
> > > +  ret <2 x float> %2
> > > +}
> > > +
> > > +define <2 x i32> @check_i32(<4 x i32> %v) nounwind {
> > > +;CHECK: check_i32
> > >
> > > Add :
> > >
> > > +;CHECK: vdup.32 {{.*}}, d{{..}}[1]
> > > +  %x = extractelement <4 x i32> %v, i32 3
> > > +  %1 = insertelement  <2 x i32> undef, i32 %x, i32 0
> > > +  %2 = insertelement  <2 x i32> %1, i32 %x, i32 1
> > > +  ret <2 x i32> %2
> > > +}
> > > +
> > > +define <4 x i16> @check_i16(<8 x i16> %v) nounwind {
> > > +;CHECK: check_i16
> > >
> > > Add :
> > >
> > > +;CHECK: vdup.16 {{.*}}, d{{..}}[3]
> > > +  %x = extractelement <8 x i16> %v, i32 3
> > > +  %1 = insertelement  <4 x i16> undef, i16 %x, i32 0
> > > +  %2 = insertelement  <4 x i16> %1, i16 %x, i32 1
> > > +  ret <4 x i16> %2
> > > +}
> > > +
> > > +define <8 x i8> @check_i8(<16 x i8> %v) nounwind {
> > > +;CHECK: check_i8
> > >
> > > Add :
> > >
> > > +;CHECK: vdup.8 {{.*}}, d{{..}}[3]
> > > +  %x = extractelement <16 x i8> %v, i32 3
> > > +  %1 = insertelement  <8  x i8> undef, i8 %x, i32 0
> > > +  %2 = insertelement  <8  x i8> %1, i8 %x, i32 1
> > > +  ret <8 x i8> %2
> > > +}
> > > Index: lib/Target/ARM/ARMISelLowering.cpp
> > > ===================================================================
> > > --- lib/Target/ARM/ARMISelLowering.cpp	(revision 164943)
> > > +++ lib/Target/ARM/ARMISelLowering.cpp	(working copy)
> > > @@ -4215,9 +4319,26 @@
> > >        // If we are VDUPing a value that comes directly from a
> > vector,
> > > that will
> > >        // cause an unnecessary move to and from a GPR, where
> instead
> > we
> > > could
> > >        // just use VDUPLANE.
> > > -      if (Value->getOpcode() == ISD::EXTRACT_VECTOR_ELT)
> > > -        N = DAG.getNode(ARMISD::VDUPLANE, dl, VT,
> > > +      if (Value->getOpcode() == ISD::EXTRACT_VECTOR_ELT) {
> > > +        // We need to create a new undef vector to use for the
> > > VDUPLANE if the
> > > +        // size of the vector from which we get the value is
> > different
> > > than the
> > > +        // size of the vector that we need to create. We will
> insert
> > > the element
> > > +        // such that we won't produce unnecessary copies.
> > >
> > > Extra space at the end of the sentence.
> > >
> > > +        if (VT != Value->getOperand(0).getValueType()) {
> > > +          ConstantSDNode *constIndex;
> > > +          constIndex = dyn_cast<ConstantSDNode>(Value-
> > >getOperand(1));
> > > +          assert(constIndex && "The index is not a constant!");
> > > +          unsigned index = constIndex-
> > > >getAPIntValue().getLimitedValue() %
> > > +                             VT.getVectorNumElements();
> > > +          N =  DAG.getNode(ARMISD::VDUPLANE, dl, VT,
> > > +                 DAG.getNode(ISD::INSERT_VECTOR_ELT, dl, VT,
> > > DAG.getUNDEF(VT),
> > >
> > > Extra space at the end of the line above.
> > >
> > > +                        Value, DAG.getConstant(index, MVT::i32)),
> > > +                           DAG.getConstant(index, MVT::i32));
> > > +        } else {
> > > +          N = DAG.getNode(ARMISD::VDUPLANE, dl, VT,
> > >                          Value->getOperand(0), Value-
> >getOperand(1));
> > > +        }
> > > +      }
> > >        else
> > >          N = DAG.getNode(ARMISD::VDUP, dl, VT, Value);
> > >
> > >
> > > -- Qualcomm Innovation Center, Inc. is a member of Code Aurora
> Forum,
> > > hosted by The Linux Foundation
> > >
> > >
> > >