[llvm] r242296 - [PPC64LE] Fix vec_sld semantics for little endian

Wed Jul 15 09:35:16 PDT 2015

On Wed, 2015-07-15 at 11:27 -0500, Hal Finkel wrote:
> ----- Original Message -----
> > From: "Bill Schmidt" <wschmidt at linux.vnet.ibm.com>
> > To: "Hal Finkel" <hfinkel at anl.gov>
> > Cc: llvm-commits at cs.uiuc.edu
> > Sent: Wednesday, July 15, 2015 11:23:22 AM
> > Subject: Re: [llvm] r242296 - [PPC64LE] Fix vec_sld semantics for little endian
> > 
> > On Wed, 2015-07-15 at 11:11 -0500, Hal Finkel wrote:
> > > ----- Original Message -----
> > > > From: "Bill Schmidt" <wschmidt at linux.vnet.ibm.com>
> > > > To: "Hal Finkel" <hfinkel at anl.gov>
> > > > Cc: llvm-commits at cs.uiuc.edu
> > > > Sent: Wednesday, July 15, 2015 11:04:39 AM
> > > > Subject: Re: [llvm] r242296 - [PPC64LE] Fix vec_sld semantics for
> > > > little endian
> > > > 
> > > > On Wed, 2015-07-15 at 10:58 -0500, Hal Finkel wrote:
> > > > > Test case?
> > > > > 
> > > > >  -Hal
> > > > 
> > > > I forgot to mention this.  We have existing tests that just check
> > > > for
> > > > a
> > > > vperm that gets generated for vec_sld.  That is still the case.
> > > >  I am
> > > > having trouble figuring out how to create a test case that gets
> > > > more
> > > > information, primarily because I can't get clang -cc1 to stop at
> > > > that
> > > > point rather than moving on and converting the vperm into a
> > > > shufflevector.  I'd be interested in any help with this.
> > > 
> > > I don't understand. If clang produces shufflevector instructions,
> > > then you can check, using an llc test, that those shufflevectors
> > > lower to the correct assembly. That would cover the backend part
> > > of the logic.
> > 
> > The problem I'm having is with the test infrastructure.
> > 
> > The existing builtins-ppc-altivec.c test uses %clang_cc1 and shows
> > the
> > generation of vperm intrinsics for vec_sld:
> > 
> > // RUN: %clang_cc1 -faltivec -triple powerpc64le-unknown-unknown
> > -emit-llvm %s \
> > -o - | FileCheck %s -check-prefix=CHECK-LE
> > 
> >   /* vec_sld */
> >   res_vsc = vec_sld(vsc, vsc, 0);
> > // CHECK: @llvm.ppc.altivec.vperm
> > // CHECK-LE: @llvm.ppc.altivec.vperm
> > 
> > These tests pass, but are not sufficient to discriminate between the
> > vec_perm being generated for big-endian and the vec_perm being
> > generated
> > for little-endian.  So I would like to make the test better.
> > 
> > However, if I try to build with clang, or with clang -cc1, and use -S
> > -emit-llvm, I don't get this output.  Instead, the compilation
> > proceeds
> > far enough to convert the vperm intrinsic into a shufflevector:
> > 
> > ; Function Attrs: nounwind readnone
> > define <16 x i8> @foo_sc(<16 x i8> %x, <16 x i8> %y) #0 {
> > entry:
> >   %0 = shufflevector <16 x i8> %y, <16 x i8> %x, <16 x i32> <i32 15,
> >   i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23,
> >   i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30>
> >   ret <16 x i8> %0
> > }
> > 
> > I can't figure out what the magic foo is that is hidden by %clang_cc
> > and
> > is different from clang -cc, which will allow me to see the actual
> > generated IR that still contains the vperm intrinsics, so I can write
> > a
> > better test case.  That's why I am looking for some help here.  Do
> > you
> > know what the %clang_cc macro is doing?
> > 
> > I had meant to ask you this offline before committing the patch, but
> > I'm
> > juggling too many things and forgot.  Sorry!
> 
> No problem. I assume the conversion is happening by the logic in lib/Transforms/InstCombine/InstCombineCalls.cpp -- if you want to short-circuit this, then either run with -O0, or you can pass -mllvm -disable-llvm-optzns to get the raw Clang output from -S -emit-llvm
> 
> Does that help?

Ah, yes.  -O0 was indeed sufficient to get the version with the
intrinsics back.  Thanks!  That was driving me crazy.  I should have
thought of that, but alas.

Bill

> 
>  -Hal
> 
> > 
> > Thanks for any assistance!
> > 
> > Bill
> > 
> > 
> > 
> > > 
> > >  -Hal
> > > 
> > > > 
> > > > Thanks,
> > > > Bill
> > > > 
> > > > > 
> > > > > ----- Original Message -----
> > > > > > From: "Bill Schmidt" <wschmidt at linux.vnet.ibm.com>
> > > > > > To: llvm-commits at cs.uiuc.edu
> > > > > > Sent: Wednesday, July 15, 2015 10:45:31 AM
> > > > > > Subject: [llvm] r242296 - [PPC64LE] Fix vec_sld semantics for
> > > > > > little endian
> > > > > > 
> > > > > > Author: wschmidt
> > > > > > Date: Wed Jul 15 10:45:30 2015
> > > > > > New Revision: 242296
> > > > > > 
> > > > > > URL: http://llvm.org/viewvc/llvm-project?rev=242296&view=rev
> > > > > > Log:
> > > > > > [PPC64LE] Fix vec_sld semantics for little endian
> > > > > > 
> > > > > > The vec_sld interface provides access to the vsldoi
> > > > > > instruction.
> > > > > > Unlike most of the vec_* interfaces, we do not attempt to
> > > > > > change
> > > > > > the
> > > > > > generated code for vec_sld based on the endian mode.  It is
> > > > > > too
> > > > > > difficult to correctly infer the desired semantics because of
> > > > > > different element types, and the corrected instruction
> > > > > > sequence
> > > > > > is
> > > > > > expensive, involving loading a permute control vector and
> > > > > > performing
> > > > > > a
> > > > > > generalized permute.
> > > > > > 
> > > > > > For GCC, this was implemented as "Don't touch the vec_sld"
> > > > > > implementation.  When it came time for the LLVM
> > > > > > implementation, I
> > > > > > did
> > > > > > the same thing.  However, this was hasty and incorrect.  In
> > > > > > LLVM's
> > > > > > version of altivec.h, vec_sld was previously defined in terms
> > > > > > of
> > > > > > the
> > > > > > vec_perm interface.  Because vec_perm semantics are adjusted
> > > > > > for
> > > > > > little endian, this means that leaving vec_sld untouched
> > > > > > causes
> > > > > > it to
> > > > > > generate something different for LE than for BE.  Not good.
> > > > > > 
> > > > > > This back-end patch accompanies the changes to altivec.h that
> > > > > > change
> > > > > > vec_sld's behavior for little endian.  Those changes mean
> > > > > > that we
> > > > > > see
> > > > > > slightly different code in the back end when trying to
> > > > > > recognize
> > > > > > a
> > > > > > VSLDOI instruction in isVSLDOIShuffleMask.  In particular, a
> > > > > > ShuffleKind of 1 (where the two inputs are identical) must
> > > > > > now be
> > > > > > treated the same way as a ShuffleKind of 2 (little endian
> > > > > > with
> > > > > > different inputs) when little endian mode is in force.  This
> > > > > > is
> > > > > > because ShuffleKind of 1 is defined using big-endian
> > > > > > numbering.
> > > > > > 
> > > > > > This has a ripple effect on LowerBUILD_VECTOR, where we
> > > > > > create
> > > > > > our
> > > > > > own
> > > > > > internal VSLDOI instructions.  Because these are a
> > > > > > ShuffleKind of
> > > > > > 1,
> > > > > > they will now have their shift amounts subtracted from 16
> > > > > > when
> > > > > > recognizing the shuffle mask.  To avoid problems we have to
> > > > > > subtract
> > > > > > them from 16 again before creating the VSLDOI instructions.
> > > > > > 
> > > > > > There are a couple of other uses of BuildVSLDOI, but these do
> > > > > > not
> > > > > > need
> > > > > > to be modified because the shift amount is 8, which is
> > > > > > unchanged
> > > > > > when
> > > > > > subtracted from 16.
> > > > > > 
> > > > > > Modified:
> > > > > >     llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> > > > > > 
> > > > > > Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> > > > > > URL:
> > > > > > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=242296&r1=242295&r2=242296&view=diff
> > > > > > ==============================================================================
> > > > > > --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> > > > > > (original)
> > > > > > +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Wed Jul
> > > > > > 15
> > > > > > 10:45:30 2015
> > > > > > @@ -1416,7 +1416,7 @@ int PPC::isVSLDOIShuffleMask(SDNode *N,
> > > > > >    } else
> > > > > >      return -1;
> > > > > >  
> > > > > > -  if (ShuffleKind == 2 && isLE)
> > > > > > +  if (isLE)
> > > > > >      ShiftAmt = 16 - ShiftAmt;
> > > > > >  
> > > > > >    return ShiftAmt;
> > > > > > @@ -7011,17 +7011,20 @@ SDValue
> > > > > > PPCTargetLowering::LowerBUILD_VE
> > > > > >      // t = vsplti c, result = vsldoi t, t, 1
> > > > > >      if (SextVal == (int)(((unsigned)i << 8) | (i < 0 ? 0xFF
> > > > > >      :
> > > > > >      0))) {
> > > > > >        SDValue T = BuildSplatI(i, SplatSize, MVT::v16i8, DAG,
> > > > > >        dl);
> > > > > > -      return BuildVSLDOI(T, T, 1, Op.getValueType(), DAG,
> > > > > > dl);
> > > > > > +      unsigned Amt = Subtarget.isLittleEndian() ? 15 : 1;
> > > > > > +      return BuildVSLDOI(T, T, Amt, Op.getValueType(), DAG,
> > > > > > dl);
> > > > > >      }
> > > > > >      // t = vsplti c, result = vsldoi t, t, 2
> > > > > >      if (SextVal == (int)(((unsigned)i << 16) | (i < 0 ?
> > > > > >      0xFFFF :
> > > > > >      0))) {
> > > > > >        SDValue T = BuildSplatI(i, SplatSize, MVT::v16i8, DAG,
> > > > > >        dl);
> > > > > > -      return BuildVSLDOI(T, T, 2, Op.getValueType(), DAG,
> > > > > > dl);
> > > > > > +      unsigned Amt = Subtarget.isLittleEndian() ? 14 : 2;
> > > > > > +      return BuildVSLDOI(T, T, Amt, Op.getValueType(), DAG,
> > > > > > dl);
> > > > > >      }
> > > > > >      // t = vsplti c, result = vsldoi t, t, 3
> > > > > >      if (SextVal == (int)(((unsigned)i << 24) | (i < 0 ?
> > > > > >      0xFFFFFF
> > > > > >      :
> > > > > >      0))) {
> > > > > >        SDValue T = BuildSplatI(i, SplatSize, MVT::v16i8, DAG,
> > > > > >        dl);
> > > > > > -      return BuildVSLDOI(T, T, 3, Op.getValueType(), DAG,
> > > > > > dl);
> > > > > > +      unsigned Amt = Subtarget.isLittleEndian() ? 13 : 3;
> > > > > > +      return BuildVSLDOI(T, T, Amt, Op.getValueType(), DAG,
> > > > > > dl);
> > > > > >      }
> > > > > >    }
> > > > > >  
> > > > > > 
> > > > > > 
> > > > > > _______________________________________________
> > > > > > llvm-commits mailing list
> > > > > > llvm-commits at cs.uiuc.edu
> > > > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > > > > > 
> > > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > 
> > 
> > 
>