[llvm] r190636 - Fix PPC ABI for ByVal structs with vector members

Hal Finkel hfinkel at anl.gov
Fri Feb 27 13:31:11 PST 2015


----- Original Message -----
> From: "David Blaikie" <dblaikie at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: llvm-commits at cs.uiuc.edu
> Sent: Friday, February 27, 2015 3:29:20 PM
> Subject: Re: [llvm] r190636 - Fix PPC ABI for ByVal structs with vector members
> 
> 
> On Thu, Feb 19, 2015 at 4:24 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> ----- Original Message -----
> > From: "David Blaikie" < dblaikie at gmail.com >
> > To: "Hal Finkel" < hfinkel at anl.gov >
> > Cc: llvm-commits at cs.uiuc.edu
> > Sent: Thursday, February 19, 2015 3:57:13 PM
> > Subject: Re: [llvm] r190636 - Fix PPC ABI for ByVal structs with
> > vector members
> > 
> > 
> > On Thu, Sep 12, 2013 at 4:20 PM, Hal Finkel < hfinkel at anl.gov >
> > wrote:
> > 
> > 
> > Author: hfinkel
> > Date: Thu Sep 12 18:20:06 2013
> > New Revision: 190636
> > 
> > URL: http://llvm.org/viewvc/llvm-project?rev=190636&view=rev
> > Log:
> > Fix PPC ABI for ByVal structs with vector members
> > 
> > When a structure is passed by value, and that structure contains a
> > vector
> > member, according to the PPC ABI, the structure will receive
> > enhanced
> > alignment
> > (so that the vector within the structure will always be aligned).
> > 
> > This should resolve PR16641.
> > 
> > 
> > In an effort to simplify byval in a future world of opaque
> > (untyped)
> > pointers, I was considering changing "%struct.foo byval" to "ptr
> > byval(num_bytes)" - which I think would work for every target
> > except
> > PPC (PPC, and the code in this commit, is the only case I could
> > find
> > of overriding getByValTypeAlignment that wasn't just the default of
> > asking DataLayout for the type's alignment).
> > 
> > Is this code necessary? If I remove the entire PPCTargetLowering
> > override of getByValTypeAlignment this test case (& everything else
> > in check-llvm) appears to still pass.
> > 
> 
> Interesting. I'll need to look at this in more detail. I'm fairly
> certain that it was needed when it was added. It may also still be
> necessary for QPX support (which is incomplete in tree at the
> moment, but that should change in the next few days). In short,
> please give me a few days, and then I should have a better idea.
> Theoretically, however, I'm pretty sure it is necessary.
> 
> 
> Ping - any further thoughts on this? Does Clang already put the
> necessary alignment on byval attributes for PPC? Do we need to fix
> Clang to do so? (once we have this in the frontend, if possible,
> then I'll set about changing byval to just take a number of bytes to
> copy & ignore the type entirely - except when upgrading old bitcode)
> 
> 

I think that Clang is okay; I'll confirm in a couple of hours.

 -Hal

> 
> -Hal
> 
> 
> 
> > 
> > 
> > Added:
> > llvm/trunk/test/CodeGen/PowerPC/vec-abi-align.ll
> > Modified:
> > llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> > 
> > Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=190636&r1=190635&r2=190636&view=diff
> > ==============================================================================
> > --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original)
> > +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Thu Sep 12
> > 18:20:06 2013
> > @@ -578,24 +578,48 @@ PPCTargetLowering::PPCTargetLowering(PPC
> > }
> > }
> > 
> > +/// getMaxByValAlign - Helper for getByValTypeAlignment to
> > determine
> > +/// the desired ByVal argument alignment.
> > +static void getMaxByValAlign(Type *Ty, unsigned &MaxAlign,
> > + unsigned MaxMaxAlign) {
> > + if (MaxAlign == MaxMaxAlign)
> > + return;
> > + if (VectorType *VTy = dyn_cast<VectorType>(Ty)) {
> > + if (MaxMaxAlign >= 32 && VTy->getBitWidth() >= 256)
> > + MaxAlign = 32;
> > + else if (VTy->getBitWidth() >= 128 && MaxAlign < 16)
> > + MaxAlign = 16;
> > + } else if (ArrayType *ATy = dyn_cast<ArrayType>(Ty)) {
> > + unsigned EltAlign = 0;
> > + getMaxByValAlign(ATy->getElementType(), EltAlign, MaxMaxAlign);
> > + if (EltAlign > MaxAlign)
> > + MaxAlign = EltAlign;
> > + } else if (StructType *STy = dyn_cast<StructType>(Ty)) {
> > + for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
> > + unsigned EltAlign = 0;
> > + getMaxByValAlign(STy->getElementType(i), EltAlign, MaxMaxAlign);
> > + if (EltAlign > MaxAlign)
> > + MaxAlign = EltAlign;
> > + if (MaxAlign == MaxMaxAlign)
> > + break;
> > + }
> > + }
> > +}
> > +
> > /// getByValTypeAlignment - Return the desired alignment for ByVal
> > aggregate
> > /// function arguments in the caller parameter area.
> > unsigned PPCTargetLowering::getByValTypeAlignment(Type *Ty) const {
> > const TargetMachine &TM = getTargetMachine();
> > // Darwin passes everything on 4 byte boundary.
> > - if (TM.getSubtarget<PPCSubtarget>().isDarwin())
> > + if (PPCSubTarget.isDarwin())
> > return 4;
> > 
> > // 16byte and wider vectors are passed on 16byte boundary.
> > - if (VectorType *VTy = dyn_cast<VectorType>(Ty))
> > - if (VTy->getBitWidth() >= 128)
> > - return 16;
> > -
> > // The rest is 8 on PPC64 and 4 on PPC32 boundary.
> > - if (PPCSubTarget.isPPC64())
> > - return 8;
> > -
> > - return 4;
> > + unsigned Align = PPCSubTarget.isPPC64() ? 8 : 4;
> > + if (PPCSubTarget.hasAltivec() || PPCSubTarget.hasQPX())
> > + getMaxByValAlign(Ty, Align, PPCSubTarget.hasQPX() ? 32 : 16);
> > + return Align;
> > }
> > 
> > const char *PPCTargetLowering::getTargetNodeName(unsigned Opcode)
> > const {
> > @@ -2281,6 +2305,13 @@ PPCTargetLowering::LowerFormalArguments_
> > InVals.push_back(FIN);
> > continue;
> > }
> > +
> > + unsigned BVAlign = Flags.getByValAlign();
> > + if (BVAlign > 8) {
> > + ArgOffset = ((ArgOffset+BVAlign-1)/BVAlign)*BVAlign;
> > + CurArgOffset = ArgOffset;
> > + }
> > +
> > // All aggregates smaller than 8 bytes must be passed
> > right-justified.
> > if (ObjSize < PtrByteSize)
> > CurArgOffset = CurArgOffset + (PtrByteSize - ObjSize);
> > @@ -3870,6 +3901,15 @@ PPCTargetLowering::LowerCall_64SVR4(SDVa
> > if (Size == 0)
> > continue;
> > 
> > + unsigned BVAlign = Flags.getByValAlign();
> > + if (BVAlign > 8) {
> > + if (BVAlign % PtrByteSize != 0)
> > + llvm_unreachable(
> > + "ByVal alignment is not a multiple of the pointer size");
> > +
> > + ArgOffset = ((ArgOffset+BVAlign-1)/BVAlign)*BVAlign;
> > + }
> > +
> > // All aggregates smaller than 8 bytes must be passed
> > right-justified.
> > if (Size==1 || Size==2 || Size==4) {
> > EVT VT = (Size==1) ? MVT::i8 : ((Size==2) ? MVT::i16 : MVT::i32);
> > 
> > Added: llvm/trunk/test/CodeGen/PowerPC/vec-abi-align.ll
> > URL:
> > http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/vec-abi-align.ll?rev=190636&view=auto
> > ==============================================================================
> > --- llvm/trunk/test/CodeGen/PowerPC/vec-abi-align.ll (added)
> > +++ llvm/trunk/test/CodeGen/PowerPC/vec-abi-align.ll Thu Sep 12
> > 18:20:06 2013
> > @@ -0,0 +1,64 @@
> > +; RUN: llc -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr7 < %s |
> > FileCheck %s
> > +target datalayout =
> > "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
> > +target triple = "powerpc64-unknown-linux-gnu"
> > +
> > +%struct.s2 = type { i64, <4 x float> }
> > +
> > + at ve = external global <4 x float>
> > + at n = external global i64
> > +
> > +; Function Attrs: nounwind
> > +define void @test1(i64 %d1, i64 %d2, i64 %d3, i64 %d4, i64 %d5,
> > i64
> > %d6, i64 %d7, i64 %d8, i64 %d9, <4 x float> inreg %vs.coerce) #0 {
> > +entry:
> > + store <4 x float> %vs.coerce, <4 x float>* @ve, align 16, !tbaa
> > !0
> > + ret void
> > +
> > +; CHECK-LABEL: @test1
> > +; CHECK: stvx 2,
> > +; CHECK: blr
> > +}
> > +
> > +; Function Attrs: nounwind
> > +define void @test2(i64 %d1, i64 %d2, i64 %d3, i64 %d4, i64 %d5,
> > i64
> > %d6, i64 %d7, i64 %d8, %struct.s2* byval nocapture readonly %vs) #0
> > {
> > +entry:
> > + %m = getelementptr inbounds %struct.s2* %vs, i64 0, i32 0
> > + %0 = load i64* %m, align 8, !tbaa !2
> > + store i64 %0, i64* @n, align 8, !tbaa !2
> > + %v = getelementptr inbounds %struct.s2* %vs, i64 0, i32 1
> > + %1 = load <4 x float>* %v, align 16, !tbaa !0
> > + store <4 x float> %1, <4 x float>* @ve, align 16, !tbaa !0
> > + ret void
> > +
> > +; CHECK-LABEL: @test2
> > +; CHECK: ld {{[0-9]+}}, 112(1)
> > +; CHECK: li [[REG16:[0-9]+]], 16
> > +; CHECK: addi [[REGB:[0-9]+]], 1, 112
> > +; CHECK: lvx 2, [[REGB]], [[REG16]]
> > +; CHECK: blr
> > +}
> > +
> > +; Function Attrs: nounwind
> > +define void @test3(i64 %d1, i64 %d2, i64 %d3, i64 %d4, i64 %d5,
> > i64
> > %d6, i64 %d7, i64 %d8, i64 %d9, %struct.s2* byval nocapture
> > readonly
> > %vs) #0 {
> > +entry:
> > + %m = getelementptr inbounds %struct.s2* %vs, i64 0, i32 0
> > + %0 = load i64* %m, align 8, !tbaa !2
> > + store i64 %0, i64* @n, align 8, !tbaa !2
> > + %v = getelementptr inbounds %struct.s2* %vs, i64 0, i32 1
> > + %1 = load <4 x float>* %v, align 16, !tbaa !0
> > + store <4 x float> %1, <4 x float>* @ve, align 16, !tbaa !0
> > + ret void
> > +
> > +; CHECK-LABEL: @test3
> > +; CHECK: ld {{[0-9]+}}, 128(1)
> > +; CHECK: li [[REG16:[0-9]+]], 16
> > +; CHECK: addi [[REGB:[0-9]+]], 1, 128
> > +; CHECK: lvx 2, [[REGB]], [[REG16]]
> > +; CHECK: blr
> > +}
> > +
> > +attributes #0 = { nounwind }
> > +
> > +!0 = metadata !{metadata !"omnipotent char", metadata !1}
> > +!1 = metadata !{metadata !"Simple C/C++ TBAA"}
> > +!2 = metadata !{metadata !"long", metadata !0}
> > +
> > 
> > 
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > 
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



More information about the llvm-commits mailing list