[llvm-commits] [PATCH, RFC] Fix PowerPC calling convention for saving nonvolatile condition register fields
Hal Finkel
hfinkel at anl.gov
Tue Sep 11 14:46:46 PDT 2012
On Tue, 11 Sep 2012 23:34:51 +0200
Roman Divacky <rdivacky at freebsd.org> wrote:
> LGTM.
>
> This fixes PR13708 and partially PR13623. It also lets us enable
> exceptions on PPC64 (that are currently disabled because we mishandle
> CR2).
>
> It survives selfhost and passes all tests. On Linux even all EH tests
> from libcxxrt.
>
> Hal, if you dont have anything against it I would like to commit this.
> Along with the exception enable.
Due diligence having been performed, please commit! :)
-Hal
>
> Thank you, Roman
>
> On Tue, Sep 11, 2012 at 02:55:01PM -0500, William J. Schmidt wrote:
> > I received some offline comments from Roman. Here's a new version
> > that addresses those, as well as fixing a bug encountered with the
> > revised test cases. Summary of the changes:
> >
> > * The four test cases are merged into one file now.
> >
> > * CRSpillFrameIdx was changed from a static variable to a mutable
> > class variable so that reinitialization for each function can
> > properly occur.
> >
> > * Removed bogus early return from spillCalleeSavedRegisters.
> >
> > * Minor stylistic clean-ups and commentary.
> >
> > Thanks,
> > Bill
> >
> > On Tue, 2012-09-11 at 09:46 -0500, William J. Schmidt wrote:
> > > This patch corrects logic in PPCFrameLowering for save and
> > > restore of nonvolatile condition register fields across calls
> > > under the SVR4 ABIs.
> > >
> > > * With the 64-bit ABI, the save location is at a fixed offset of
> > > 8 from the stack pointer. The frame pointer cannot be used to
> > > access this portion of the stack frame since the distance from
> > > the frame pointer may change with alloca calls.
> > >
> > > * With the 32-bit ABI, the save location is just below the
> > > general register save area, and is accessed via the frame pointer
> > > like the rest of the save areas. This is an optional slot, so it
> > > must only be created if any of CR2, CR3, and CR4 were modified.
> > >
> > > * For both ABIs, save/restore logic is generated only if one of
> > > the nonvolatile CR fields were modified.
> > >
> > > I added tests for both ABIs that demonstrate the code now works
> > > correctly. The extra call to "foo" in the tests is there to force
> > > creation of a stack frame.
> > >
> > > I also took this opportunity to clean up an extra FIXME in
> > > PPCFrameLowering.h. Save area offsets for 32-bit GPRs are
> > > meaningless for the 64-bit ABI, so I removed them for correctness
> > > and efficiency.
> > >
> > > I haven't yet requested commit authority, so I would appreciate
> > > it if someone could please commit these changes following the
> > > review period.
> > >
> > > Thanks!
> > > Bill
> > >
>
> > Index: test/CodeGen/PowerPC/crsave.ll
> > ===================================================================
> > --- test/CodeGen/PowerPC/crsave.ll (revision 0)
> > +++ test/CodeGen/PowerPC/crsave.ll (revision 0)
> > @@ -0,0 +1,49 @@
> > +; RUN: llc -O0 -disable-fp-elim -mtriple=powerpc-unknown-linux-gnu
> > < %s | FileCheck %s -check-prefix=PPC32 +; RUN: llc -O0
> > -disable-fp-elim -mtriple=powerpc64-unknown-linux-gnu < %s |
> > FileCheck %s -check-prefix=PPC64 + +declare void @foo()
> > +
> > +define i32 @test_cr2() nounwind {
> > +entry:
> > + %ret = alloca i32, align 4
> > + %0 = call i32 asm sideeffect "\0A\09mtcr $4\0A\09cmp
> > 2,$2,$1\0A\09mfcr $0", "=r,r,r,r,r,~{cr2}"(i32 1, i32 2, i32 3, i32
> > 0) nounwind
> > + store i32 %0, i32* %ret, align 4
> > + call void @foo()
> > + %1 = load i32* %ret, align 4
> > + ret i32 %1
> > +}
> > +
> > +; PPC32: mfcr 12
> > +; PPC32-NEXT: stw 12, {{[0-9]+}}(31)
> > +; PPC32: lwz 12, {{[0-9]+}}(31)
> > +; PPC32-NEXT: mtcrf 32, 12
> > +
> > +; PPC64: mfcr 12
> > +; PPC64-NEXT: stw 12, 8(1)
> > +; PPC64: lwz 12, 8(1)
> > +; PPC64-NEXT: mtcrf 32, 12
> > +
> > +define i32 @test_cr234() nounwind {
> > +entry:
> > + %ret = alloca i32, align 4
> > + %0 = call i32 asm sideeffect "\0A\09mtcr $4\0A\09cmp
> > 2,$2,$1\0A\09cmp 3,$2,$2\0A\09cmp 4,$2,$3\0A\09mfcr $0",
> > "=r,r,r,r,r,~{cr2},~{cr3},~{cr4}"(i32 1, i32 2, i32 3, i32 0)
> > nounwind
> > + store i32 %0, i32* %ret, align 4
> > + call void @foo()
> > + %1 = load i32* %ret, align 4
> > + ret i32 %1
> > +}
> > +
> > +; PPC32: mfcr 12
> > +; PPC32-NEXT: stw 12, {{[0-9]+}}(31)
> > +; PPC32: lwz 12, {{[0-9]+}}(31)
> > +; PPC32-NEXT: mtcrf 32, 12
> > +; PPC32-NEXT: mtcrf 16, 12
> > +; PPC32-NEXT: mtcrf 8, 12
> > +
> > +; PPC64: mfcr 12
> > +; PPC64-NEXT: stw 12, 8(1)
> > +; PPC64: lwz 12, 8(1)
> > +; PPC64-NEXT: mtcrf 32, 12
> > +; PPC64-NEXT: mtcrf 16, 12
> > +; PPC64-NEXT: mtcrf 8, 12
> > +
> > Index: lib/Target/PowerPC/PPCRegisterInfo.h
> > ===================================================================
> > --- lib/Target/PowerPC/PPCRegisterInfo.h (revision 163540)
> > +++ lib/Target/PowerPC/PPCRegisterInfo.h (working copy)
> > @@ -30,6 +30,7 @@ class PPCRegisterInfo : public PPCGenRegisterInfo
> > std::map<unsigned, unsigned> ImmToIdxMap;
> > const PPCSubtarget &Subtarget;
> > const TargetInstrInfo &TII;
> > + mutable int CRSpillFrameIdx;
> > public:
> > PPCRegisterInfo(const PPCSubtarget &SubTarget, const
> > TargetInstrInfo &tii);
> > @@ -65,6 +66,8 @@ class PPCRegisterInfo : public PPCGenRegisterInfo
> > int SPAdj, RegScavenger *RS) const;
> > void lowerCRRestore(MachineBasicBlock::iterator II, unsigned
> > FrameIndex, int SPAdj, RegScavenger *RS) const;
> > + bool hasReservedSpillSlot(const MachineFunction &MF, unsigned
> > Reg,
> > + int &FrameIdx) const;
> > void eliminateFrameIndex(MachineBasicBlock::iterator II,
> > int SPAdj, RegScavenger *RS = NULL)
> > const;
> > Index: lib/Target/PowerPC/PPCFrameLowering.h
> > ===================================================================
> > --- lib/Target/PowerPC/PPCFrameLowering.h (revision 163540)
> > +++ lib/Target/PowerPC/PPCFrameLowering.h (working copy)
> > @@ -45,6 +45,16 @@ class PPCFrameLowering: public
> > TargetFrameLowering RegScavenger *RS = NULL) const;
> > void processFunctionBeforeFrameFinalized(MachineFunction &MF)
> > const;
> > + bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
> > + MachineBasicBlock::iterator MI,
> > + const
> > std::vector<CalleeSavedInfo> &CSI,
> > + const TargetRegisterInfo *TRI)
> > const; +
> > + bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,
> > + MachineBasicBlock::iterator MI,
> > + const
> > std::vector<CalleeSavedInfo> &CSI,
> > + const TargetRegisterInfo *TRI)
> > const; +
> > /// targetHandlesStackFrameRounding - Returns true if the target
> > is /// responsible for rounding up the stack frame (probably at
> > emitPrologue /// time).
> > @@ -170,23 +180,11 @@ class PPCFrameLowering: public
> > TargetFrameLowering {PPC::R15, -68},
> > {PPC::R14, -72},
> >
> > - // CR save area offset.
> > - // FIXME SVR4: Disable CR save area for now.
> > -// {PPC::CR2, -4},
> > -// {PPC::CR3, -4},
> > -// {PPC::CR4, -4},
> > -// {PPC::CR2LT, -4},
> > -// {PPC::CR2GT, -4},
> > -// {PPC::CR2EQ, -4},
> > -// {PPC::CR2UN, -4},
> > -// {PPC::CR3LT, -4},
> > -// {PPC::CR3GT, -4},
> > -// {PPC::CR3EQ, -4},
> > -// {PPC::CR3UN, -4},
> > -// {PPC::CR4LT, -4},
> > -// {PPC::CR4GT, -4},
> > -// {PPC::CR4EQ, -4},
> > -// {PPC::CR4UN, -4},
> > + // CR save area offset. We map each of the nonvolatile CR
> > fields
> > + // to the slot for CR2, which is the first of the
> > nonvolatile CR
> > + // fields to be assigned, so that we only allocate one save
> > slot.
> > + // See PPCRegisterInfo::hasReservedSpillSlot() for more
> > information.
> > + {PPC::CR2, -4},
> >
> > // VRSAVE save area offset.
> > {PPC::VRSAVE, -4},
> > @@ -228,27 +226,6 @@ class PPCFrameLowering: public
> > TargetFrameLowering {PPC::F14, -144},
> >
> > // General register save area offsets.
> > - // FIXME 64-bit SVR4: Are 32-bit registers actually
> > allocated in 64-bit
> > - // mode?
> > - {PPC::R31, -4},
> > - {PPC::R30, -12},
> > - {PPC::R29, -20},
> > - {PPC::R28, -28},
> > - {PPC::R27, -36},
> > - {PPC::R26, -44},
> > - {PPC::R25, -52},
> > - {PPC::R24, -60},
> > - {PPC::R23, -68},
> > - {PPC::R22, -76},
> > - {PPC::R21, -84},
> > - {PPC::R20, -92},
> > - {PPC::R19, -100},
> > - {PPC::R18, -108},
> > - {PPC::R17, -116},
> > - {PPC::R16, -124},
> > - {PPC::R15, -132},
> > - {PPC::R14, -140},
> > -
> > {PPC::X31, -8},
> > {PPC::X30, -16},
> > {PPC::X29, -24},
> > @@ -268,24 +245,6 @@ class PPCFrameLowering: public
> > TargetFrameLowering {PPC::X15, -136},
> > {PPC::X14, -144},
> >
> > - // CR save area offset.
> > - // FIXME SVR4: Disable CR save area for now.
> > -// {PPC::CR2, -4},
> > -// {PPC::CR3, -4},
> > -// {PPC::CR4, -4},
> > -// {PPC::CR2LT, -4},
> > -// {PPC::CR2GT, -4},
> > -// {PPC::CR2EQ, -4},
> > -// {PPC::CR2UN, -4},
> > -// {PPC::CR3LT, -4},
> > -// {PPC::CR3GT, -4},
> > -// {PPC::CR3EQ, -4},
> > -// {PPC::CR3UN, -4},
> > -// {PPC::CR4LT, -4},
> > -// {PPC::CR4GT, -4},
> > -// {PPC::CR4EQ, -4},
> > -// {PPC::CR4UN, -4},
> > -
> > // VRSAVE save area offset.
> > {PPC::VRSAVE, -4},
> >
> > Index: lib/Target/PowerPC/PPCRegisterInfo.cpp
> > ===================================================================
> > --- lib/Target/PowerPC/PPCRegisterInfo.cpp (revision 163540)
> > +++ lib/Target/PowerPC/PPCRegisterInfo.cpp (working copy)
> > @@ -71,7 +71,7 @@ PPCRegisterInfo::PPCRegisterInfo(const PPCSubtarge
> > : PPCGenRegisterInfo(ST.isPPC64() ? PPC::LR8 : PPC::LR,
> > ST.isPPC64() ? 0 : 1,
> > ST.isPPC64() ? 0 : 1),
> > - Subtarget(ST), TII(tii) {
> > + Subtarget(ST), TII(tii), CRSpillFrameIdx(0) {
> > ImmToIdxMap[PPC::LD] = PPC::LDX; ImmToIdxMap[PPC::STD] =
> > PPC::STDX; ImmToIdxMap[PPC::LBZ] = PPC::LBZX;
> > ImmToIdxMap[PPC::STB] = PPC::STBX; ImmToIdxMap[PPC::LHZ] =
> > PPC::LHZX; ImmToIdxMap[PPC::LHA] = PPC::LHAX; @@ -111,6 +111,11
> > @@ PPCRegisterInfo::getCalleeSavedRegs(const MachineF return
> > Subtarget.isPPC64() ? CSR_Darwin64_SaveList : CSR_Darwin32_SaveList;
> >
> > + // For 32-bit SVR4, also initialize the frame index associated
> > with
> > + // the CR spill slot.
> > + if (!Subtarget.isPPC64())
> > + CRSpillFrameIdx = 0;
> > +
> > return Subtarget.isPPC64() ? CSR_SVR464_SaveList :
> > CSR_SVR432_SaveList; }
> >
> > @@ -477,6 +482,31 @@ void
> > PPCRegisterInfo::lowerCRRestore(MachineBasicB MBB.erase(II);
> > }
> >
> > +bool
> > +PPCRegisterInfo::hasReservedSpillSlot(const MachineFunction &MF,
> > + unsigned Reg, int &FrameIdx)
> > const { +
> > + // For the nonvolatile condition registers (CR2, CR3, CR4) in an
> > SVR4
> > + // ABI, return true to prevent allocating an additional frame
> > slot.
> > + // For 64-bit, the CR save area is at SP+8; the value of
> > FrameIdx = 0
> > + // is arbitrary and will be subsequently ignored. For 32-bit,
> > we must
> > + // create exactly one stack slot and return its FrameIdx for all
> > + // nonvolatiles.
> > + if (Subtarget.isSVR4ABI() && PPC::CR2 <= Reg && Reg <= PPC::CR4)
> > {
> > + if (Subtarget.isPPC64()) {
> > + FrameIdx = 0;
> > + } else if (CRSpillFrameIdx) {
> > + FrameIdx = CRSpillFrameIdx;
> > + } else {
> > + MachineFrameInfo *MFI = ((MachineFunction
> > &)MF).getFrameInfo();
> > + FrameIdx = MFI->CreateFixedObject((uint64_t)4, (int64_t)-4,
> > true);
> > + CRSpillFrameIdx = FrameIdx;
> > + }
> > + return true;
> > + }
> > + return false;
> > +}
> > +
> > void
> > PPCRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator
> > II, int SPAdj, RegScavenger *RS) const {
> > Index: lib/Target/PowerPC/PPCFrameLowering.cpp
> > ===================================================================
> > --- lib/Target/PowerPC/PPCFrameLowering.cpp (revision 163540)
> > +++ lib/Target/PowerPC/PPCFrameLowering.cpp (working copy)
> > @@ -13,6 +13,7 @@
> >
> > #include "PPCFrameLowering.h"
> > #include "PPCInstrInfo.h"
> > +#include "PPCInstrBuilder.h"
> > #include "PPCMachineFunctionInfo.h"
> > #include "llvm/Function.h"
> > #include "llvm/CodeGen/MachineFrameInfo.h"
> > @@ -168,6 +169,11 @@ static void HandleVRSaveUpdate(MachineInstr
> > *MI, c MI->eraseFromParent();
> > }
> >
> > +static bool spillsCR(const MachineFunction &MF) {
> > + const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
> > + return FuncInfo->isCRSpilled();
> > +}
> > +
> > /// determineFrameLayout - Determine the size of the frame and
> > maximum call /// frame size.
> > void PPCFrameLowering::determineFrameLayout(MachineFunction &MF)
> > const { @@ -184,13 +190,21 @@ void
> > PPCFrameLowering::determineFrameLayout(Machin
> > // If we are a leaf function, and use up to 224 bytes of stack
> > space, // don't have a frame pointer, calls, or dynamic alloca then
> > we do not need
> > - // to adjust the stack pointer (we fit in the Red Zone).
> > + // to adjust the stack pointer (we fit in the Red Zone). For
> > 64-bit
> > + // SVR4, we also require a stack frame if we need to spill the
> > CR,
> > + // since this spill area is addressed relative to the stack
> > pointer. bool DisableRedZone =
> > MF.getFunction()->hasFnAttr(Attribute::NoRedZone);
> > - // FIXME SVR4 The 32-bit SVR4 ABI has no red zone.
> > + // FIXME SVR4 The 32-bit SVR4 ABI has no red zone. However, it
> > can
> > + // still generate stackless code if all local vars are
> > reg-allocated.
> > + // Try: (FrameSize <= 224
> > + // || (FrameSize == 0 && Subtarget.isPPC32 &&
> > Subtarget.isSVR4ABI())) if (!DisableRedZone &&
> > FrameSize <= 224 && // Fits in red
> > zone. !MFI->hasVarSizedObjects() && // No dynamic
> > alloca. !MFI->adjustsStack() && // No calls.
> > + !(Subtarget.isPPC64() && // No 64-bit
> > SVR4 CRsave.
> > + Subtarget.isSVR4ABI()
> > + && spillsCR(MF)) &&
> > (!ALIGN_STACK || MaxAlign <= TargetAlign)) { // No special
> > alignment. // No need for frame
> > MFI->setStackSize(0);
> > @@ -488,7 +502,6 @@ void
> > PPCFrameLowering::emitPrologue(MachineFunctio // Add callee saved
> > registers to move list. const std::vector<CalleeSavedInfo> &CSI =
> > MFI->getCalleeSavedInfo(); for (unsigned I = 0, E = CSI.size();
> > I != E; ++I) {
> > - int Offset = MFI->getObjectOffset(CSI[I].getFrameIdx());
> > unsigned Reg = CSI[I].getReg();
> > if (Reg == PPC::LR || Reg == PPC::LR8 || Reg == PPC::RM)
> > continue;
> > @@ -497,6 +510,25 @@ void
> > PPCFrameLowering::emitPrologue(MachineFunctio if
> > (PPC::CRBITRCRegClass.contains(Reg)) continue;
> >
> > + // For SVR4, don't emit a move for the CR spill slot if we
> > haven't
> > + // spilled CRs.
> > + if (Subtarget.isSVR4ABI()
> > + && (PPC::CR2 <= Reg && Reg <= PPC::CR4)
> > + && !spillsCR(MF))
> > + continue;
> > +
> > + // For 64-bit SVR4 when we have spilled CRs, the spill
> > location
> > + // is SP+8, not a frame-relative slot.
> > + if (Subtarget.isSVR4ABI()
> > + && Subtarget.isPPC64()
> > + && (PPC::CR2 <= Reg && Reg <= PPC::CR4)) {
> > + MachineLocation CSDst(PPC::X1, 8);
> > + MachineLocation CSSrc(PPC::CR2);
> > + Moves.push_back(MachineMove(Label, CSDst, CSSrc));
> > + continue;
> > + }
> > +
> > + int Offset = MFI->getObjectOffset(CSI[I].getFrameIdx());
> > MachineLocation CSDst(MachineLocation::VirtualFP, Offset);
> > MachineLocation CSSrc(Reg);
> > Moves.push_back(MachineMove(Label, CSDst, CSSrc));
> > @@ -714,11 +746,6 @@ void
> > PPCFrameLowering::emitEpilogue(MachineFunctio }
> > }
> >
> > -static bool spillsCR(const MachineFunction &MF) {
> > - const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
> > - return FuncInfo->isCRSpilled();
> > -}
> > -
> > /// MustSaveLR - Return true if this function requires that we
> > save the LR /// register onto the stack in the prolog and restore
> > it in the epilog of the /// function.
> > @@ -808,7 +835,6 @@ void
> > PPCFrameLowering::processFunctionBeforeFrameF bool HasGPSaveArea =
> > false; bool HasG8SaveArea = false;
> > bool HasFPSaveArea = false;
> > - bool HasCRSaveArea = false;
> > bool HasVRSAVESaveArea = false;
> > bool HasVRSaveArea = false;
> >
> > @@ -843,10 +869,9 @@ void
> > PPCFrameLowering::processFunctionBeforeFrameF if (Reg < MinFPR) {
> > MinFPR = Reg;
> > }
> > -// FIXME SVR4: Disable CR save area for now.
> > } else if (PPC::CRBITRCRegClass.contains(Reg) ||
> > PPC::CRRCRegClass.contains(Reg)) {
> > -// HasCRSaveArea = true;
> > + ; // do nothing, as we already know whether CRs are spilled
> > } else if (PPC::VRSAVERCRegClass.contains(Reg)) {
> > HasVRSAVESaveArea = true;
> > } else if (PPC::VRRCRegClass.contains(Reg)) {
> > @@ -926,16 +951,21 @@ void
> > PPCFrameLowering::processFunctionBeforeFrameF }
> > }
> >
> > - // The CR save area is below the general register save area.
> > - if (HasCRSaveArea) {
> > - // FIXME SVR4: Is it actually possible to have multiple
> > elements in CSI
> > - // which have the CR/CRBIT register class?
> > + // For 32-bit only, the CR save area is below the general
> > register
> > + // save area. For 64-bit SVR4, the CR save area is addressed
> > relative
> > + // to the stack pointer and hence does not need an adjustment
> > here.
> > + // Only CR2 (the first nonvolatile spilled) has an associated
> > frame
> > + // index so that we have a single uniform save area.
> > + if (spillsCR(MF) && !(Subtarget.isPPC64() &&
> > Subtarget.isSVR4ABI())) { // Adjust the frame index of the CR spill
> > slot. for (unsigned i = 0, e = CSI.size(); i != e; ++i) {
> > unsigned Reg = CSI[i].getReg();
> >
> > - if (PPC::CRBITRCRegClass.contains(Reg) ||
> > - PPC::CRRCRegClass.contains(Reg)) {
> > + if ((Subtarget.isSVR4ABI() && Reg == PPC::CR2)
> > + // Leave Darwin logic as-is.
> > + || (!Subtarget.isSVR4ABI() &&
> > + (PPC::CRBITRCRegClass.contains(Reg) ||
> > + PPC::CRRCRegClass.contains(Reg)))) {
> > int FI = CSI[i].getFrameIdx();
> >
> > FFI->setObjectOffset(FI, LowerBound +
> > FFI->getObjectOffset(FI)); @@ -973,3 +1003,184 @@ void
> > PPCFrameLowering::processFunctionBeforeFrameF }
> > }
> > }
> > +
> > +bool
> > +PPCFrameLowering::spillCalleeSavedRegisters(MachineBasicBlock &MBB,
> > + MachineBasicBlock::iterator
> > MI,
> > + const
> > std::vector<CalleeSavedInfo> &CSI,
> > + const TargetRegisterInfo
> > *TRI) const { +
> > + // Currently, this function only handles SVR4 32- and 64-bit
> > ABIs.
> > + // Return false otherwise to maintain pre-existing behavior.
> > + if (!Subtarget.isSVR4ABI())
> > + return false;
> > +
> > + MachineFunction *MF = MBB.getParent();
> > + const PPCInstrInfo &TII =
> > + *static_cast<const
> > PPCInstrInfo*>(MF->getTarget().getInstrInfo());
> > + DebugLoc DL;
> > + bool CRSpilled = false;
> > +
> > + for (unsigned i = 0, e = CSI.size(); i != e; ++i) {
> > + unsigned Reg = CSI[i].getReg();
> > + // CR2 through CR4 are the nonvolatile CR fields.
> > + bool IsCRField = PPC::CR2 <= Reg && Reg <= PPC::CR4;
> > +
> > + if (CRSpilled && IsCRField)
> > + continue;
> > +
> > + // Add the callee-saved register as live-in; it's killed at
> > the spill.
> > + MBB.addLiveIn(Reg);
> > +
> > + // Insert the spill to the stack frame.
> > + if (IsCRField) {
> > + CRSpilled = true;
> > + // The first time we see a CR field, store the whole CR into
> > the
> > + // save slot via GPR12 (available in the prolog for 32- and
> > 64-bit).
> > + if (Subtarget.isPPC64()) {
> > + // 64-bit: SP+8
> > + MBB.insert(MI, BuildMI(*MF, DL, TII.get(PPC::MFCR),
> > PPC::X12));
> > + MBB.insert(MI, BuildMI(*MF, DL, TII.get(PPC::STW))
> > + .addReg(PPC::X12,
> > + getKillRegState(true))
> > + .addImm(8)
> > + .addReg(PPC::X1));
> > + } else {
> > + // 32-bit: FP-relative. Note that we made sure CR2-CR4
> > all have
> > + // the same frame index in
> > PPCRegisterInfo::hasReservedSpillSlot.
> > + MBB.insert(MI, BuildMI(*MF, DL, TII.get(PPC::MFCR),
> > PPC::R12));
> > + MBB.insert(MI, addFrameReference(BuildMI(*MF, DL,
> > TII.get(PPC::STW))
> > + .addReg(PPC::R12,
> > +
> > getKillRegState(true)),
> > + CSI[i].getFrameIdx()));
> > + }
> > +
> > + // Record that we spill the CR in this function.
> > + PPCFunctionInfo *FuncInfo = MF->getInfo<PPCFunctionInfo>();
> > + FuncInfo->setSpillsCR();
> > + } else {
> > + const TargetRegisterClass *RC =
> > TRI->getMinimalPhysRegClass(Reg);
> > + TII.storeRegToStackSlot(MBB, MI, Reg, true,
> > + CSI[i].getFrameIdx(), RC, TRI);
> > + }
> > + }
> > + return true;
> > +}
> > +
> > +static void
> > +restoreCRs(bool isPPC64, bool CR2Spilled, bool CR3Spilled, bool
> > CR4Spilled,
> > + MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
> > + const std::vector<CalleeSavedInfo> &CSI, unsigned
> > CSIIndex) { +
> > + MachineFunction *MF = MBB.getParent();
> > + const PPCInstrInfo &TII =
> > + *static_cast<const
> > PPCInstrInfo*>(MF->getTarget().getInstrInfo());
> > + DebugLoc DL;
> > + unsigned RestoreOp, MoveReg;
> > +
> > + if (isPPC64) {
> > + // 64-bit: SP+8
> > + MBB.insert(MI, BuildMI(*MF, DL, TII.get(PPC::LWZ), PPC::X12)
> > + .addImm(8)
> > + .addReg(PPC::X1));
> > + RestoreOp = PPC::MTCRF8;
> > + MoveReg = PPC::X12;
> > + } else {
> > + // 32-bit: FP-relative
> > + MBB.insert(MI, addFrameReference(BuildMI(*MF, DL,
> > TII.get(PPC::LWZ),
> > + PPC::R12),
> > + CSI[CSIIndex].getFrameIdx()));
> > + RestoreOp = PPC::MTCRF;
> > + MoveReg = PPC::R12;
> > + }
> > +
> > + if (CR2Spilled)
> > + MBB.insert(MI, BuildMI(*MF, DL, TII.get(RestoreOp), PPC::CR2)
> > + .addReg(MoveReg));
> > +
> > + if (CR3Spilled)
> > + MBB.insert(MI, BuildMI(*MF, DL, TII.get(RestoreOp), PPC::CR3)
> > + .addReg(MoveReg));
> > +
> > + if (CR4Spilled)
> > + MBB.insert(MI, BuildMI(*MF, DL, TII.get(RestoreOp), PPC::CR4)
> > + .addReg(MoveReg));
> > +}
> > +
> > +bool
> > +PPCFrameLowering::restoreCalleeSavedRegisters(MachineBasicBlock
> > &MBB,
> > +
> > MachineBasicBlock::iterator MI,
> > + const
> > std::vector<CalleeSavedInfo> &CSI,
> > + const TargetRegisterInfo
> > *TRI) const { +
> > + // Currently, this function only handles SVR4 32- and 64-bit
> > ABIs.
> > + // Return false otherwise to maintain pre-existing behavior.
> > + if (!Subtarget.isSVR4ABI())
> > + return false;
> > +
> > + MachineFunction *MF = MBB.getParent();
> > + const PPCInstrInfo &TII =
> > + *static_cast<const
> > PPCInstrInfo*>(MF->getTarget().getInstrInfo());
> > + bool CR2Spilled = false;
> > + bool CR3Spilled = false;
> > + bool CR4Spilled = false;
> > + unsigned CSIIndex = 0;
> > +
> > + // Initialize insertion-point logic; we will be restoring in
> > reverse
> > + // order of spill.
> > + MachineBasicBlock::iterator I = MI, BeforeI = I;
> > + bool AtStart = I == MBB.begin();
> > +
> > + if (!AtStart)
> > + --BeforeI;
> > +
> > + for (unsigned i = 0, e = CSI.size(); i != e; ++i) {
> > + unsigned Reg = CSI[i].getReg();
> > +
> > + if (Reg == PPC::CR2) {
> > + CR2Spilled = true;
> > + // The spill slot is associated only with CR2, which is the
> > + // first nonvolatile spilled. Save it here.
> > + CSIIndex = i;
> > + continue;
> > + } else if (Reg == PPC::CR3) {
> > + CR3Spilled = true;
> > + continue;
> > + } else if (Reg == PPC::CR4) {
> > + CR4Spilled = true;
> > + continue;
> > + } else {
> > + // When we first encounter a non-CR register after seeing at
> > + // least one CR register, restore all spilled CRs together.
> > + if ((CR2Spilled || CR3Spilled || CR4Spilled)
> > + && !(PPC::CR2 <= Reg && Reg <= PPC::CR4)) {
> > + restoreCRs(Subtarget.isPPC64(), CR2Spilled, CR3Spilled,
> > CR4Spilled,
> > + MBB, I, CSI, CSIIndex);
> > + CR2Spilled = CR3Spilled = CR4Spilled = false;
> > + }
> > +
> > + // Default behavior for non-CR saves.
> > + const TargetRegisterClass *RC =
> > TRI->getMinimalPhysRegClass(Reg);
> > + TII.loadRegFromStackSlot(MBB, I, Reg, CSI[i].getFrameIdx(),
> > + RC, TRI);
> > + assert(I != MBB.begin() &&
> > + "loadRegFromStackSlot didn't insert any code!");
> > + }
> > +
> > + // Insert in reverse order.
> > + if (AtStart)
> > + I = MBB.begin();
> > + else {
> > + I = BeforeI;
> > + ++I;
> > + }
> > + }
> > +
> > + // If we haven't yet spilled the CRs, do so now.
> > + if (CR2Spilled || CR3Spilled || CR4Spilled)
> > + restoreCRs(Subtarget.isPPC64(), CR2Spilled, CR3Spilled,
> > CR4Spilled,
> > + MBB, I, CSI, CSIIndex);
> > +
> > + return true;
> > +}
> > +
>
--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
More information about the llvm-commits
mailing list