[llvm-commits] [PATCH, RFC] Fix PowerPC calling convention for saving nonvolatile condition register fields

Hal Finkel hfinkel at anl.gov
Tue Sep 11 14:46:46 PDT 2012


On Tue, 11 Sep 2012 23:34:51 +0200
Roman Divacky <rdivacky at freebsd.org> wrote:

> LGTM.
> 
> This fixes PR13708 and partially PR13623. It also lets us enable
> exceptions on PPC64 (that are currently disabled because we mishandle
> CR2).
> 
> It survives selfhost and passes all tests. On Linux even all EH tests
> from libcxxrt.
> 
> Hal, if you don't have anything against it I would like to commit this.
> Along with the exception enable.

Due diligence having been performed, please commit! :)

 -Hal

> 
> Thank you, Roman
> 
> On Tue, Sep 11, 2012 at 02:55:01PM -0500, William J. Schmidt wrote:
> > I received some offline comments from Roman.  Here's a new version
> > that addresses those, as well as fixing a bug encountered with the
> > revised test cases.  Summary of the changes:
> > 
> >  * The four test cases are merged into one file now.
> > 
> >  * CRSpillFrameIdx was changed from a static variable to a mutable
> > class variable so that reinitialization for each function can
> > properly occur.
> > 
> >  * Removed bogus early return from spillCalleeSavedRegisters.
> > 
> >  * Minor stylistic clean-ups and commentary.
> > 
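
The CRSpillFrameIdx point can be illustrated outside LLVM.  A cached slot index kept in a static variable survives from one function to the next, while a mutable member can be reset per function even from a const method (the patch does this reset in getCalleeSavedRegs for 32-bit SVR4).  The following is only a minimal sketch of that caching pattern: RegInfoModel, beginFunction, crSpillSlot and NextFixedIdx are hypothetical names, not LLVM API, and the negative index numbering is an assumption of the model.

```cpp
#include <cassert>

// Hypothetical stand-in for PPCRegisterInfo; only the caching pattern
// is modeled here, not real frame layout.
class RegInfoModel {
  mutable int CRSpillFrameIdx = 0;  // 0 means "no CR slot created yet"
  mutable int NextFixedIdx = -1;    // assumption: new slots get -1, -2, ...

public:
  // Called once per function before any spill slot is queried.
  void beginFunction() const { CRSpillFrameIdx = 0; }

  // Returns the single slot shared by CR2, CR3 and CR4, creating it on
  // first use within the current function.
  int crSpillSlot() const {
    if (CRSpillFrameIdx == 0)
      CRSpillFrameIdx = NextFixedIdx--;
    assert(CRSpillFrameIdx != 0 && "slot must exist after creation");
    return CRSpillFrameIdx;
  }
};
```

Without the per-function reset, the second function compiled would reuse the first function's index instead of creating its own slot.
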
> > Thanks,
> > Bill
> > 
> > On Tue, 2012-09-11 at 09:46 -0500, William J. Schmidt wrote:
> > > This patch corrects logic in PPCFrameLowering for save and
> > > restore of nonvolatile condition register fields across calls
> > > under the SVR4 ABIs.
> > > 
> > >  * With the 64-bit ABI, the save location is at a fixed offset of
> > > 8 from the stack pointer.  The frame pointer cannot be used to
> > > access this portion of the stack frame since the distance from
> > > the frame pointer may change with alloca calls.
> > > 
> > >  * With the 32-bit ABI, the save location is just below the
> > > general register save area, and is accessed via the frame pointer
> > > > like the rest of the save areas.  This is an optional slot, so it
> > > > must only be created if any of CR2, CR3, or CR4 was modified.
> > > 
> > > >  * For both ABIs, save/restore logic is generated only if at least
> > > > one of the nonvolatile CR fields was modified.
> > > 
> > > I added tests for both ABIs that demonstrate the code now works
> > > correctly.  The extra call to "foo" in the tests is there to force
> > > creation of a stack frame.
> > > 
> > > I also took this opportunity to clean up an extra FIXME in
> > > PPCFrameLowering.h.  Save area offsets for 32-bit GPRs are
> > > meaningless for the 64-bit ABI, so I removed them for correctness
> > > and efficiency.
> > > 
> > > I haven't yet requested commit authority, so I would appreciate
> > > it if someone could please commit these changes following the
> > > review period.
> > > 
> > > Thanks!
> > > Bill
> > > 
> 
> > Index: test/CodeGen/PowerPC/crsave.ll
> > ===================================================================
> > --- test/CodeGen/PowerPC/crsave.ll	(revision 0)
> > +++ test/CodeGen/PowerPC/crsave.ll	(revision 0)
> > @@ -0,0 +1,49 @@
> > +; RUN: llc -O0 -disable-fp-elim -mtriple=powerpc-unknown-linux-gnu < %s | FileCheck %s -check-prefix=PPC32
> > +; RUN: llc -O0 -disable-fp-elim -mtriple=powerpc64-unknown-linux-gnu < %s | FileCheck %s -check-prefix=PPC64
> > +
> > +declare void @foo()
> > +
> > +define i32 @test_cr2() nounwind {
> > +entry:
> > +  %ret = alloca i32, align 4
> > +  %0 = call i32 asm sideeffect "\0A\09mtcr $4\0A\09cmp 2,$2,$1\0A\09mfcr $0", "=r,r,r,r,r,~{cr2}"(i32 1, i32 2, i32 3, i32 0) nounwind
> > +  store i32 %0, i32* %ret, align 4
> > +  call void @foo()
> > +  %1 = load i32* %ret, align 4
> > +  ret i32 %1
> > +}
> > +
> > +; PPC32: mfcr 12
> > +; PPC32-NEXT: stw 12, {{[0-9]+}}(31)
> > +; PPC32: lwz 12, {{[0-9]+}}(31)
> > +; PPC32-NEXT: mtcrf 32, 12
> > +
> > +; PPC64: mfcr 12
> > +; PPC64-NEXT: stw 12, 8(1)
> > +; PPC64: lwz 12, 8(1)
> > +; PPC64-NEXT: mtcrf 32, 12
> > +
> > +define i32 @test_cr234() nounwind {
> > +entry:
> > +  %ret = alloca i32, align 4
> > +  %0 = call i32 asm sideeffect "\0A\09mtcr $4\0A\09cmp 2,$2,$1\0A\09cmp 3,$2,$2\0A\09cmp 4,$2,$3\0A\09mfcr $0", "=r,r,r,r,r,~{cr2},~{cr3},~{cr4}"(i32 1, i32 2, i32 3, i32 0) nounwind
> > +  store i32 %0, i32* %ret, align 4
> > +  call void @foo()
> > +  %1 = load i32* %ret, align 4
> > +  ret i32 %1
> > +}
> > +
> > +; PPC32: mfcr 12
> > +; PPC32-NEXT: stw 12, {{[0-9]+}}(31)
> > +; PPC32: lwz 12, {{[0-9]+}}(31)
> > +; PPC32-NEXT: mtcrf 32, 12
> > +; PPC32-NEXT: mtcrf 16, 12
> > +; PPC32-NEXT: mtcrf 8, 12
> > +
> > +; PPC64: mfcr 12
> > +; PPC64-NEXT: stw 12, 8(1)
> > +; PPC64: lwz 12, 8(1)
> > +; PPC64-NEXT: mtcrf 32, 12
> > +; PPC64-NEXT: mtcrf 16, 12
> > +; PPC64-NEXT: mtcrf 8, 12
> > +
> > Index: lib/Target/PowerPC/PPCRegisterInfo.h
> > ===================================================================
> > --- lib/Target/PowerPC/PPCRegisterInfo.h	(revision 163540)
> > +++ lib/Target/PowerPC/PPCRegisterInfo.h	(working copy)
> > @@ -30,6 +30,7 @@ class PPCRegisterInfo : public PPCGenRegisterInfo
> >    std::map<unsigned, unsigned> ImmToIdxMap;
> >    const PPCSubtarget &Subtarget;
> >    const TargetInstrInfo &TII;
> > +  mutable int CRSpillFrameIdx;
> >  public:
> >    PPCRegisterInfo(const PPCSubtarget &SubTarget, const TargetInstrInfo &tii);
> >  
> > @@ -65,6 +66,8 @@ class PPCRegisterInfo : public PPCGenRegisterInfo
> >                         int SPAdj, RegScavenger *RS) const;
> >    void lowerCRRestore(MachineBasicBlock::iterator II, unsigned FrameIndex,
> >                        int SPAdj, RegScavenger *RS) const;
> > +  bool hasReservedSpillSlot(const MachineFunction &MF, unsigned Reg,
> > +			    int &FrameIdx) const;
> >    void eliminateFrameIndex(MachineBasicBlock::iterator II,
> >                             int SPAdj, RegScavenger *RS = NULL) const;
> >  
> > Index: lib/Target/PowerPC/PPCFrameLowering.h
> > ===================================================================
> > --- lib/Target/PowerPC/PPCFrameLowering.h	(revision 163540)
> > +++ lib/Target/PowerPC/PPCFrameLowering.h	(working copy)
> > @@ -45,6 +45,16 @@ class PPCFrameLowering: public TargetFrameLowering
> >                                              RegScavenger *RS = NULL) const;
> >    void processFunctionBeforeFrameFinalized(MachineFunction &MF) const;
> >  
> > +  bool spillCalleeSavedRegisters(MachineBasicBlock &MBB,
> > +                                 MachineBasicBlock::iterator MI,
> > +                                 const std::vector<CalleeSavedInfo> &CSI,
> > +                                 const TargetRegisterInfo *TRI) const;
> > +
> > +  bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB,
> > +                                   MachineBasicBlock::iterator MI,
> > +                                   const std::vector<CalleeSavedInfo> &CSI,
> > +                                   const TargetRegisterInfo *TRI) const;
> > +
> >    /// targetHandlesStackFrameRounding - Returns true if the target is
> >    /// responsible for rounding up the stack frame (probably at emitPrologue
> >    /// time).
> > @@ -170,23 +180,11 @@ class PPCFrameLowering: public TargetFrameLowering
> >        {PPC::R15, -68},
> >        {PPC::R14, -72},
> >  
> > -      // CR save area offset.
> > -      // FIXME SVR4: Disable CR save area for now.
> > -//      {PPC::CR2, -4},
> > -//      {PPC::CR3, -4},
> > -//      {PPC::CR4, -4},
> > -//      {PPC::CR2LT, -4},
> > -//      {PPC::CR2GT, -4},
> > -//      {PPC::CR2EQ, -4},
> > -//      {PPC::CR2UN, -4},
> > -//      {PPC::CR3LT, -4},
> > -//      {PPC::CR3GT, -4},
> > -//      {PPC::CR3EQ, -4},
> > -//      {PPC::CR3UN, -4},
> > -//      {PPC::CR4LT, -4},
> > -//      {PPC::CR4GT, -4},
> > -//      {PPC::CR4EQ, -4},
> > -//      {PPC::CR4UN, -4},
> > +      // CR save area offset.  We map each of the nonvolatile CR fields
> > +      // to the slot for CR2, which is the first of the nonvolatile CR
> > +      // fields to be assigned, so that we only allocate one save slot.
> > +      // See PPCRegisterInfo::hasReservedSpillSlot() for more information.
> > +      {PPC::CR2, -4},
> >  
> >        // VRSAVE save area offset.
> >        {PPC::VRSAVE, -4},
> > @@ -228,27 +226,6 @@ class PPCFrameLowering: public TargetFrameLowering
> >        {PPC::F14, -144},
> >  
> >        // General register save area offsets.
> > -      // FIXME 64-bit SVR4: Are 32-bit registers actually allocated in 64-bit
> > -      //                    mode?
> > -      {PPC::R31, -4},
> > -      {PPC::R30, -12},
> > -      {PPC::R29, -20},
> > -      {PPC::R28, -28},
> > -      {PPC::R27, -36},
> > -      {PPC::R26, -44},
> > -      {PPC::R25, -52},
> > -      {PPC::R24, -60},
> > -      {PPC::R23, -68},
> > -      {PPC::R22, -76},
> > -      {PPC::R21, -84},
> > -      {PPC::R20, -92},
> > -      {PPC::R19, -100},
> > -      {PPC::R18, -108},
> > -      {PPC::R17, -116},
> > -      {PPC::R16, -124},
> > -      {PPC::R15, -132},
> > -      {PPC::R14, -140},
> > -
> >        {PPC::X31, -8},
> >        {PPC::X30, -16},
> >        {PPC::X29, -24},
> > @@ -268,24 +245,6 @@ class PPCFrameLowering: public TargetFrameLowering
> >        {PPC::X15, -136},
> >        {PPC::X14, -144},
> >  
> > -      // CR save area offset.
> > -      // FIXME SVR4: Disable CR save area for now.
> > -//      {PPC::CR2, -4},
> > -//      {PPC::CR3, -4},
> > -//      {PPC::CR4, -4},
> > -//      {PPC::CR2LT, -4},
> > -//      {PPC::CR2GT, -4},
> > -//      {PPC::CR2EQ, -4},
> > -//      {PPC::CR2UN, -4},
> > -//      {PPC::CR3LT, -4},
> > -//      {PPC::CR3GT, -4},
> > -//      {PPC::CR3EQ, -4},
> > -//      {PPC::CR3UN, -4},
> > -//      {PPC::CR4LT, -4},
> > -//      {PPC::CR4GT, -4},
> > -//      {PPC::CR4EQ, -4},
> > -//      {PPC::CR4UN, -4},
> > -
> >        // VRSAVE save area offset.
> >        {PPC::VRSAVE, -4},
> >  
> > Index: lib/Target/PowerPC/PPCRegisterInfo.cpp
> > ===================================================================
> > --- lib/Target/PowerPC/PPCRegisterInfo.cpp	(revision 163540)
> > +++ lib/Target/PowerPC/PPCRegisterInfo.cpp	(working copy)
> > @@ -71,7 +71,7 @@ PPCRegisterInfo::PPCRegisterInfo(const PPCSubtarge
> >    : PPCGenRegisterInfo(ST.isPPC64() ? PPC::LR8 : PPC::LR,
> >                         ST.isPPC64() ? 0 : 1,
> >                         ST.isPPC64() ? 0 : 1),
> > -    Subtarget(ST), TII(tii) {
> > +    Subtarget(ST), TII(tii), CRSpillFrameIdx(0) {
> >    ImmToIdxMap[PPC::LD]   = PPC::LDX;    ImmToIdxMap[PPC::STD]  =
> > PPC::STDX; ImmToIdxMap[PPC::LBZ]  = PPC::LBZX;
> > ImmToIdxMap[PPC::STB]  = PPC::STBX; ImmToIdxMap[PPC::LHZ]  =
> > PPC::LHZX;   ImmToIdxMap[PPC::LHA]  = PPC::LHAX; @@ -111,6 +111,11
> > @@ PPCRegisterInfo::getCalleeSavedRegs(const MachineF return
> > Subtarget.isPPC64() ? CSR_Darwin64_SaveList : CSR_Darwin32_SaveList;
> >  
> > +  // For 32-bit SVR4, also initialize the frame index associated with
> > +  // the CR spill slot.
> > +  if (!Subtarget.isPPC64())
> > +    CRSpillFrameIdx = 0;
> > +
> >    return Subtarget.isPPC64() ? CSR_SVR464_SaveList : CSR_SVR432_SaveList;
> >  }
> >  
> > @@ -477,6 +482,31 @@ void PPCRegisterInfo::lowerCRRestore(MachineBasicB
> >    MBB.erase(II);
> >  }
> >  
> > +bool
> > +PPCRegisterInfo::hasReservedSpillSlot(const MachineFunction &MF,
> > +				      unsigned Reg, int &FrameIdx) const {
> > +
> > +  // For the nonvolatile condition registers (CR2, CR3, CR4) in an SVR4
> > +  // ABI, return true to prevent allocating an additional frame slot.
> > +  // For 64-bit, the CR save area is at SP+8; the value of FrameIdx = 0
> > +  // is arbitrary and will be subsequently ignored.  For 32-bit, we must
> > +  // create exactly one stack slot and return its FrameIdx for all
> > +  // nonvolatiles.
> > +  if (Subtarget.isSVR4ABI() && PPC::CR2 <= Reg && Reg <= PPC::CR4) {
> > +    if (Subtarget.isPPC64()) {
> > +      FrameIdx = 0;
> > +    } else if (CRSpillFrameIdx) {
> > +      FrameIdx = CRSpillFrameIdx;
> > +    } else {
> > +      MachineFrameInfo *MFI = ((MachineFunction &)MF).getFrameInfo();
> > +      FrameIdx = MFI->CreateFixedObject((uint64_t)4, (int64_t)-4, true);
> > +      CRSpillFrameIdx = FrameIdx;
> > +    }
> > +    return true;
> > +  }
> > +  return false;
> > +}
> > +
> >  void
> >  PPCRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator II,
> >                                       int SPAdj, RegScavenger *RS) const {
> > Index: lib/Target/PowerPC/PPCFrameLowering.cpp
> > ===================================================================
> > --- lib/Target/PowerPC/PPCFrameLowering.cpp	(revision 163540)
> > +++ lib/Target/PowerPC/PPCFrameLowering.cpp	(working copy)
> > @@ -13,6 +13,7 @@
> >  
> >  #include "PPCFrameLowering.h"
> >  #include "PPCInstrInfo.h"
> > +#include "PPCInstrBuilder.h"
> >  #include "PPCMachineFunctionInfo.h"
> >  #include "llvm/Function.h"
> >  #include "llvm/CodeGen/MachineFrameInfo.h"
> > @@ -168,6 +169,11 @@ static void HandleVRSaveUpdate(MachineInstr *MI, c
> >    MI->eraseFromParent();
> >  }
> >  
> > +static bool spillsCR(const MachineFunction &MF) {
> > +  const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
> > +  return FuncInfo->isCRSpilled();
> > +}
> > +
> >  /// determineFrameLayout - Determine the size of the frame and maximum call
> >  /// frame size.
> >  void PPCFrameLowering::determineFrameLayout(MachineFunction &MF) const {
> > @@ -184,13 +190,21 @@ void PPCFrameLowering::determineFrameLayout(Machin
> >  
> >    // If we are a leaf function, and use up to 224 bytes of stack space,
> >    // don't have a frame pointer, calls, or dynamic alloca then we do not need
> > -  // to adjust the stack pointer (we fit in the Red Zone).
> > +  // to adjust the stack pointer (we fit in the Red Zone).  For 64-bit
> > +  // SVR4, we also require a stack frame if we need to spill the CR,
> > +  // since this spill area is addressed relative to the stack pointer.
> >    bool DisableRedZone = MF.getFunction()->hasFnAttr(Attribute::NoRedZone);
> > -  // FIXME SVR4 The 32-bit SVR4 ABI has no red zone.
> > +  // FIXME SVR4 The 32-bit SVR4 ABI has no red zone.  However, it can
> > +  // still generate stackless code if all local vars are reg-allocated.
> > +  // Try: (FrameSize <= 224
> > +  //       || (FrameSize == 0 && Subtarget.isPPC32 && Subtarget.isSVR4ABI()))
> >    if (!DisableRedZone &&
> >        FrameSize <= 224 &&                          // Fits in red zone.
> >        !MFI->hasVarSizedObjects() &&                // No dynamic alloca.
> >        !MFI->adjustsStack() &&                      // No calls.
> > +      !(Subtarget.isPPC64() &&                     // No 64-bit SVR4 CRsave.
> > +	Subtarget.isSVR4ABI()
> > +	&& spillsCR(MF)) &&
> >        (!ALIGN_STACK || MaxAlign <= TargetAlign)) { // No special alignment.
> >      // No need for frame
> >      MFI->setStackSize(0);
> > @@ -488,7 +502,6 @@ void PPCFrameLowering::emitPrologue(MachineFunctio
> >      // Add callee saved registers to move list.
> >      const std::vector<CalleeSavedInfo> &CSI = MFI->getCalleeSavedInfo();
> >      for (unsigned I = 0, E = CSI.size(); I != E; ++I) {
> > -      int Offset = MFI->getObjectOffset(CSI[I].getFrameIdx());
> >        unsigned Reg = CSI[I].getReg();
> >        if (Reg == PPC::LR || Reg == PPC::LR8 || Reg == PPC::RM) continue;
> >  
> > @@ -497,6 +510,25 @@ void PPCFrameLowering::emitPrologue(MachineFunctio
> >        if (PPC::CRBITRCRegClass.contains(Reg))
> >          continue;
> >  
> > +      // For SVR4, don't emit a move for the CR spill slot if we haven't
> > +      // spilled CRs.
> > +      if (Subtarget.isSVR4ABI()
> > +	  && (PPC::CR2 <= Reg && Reg <= PPC::CR4)
> > +	  && !spillsCR(MF))
> > +	continue;
> > +
> > +      // For 64-bit SVR4 when we have spilled CRs, the spill location
> > +      // is SP+8, not a frame-relative slot.
> > +      if (Subtarget.isSVR4ABI()
> > +	  && Subtarget.isPPC64()
> > +	  && (PPC::CR2 <= Reg && Reg <= PPC::CR4)) {
> > +	MachineLocation CSDst(PPC::X1, 8);
> > +	MachineLocation CSSrc(PPC::CR2);
> > +	Moves.push_back(MachineMove(Label, CSDst, CSSrc));
> > +	continue;
> > +      }
> > +
> > +      int Offset = MFI->getObjectOffset(CSI[I].getFrameIdx());
> >        MachineLocation CSDst(MachineLocation::VirtualFP, Offset);
> >        MachineLocation CSSrc(Reg);
> >        Moves.push_back(MachineMove(Label, CSDst, CSSrc));
> > @@ -714,11 +746,6 @@ void PPCFrameLowering::emitEpilogue(MachineFunctio
> >    }
> >  }
> >  
> > -static bool spillsCR(const MachineFunction &MF) {
> > -  const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
> > -  return FuncInfo->isCRSpilled();
> > -}
> > -
> >  /// MustSaveLR - Return true if this function requires that we save the LR
> >  /// register onto the stack in the prolog and restore it in the epilog of the
> >  /// function.
> > @@ -808,7 +835,6 @@ void PPCFrameLowering::processFunctionBeforeFrameF
> >    bool HasGPSaveArea = false;
> >    bool HasG8SaveArea = false;
> >    bool HasFPSaveArea = false;
> > -  bool HasCRSaveArea = false;
> >    bool HasVRSAVESaveArea = false;
> >    bool HasVRSaveArea = false;
> >  
> > @@ -843,10 +869,9 @@ void PPCFrameLowering::processFunctionBeforeFrameF
> >        if (Reg < MinFPR) {
> >          MinFPR = Reg;
> >        }
> > -// FIXME SVR4: Disable CR save area for now.
> >      } else if (PPC::CRBITRCRegClass.contains(Reg) ||
> >                 PPC::CRRCRegClass.contains(Reg)) {
> > -//      HasCRSaveArea = true;
> > +      ; // do nothing, as we already know whether CRs are spilled
> >      } else if (PPC::VRSAVERCRegClass.contains(Reg)) {
> >        HasVRSAVESaveArea = true;
> >      } else if (PPC::VRRCRegClass.contains(Reg)) {
> > @@ -926,16 +951,21 @@ void PPCFrameLowering::processFunctionBeforeFrameF
> >      }
> >    }
> >  
> > -  // The CR save area is below the general register save area.
> > -  if (HasCRSaveArea) {
> > -    // FIXME SVR4: Is it actually possible to have multiple elements in CSI
> > -    //             which have the CR/CRBIT register class?
> > +  // For 32-bit only, the CR save area is below the general register
> > +  // save area.  For 64-bit SVR4, the CR save area is addressed relative
> > +  // to the stack pointer and hence does not need an adjustment here.
> > +  // Only CR2 (the first nonvolatile spilled) has an associated frame
> > +  // index so that we have a single uniform save area.
> > +  if (spillsCR(MF) && !(Subtarget.isPPC64() && Subtarget.isSVR4ABI())) {
> >      // Adjust the frame index of the CR spill slot.
> >      for (unsigned i = 0, e = CSI.size(); i != e; ++i) {
> >        unsigned Reg = CSI[i].getReg();
> >  
> > -      if (PPC::CRBITRCRegClass.contains(Reg) ||
> > -          PPC::CRRCRegClass.contains(Reg)) {
> > +      if ((Subtarget.isSVR4ABI() && Reg == PPC::CR2)
> > +	  // Leave Darwin logic as-is.
> > +	  || (!Subtarget.isSVR4ABI() &&
> > +	      (PPC::CRBITRCRegClass.contains(Reg) ||
> > +	       PPC::CRRCRegClass.contains(Reg)))) {
> >          int FI = CSI[i].getFrameIdx();
> >  
> >          FFI->setObjectOffset(FI, LowerBound + FFI->getObjectOffset(FI));
> > @@ -973,3 +1003,184 @@ void PPCFrameLowering::processFunctionBeforeFrameF
> >      }
> >    }
> >  }
> > +
> > +bool
> > +PPCFrameLowering::spillCalleeSavedRegisters(MachineBasicBlock &MBB,
> > +				     MachineBasicBlock::iterator MI,
> > +				     const std::vector<CalleeSavedInfo> &CSI,
> > +				     const TargetRegisterInfo *TRI) const {
> > +
> > +  // Currently, this function only handles SVR4 32- and 64-bit ABIs.
> > +  // Return false otherwise to maintain pre-existing behavior.
> > +  if (!Subtarget.isSVR4ABI())
> > +    return false;
> > +
> > +  MachineFunction *MF = MBB.getParent();
> > +  const PPCInstrInfo &TII =
> > +    *static_cast<const PPCInstrInfo*>(MF->getTarget().getInstrInfo());
> > +  DebugLoc DL;
> > +  bool CRSpilled = false;
> > +  
> > +  for (unsigned i = 0, e = CSI.size(); i != e; ++i) {
> > +    unsigned Reg = CSI[i].getReg();
> > +    // CR2 through CR4 are the nonvolatile CR fields.
> > +    bool IsCRField = PPC::CR2 <= Reg && Reg <= PPC::CR4;
> > +
> > +    if (CRSpilled && IsCRField)
> > +      continue;
> > +
> > +    // Add the callee-saved register as live-in; it's killed at the spill.
> > +    MBB.addLiveIn(Reg);
> > +
> > +    // Insert the spill to the stack frame.
> > +    if (IsCRField) {
> > +      CRSpilled = true;
> > +      // The first time we see a CR field, store the whole CR into the
> > +      // save slot via GPR12 (available in the prolog for 32- and 64-bit).
> > +      if (Subtarget.isPPC64()) {
> > +	// 64-bit:  SP+8
> > +	MBB.insert(MI, BuildMI(*MF, DL, TII.get(PPC::MFCR), PPC::X12));
> > +	MBB.insert(MI, BuildMI(*MF, DL, TII.get(PPC::STW))
> > +			       .addReg(PPC::X12,
> > +				       getKillRegState(true))
> > +			       .addImm(8)
> > +			       .addReg(PPC::X1));
> > +      } else {
> > +	// 32-bit:  FP-relative.  Note that we made sure CR2-CR4 all have
> > +	// the same frame index in PPCRegisterInfo::hasReservedSpillSlot.
> > +	MBB.insert(MI, BuildMI(*MF, DL, TII.get(PPC::MFCR), PPC::R12));
> > +	MBB.insert(MI, addFrameReference(BuildMI(*MF, DL, TII.get(PPC::STW))
> > +					 .addReg(PPC::R12,
> > +						 getKillRegState(true)),
> > +					 CSI[i].getFrameIdx()));
> > +      }
> > +      
> > +      // Record that we spill the CR in this function.
> > +      PPCFunctionInfo *FuncInfo = MF->getInfo<PPCFunctionInfo>();
> > +      FuncInfo->setSpillsCR();
> > +    } else {
> > +      const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
> > +      TII.storeRegToStackSlot(MBB, MI, Reg, true,
> > +			      CSI[i].getFrameIdx(), RC, TRI);
> > +    }
> > +  }
> > +  return true;
> > +}
> > +
> > +static void
> > +restoreCRs(bool isPPC64, bool CR2Spilled, bool CR3Spilled, bool CR4Spilled,
> > +	   MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
> > +	   const std::vector<CalleeSavedInfo> &CSI, unsigned CSIIndex) {
> > +
> > +  MachineFunction *MF = MBB.getParent();
> > +  const PPCInstrInfo &TII =
> > +    *static_cast<const PPCInstrInfo*>(MF->getTarget().getInstrInfo());
> > +  DebugLoc DL;
> > +  unsigned RestoreOp, MoveReg;
> > +
> > +  if (isPPC64) {
> > +    // 64-bit:  SP+8
> > +    MBB.insert(MI, BuildMI(*MF, DL, TII.get(PPC::LWZ), PPC::X12)
> > +	       .addImm(8)
> > +	       .addReg(PPC::X1));
> > +    RestoreOp = PPC::MTCRF8;
> > +    MoveReg = PPC::X12;
> > +  } else {
> > +    // 32-bit:  FP-relative
> > +    MBB.insert(MI, addFrameReference(BuildMI(*MF, DL, TII.get(PPC::LWZ),
> > +					     PPC::R12),
> > +				     CSI[CSIIndex].getFrameIdx()));
> > +    RestoreOp = PPC::MTCRF;
> > +    MoveReg = PPC::R12;
> > +  }
> > +  
> > +  if (CR2Spilled)
> > +    MBB.insert(MI, BuildMI(*MF, DL, TII.get(RestoreOp), PPC::CR2)
> > +	       .addReg(MoveReg));
> > +
> > +  if (CR3Spilled)
> > +    MBB.insert(MI, BuildMI(*MF, DL, TII.get(RestoreOp), PPC::CR3)
> > +	       .addReg(MoveReg));
> > +
> > +  if (CR4Spilled)
> > +    MBB.insert(MI, BuildMI(*MF, DL, TII.get(RestoreOp), PPC::CR4)
> > +	       .addReg(MoveReg));
> > +}
> > +
> > +bool
> > +PPCFrameLowering::restoreCalleeSavedRegisters(MachineBasicBlock &MBB,
> > +					MachineBasicBlock::iterator MI,
> > +				        const std::vector<CalleeSavedInfo> &CSI,
> > +					const TargetRegisterInfo *TRI) const {
> > +
> > +  // Currently, this function only handles SVR4 32- and 64-bit ABIs.
> > +  // Return false otherwise to maintain pre-existing behavior.
> > +  if (!Subtarget.isSVR4ABI())
> > +    return false;
> > +
> > +  MachineFunction *MF = MBB.getParent();
> > +  const PPCInstrInfo &TII =
> > +    *static_cast<const PPCInstrInfo*>(MF->getTarget().getInstrInfo());
> > +  bool CR2Spilled = false;
> > +  bool CR3Spilled = false;
> > +  bool CR4Spilled = false;
> > +  unsigned CSIIndex = 0;
> > +
> > +  // Initialize insertion-point logic; we will be restoring in reverse
> > +  // order of spill.
> > +  MachineBasicBlock::iterator I = MI, BeforeI = I;
> > +  bool AtStart = I == MBB.begin();
> > +
> > +  if (!AtStart)
> > +    --BeforeI;
> > +
> > +  for (unsigned i = 0, e = CSI.size(); i != e; ++i) {
> > +    unsigned Reg = CSI[i].getReg();
> > +
> > +    if (Reg == PPC::CR2) {
> > +      CR2Spilled = true;
> > +      // The spill slot is associated only with CR2, which is the
> > +      // first nonvolatile spilled.  Save it here.
> > +      CSIIndex = i;
> > +      continue;
> > +    } else if (Reg == PPC::CR3) {
> > +      CR3Spilled = true;
> > +      continue;
> > +    } else if (Reg == PPC::CR4) {
> > +      CR4Spilled = true;
> > +      continue;
> > +    } else {
> > +      // When we first encounter a non-CR register after seeing at
> > +      // least one CR register, restore all spilled CRs together.
> > +      if ((CR2Spilled || CR3Spilled || CR4Spilled)
> > +	  && !(PPC::CR2 <= Reg && Reg <= PPC::CR4)) {
> > +	restoreCRs(Subtarget.isPPC64(), CR2Spilled, CR3Spilled, CR4Spilled,
> > +		   MBB, I, CSI, CSIIndex);
> > +	CR2Spilled = CR3Spilled = CR4Spilled = false;
> > +      }
> > +
> > +      // Default behavior for non-CR saves.
> > +      const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
> > +      TII.loadRegFromStackSlot(MBB, I, Reg, CSI[i].getFrameIdx(),
> > +			       RC, TRI);
> > +      assert(I != MBB.begin() &&
> > +	     "loadRegFromStackSlot didn't insert any code!");
> > +      }
> > +
> > +    // Insert in reverse order.
> > +    if (AtStart)
> > +      I = MBB.begin();
> > +    else {
> > +      I = BeforeI;
> > +      ++I;
> > +    }	    
> > +  }
> > +
> > +  // If we haven't yet restored the CRs, do so now.
> > +  if (CR2Spilled || CR3Spilled || CR4Spilled)
> > +    restoreCRs(Subtarget.isPPC64(), CR2Spilled, CR3Spilled, CR4Spilled,
> > +	       MBB, I, CSI, CSIIndex);
> > +
> > +  return true;
> > +}
> > +
> 



-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory


