[llvm] r177666 - Implement builtin_{setjmp/longjmp} on PPC

Bill Schmidt wschmidt at linux.vnet.ibm.com
Mon Mar 25 19:17:07 PDT 2013


Hi Hal,

Sorry it took me so long to get to this.  I have a few comments on this
one, embedded below:

On Thu, 2013-03-21 at 21:37 +0000, Hal Finkel wrote:
> Author: hfinkel
> Date: Thu Mar 21 16:37:52 2013
> New Revision: 177666
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=177666&view=rev
> Log:
> Implement builtin_{setjmp/longjmp} on PPC
> 
> This implements SJLJ lowering on PPC, making the Clang functions
> __builtin_{setjmp/longjmp} functional on PPC platforms. The implementation
> strategy is similar to that on X86, with the exception that a branch-and-link
> variant is used to get the right jump address. Credit goes to Bill Schmidt for
> suggesting the use of the unconditional bcl form (instead of the regular bl
> instruction) to limit return-address-cache pollution.
> 
> Benchmarking the speed at -O3 of:
> 
> static jmp_buf env_sigill;
> 
> void foo() {
>                 __builtin_longjmp(env_sigill,1);
> }
> 
> main() {
> 	...
> 
>         for (int i = 0; i < c; ++i) {
>                 if (__builtin_setjmp(env_sigill)) {
>                         goto done;
>                 } else {
>                         foo();
>                 }
> 
> done:;
>         }
> 
> 	...
> }
> 
> vs. the same code using the libc setjmp/longjmp functions on a P7 shows that
> this builtin implementation is ~4x faster with Altivec enabled and ~7.25x
> faster with Altivec disabled. This comparison is somewhat unfair because the
> libc version must also save/restore the VSX registers which we don't yet
> support.
> 
> Added:
>     llvm/trunk/test/CodeGen/PowerPC/sjlj.ll
> Modified:
>     llvm/trunk/lib/Target/PowerPC/PPCCallingConv.td
>     llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
>     llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>     llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h
>     llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
>     llvm/trunk/lib/Target/PowerPC/PPCInstrFormats.td
>     llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td
>     llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp
>     llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.h
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCCallingConv.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCCallingConv.td?rev=177666&r1=177665&r2=177666&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCCallingConv.td (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCCallingConv.td Thu Mar 21 16:37:52 2013
> @@ -136,3 +136,8 @@ def CSR_SVR464   : CalleeSavedRegs<(add
>                                          F27, F28, F29, F30, F31, CR2, CR3, CR4,
>                                          V20, V21, V22, V23, V24, V25, V26, V27,
>                                          V28, V29, V30, V31)>;
> +
> +def CSR_NoRegs : CalleeSavedRegs<(add)>;
> +
> +def CSR_NoRegs_Altivec : CalleeSavedRegs<(add (sequence "V%u", 0, 31), VRSAVE)>;
> +
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp?rev=177666&r1=177665&r2=177666&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCISelDAGToDAG.cpp Thu Mar 21 16:37:52 2013
> @@ -164,6 +164,12 @@ namespace {
>        return PPCLowering.SelectAddressRegImmShift(N, Disp, Base, *CurDAG);
>      }
> 
> +    // Select an address into a single register.
> +    bool SelectAddr(SDValue N, SDValue &Base) {
> +      Base = N;
> +      return true;
> +    }
> +
>      /// SelectInlineAsmMemoryOperand - Implement addressing mode selection for
>      /// inline asm expressions.  It is always correct to compute the value into
>      /// a register.  The case of adding a (possibly relocatable) constant to a
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=177666&r1=177665&r2=177666&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Thu Mar 21 16:37:52 2013
> @@ -70,6 +70,7 @@ static TargetLoweringObjectFile *CreateT
>  PPCTargetLowering::PPCTargetLowering(PPCTargetMachine &TM)
>    : TargetLowering(TM, CreateTLOF(TM)), PPCSubTarget(*TM.getSubtargetImpl()) {
>    const PPCSubtarget *Subtarget = &TM.getSubtarget<PPCSubtarget>();
> +  PPCRegInfo = TM.getRegisterInfo();
> 
>    setPow2DivIsCheap();
> 
> @@ -211,6 +212,14 @@ PPCTargetLowering::PPCTargetLowering(PPC
>    setOperationAction(ISD::EXCEPTIONADDR, MVT::i32, Expand);
>    setOperationAction(ISD::EHSELECTION,   MVT::i32, Expand);
> 
> +  // NOTE: EH_SJLJ_SETJMP/_LONGJMP supported here is NOT intened to support

typo: intended.  (Thanks for the good explanatory note here.)

> +  // SjLj exception handling but a light-weight setjmp/longjmp replacement to
> +  // support continuation, user-level threading, and etc.. As a result, no
> +  // other SjLj exception interfaces are implemented and please don't build
> +  // your own exception handling based on them.
> +  // LLVM/Clang supports zero-cost DWARF exception handling.
> +  setOperationAction(ISD::EH_SJLJ_SETJMP, MVT::i32, Custom);
> +  setOperationAction(ISD::EH_SJLJ_LONGJMP, MVT::Other, Custom);
> 
>    // We want to legalize GlobalAddress and ConstantPool nodes into the
>    // appropriate instructions to materialize the address.
> @@ -567,6 +576,8 @@ const char *PPCTargetLowering::getTarget
>    case PPCISD::BCTRL_Darwin:    return "PPCISD::BCTRL_Darwin";
>    case PPCISD::BCTRL_SVR4:      return "PPCISD::BCTRL_SVR4";
>    case PPCISD::RET_FLAG:        return "PPCISD::RET_FLAG";
> +  case PPCISD::EH_SJLJ_SETJMP:  return "PPCISD::EH_SJLJ_SETJMP";
> +  case PPCISD::EH_SJLJ_LONGJMP: return "PPCISD::EH_SJLJ_LONGJMP";
>    case PPCISD::MFCR:            return "PPCISD::MFCR";
>    case PPCISD::VCMP:            return "PPCISD::VCMP";
>    case PPCISD::VCMPo:           return "PPCISD::VCMPo";
> @@ -4561,6 +4572,21 @@ SDValue PPCTargetLowering::LowerDYNAMIC_
>    return DAG.getNode(PPCISD::DYNALLOC, dl, VTs, Ops, 3);
>  }
> 
> +SDValue PPCTargetLowering::lowerEH_SJLJ_SETJMP(SDValue Op,
> +                                               SelectionDAG &DAG) const {
> +  DebugLoc DL = Op.getDebugLoc();
> +  return DAG.getNode(PPCISD::EH_SJLJ_SETJMP, DL,
> +                     DAG.getVTList(MVT::i32, MVT::Other),
> +                     Op.getOperand(0), Op.getOperand(1));
> +}
> +
> +SDValue PPCTargetLowering::lowerEH_SJLJ_LONGJMP(SDValue Op,
> +                                                SelectionDAG &DAG) const {
> +  DebugLoc DL = Op.getDebugLoc();
> +  return DAG.getNode(PPCISD::EH_SJLJ_LONGJMP, DL, MVT::Other,
> +                     Op.getOperand(0), Op.getOperand(1));
> +}
> +
>  /// LowerSELECT_CC - Lower floating point select_cc's into fsel instruction when
>  /// possible.
>  SDValue PPCTargetLowering::LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const {
> @@ -5557,6 +5583,9 @@ SDValue PPCTargetLowering::LowerOperatio
>    case ISD::DYNAMIC_STACKALLOC:
>      return LowerDYNAMIC_STACKALLOC(Op, DAG, PPCSubTarget);
> 
> +  case ISD::EH_SJLJ_SETJMP:     return lowerEH_SJLJ_SETJMP(Op, DAG);
> +  case ISD::EH_SJLJ_LONGJMP:    return lowerEH_SJLJ_LONGJMP(Op, DAG);
> +
>    case ISD::SELECT_CC:          return LowerSELECT_CC(Op, DAG);
>    case ISD::FP_TO_UINT:
>    case ISD::FP_TO_SINT:         return LowerFP_TO_INT(Op, DAG,
> @@ -5869,9 +5898,238 @@ PPCTargetLowering::EmitPartwordAtomicBin
>    return BB;
>  }
> 
> +llvm::MachineBasicBlock*
> +PPCTargetLowering::emitEHSjLjSetJmp(MachineInstr *MI,
> +                                    MachineBasicBlock *MBB) const {
> +  DebugLoc DL = MI->getDebugLoc();
> +  const TargetInstrInfo *TII = getTargetMachine().getInstrInfo();
> +
> +  MachineFunction *MF = MBB->getParent();
> +  MachineRegisterInfo &MRI = MF->getRegInfo();
> +
> +  const BasicBlock *BB = MBB->getBasicBlock();
> +  MachineFunction::iterator I = MBB;
> +  ++I;
> +
> +  // Memory Reference
> +  MachineInstr::mmo_iterator MMOBegin = MI->memoperands_begin();
> +  MachineInstr::mmo_iterator MMOEnd = MI->memoperands_end();
> +
> +  unsigned DstReg = MI->getOperand(0).getReg();
> +  const TargetRegisterClass *RC = MRI.getRegClass(DstReg);
> +  assert(RC->hasType(MVT::i32) && "Invalid destination!");
> +  unsigned mainDstReg = MRI.createVirtualRegister(RC);
> +  unsigned restoreDstReg = MRI.createVirtualRegister(RC);
> +
> +  MVT PVT = getPointerTy();
> +  assert((PVT == MVT::i64 || PVT == MVT::i32) &&
> +         "Invalid Pointer Size!");
> +  // For v = setjmp(buf), we generate
> +  //
> +  // thisMBB:
> +  //  SjLjSetup mainMBB
> +  //  bl mainMBB
> +  //  v_restore = 1
> +  //  b sinkMBB

This comment does not match the order in which you generate the code,
and in which you check for it in the test case.  More on this later...

> +  //
> +  // mainMBB:
> +  //  buf[LabelOffset] = LR
> +  //  v_main = 0
> +  //
> +  // sinkMBB:
> +  //  v = phi(main, restore)
> +  //
> +
> +  MachineBasicBlock *thisMBB = MBB;
> +  MachineBasicBlock *mainMBB = MF->CreateMachineBasicBlock(BB);
> +  MachineBasicBlock *sinkMBB = MF->CreateMachineBasicBlock(BB);
> +  MF->insert(I, mainMBB);
> +  MF->insert(I, sinkMBB);
> +
> +  MachineInstrBuilder MIB;
> +
> +  // Transfer the remainder of BB and its successor edges to sinkMBB.
> +  sinkMBB->splice(sinkMBB->begin(), MBB,
> +                  llvm::next(MachineBasicBlock::iterator(MI)), MBB->end());
> +  sinkMBB->transferSuccessorsAndUpdatePHIs(MBB);
> +
> +  // Note that the structure of the jmp_buf used here is not compatible
> +  // with that used by libc, and is not designed to be. Specifically, it
> +  // stores only those 'reserved' registers that LLVM does not otherwise
> +  // understand how to spill. Also, by convention, by the time this
> +  // intrinsic is called, Clang has already stored the frame address in the
> +  // first slot of the buffer and stack address in the third. Following the
> +  // X86 target code, we'll store the jump address in the second slot. We also
> +  // need to save the TOC pointer (R2) to handle jumps between shared
> +  // libraries, and that will be stored in the fourth slot. The thread
> +  // identifier (R13) is not affected.
> +
> +  // thisMBB:
> +  const int64_t LabelOffset = 1 * PVT.getStoreSize();
> +  const int64_t TOCOffset   = 3 * PVT.getStoreSize();
> +
> +  // Prepare IP either in reg.

Sorry, but I can't parse that comment. ;)

> +  const TargetRegisterClass *PtrRC = getRegClassFor(PVT);
> +  unsigned LabelReg = MRI.createVirtualRegister(PtrRC);
> +  unsigned BufReg = MI->getOperand(1).getReg();
> +
> +  if (PPCSubTarget.isPPC64() && PPCSubTarget.isSVR4ABI()) {
> +    MIB = BuildMI(*thisMBB, MI, DL, TII->get(PPC::STD))
> +            .addReg(PPC::X2)
> +            .addImm(TOCOffset / 4)
> +            .addReg(BufReg);
> +
> +    MIB.setMemRefs(MMOBegin, MMOEnd);
> +  }
> +
> +  // Setup
> +  MIB = BuildMI(*thisMBB, MI, DL, TII->get(PPC::BCL)).addMBB(mainMBB);

Your use of the "bcl" isn't actually the one that avoids polluting the
link stack.  Sorry for any confusion on this point during our IRC
discussion.  The only form of this instruction that does this is:

  bcl 20,31,$+4

which is treated as a not-taken branch rather than a subroutine call.
If you use any other address than $+4, the processor implementation is
likely to treat it as a subroutine call and modify the link stack.

The best reference on this subject I've seen so far is the Freescale
document that Gabor Greif pointed me to the other day:

  http://www.freescale.com/files/32bit/doc/app_note/AN2203.pdf#page=23

Please have a look at section 4.5 for the basic description, and then
the example in 4.5.2 for how to generate code that branches to a nearby
address without polluting the link stack.  As I was describing (poorly)
on IRC, you need to do arithmetic on the next-address retrieved from the
link register.  A complete example is shown there which I think you will
find very helpful.

> +  MIB.addRegMask(PPCRegInfo->getNoPreservedMask());

I'm too thick to understand this without some more explanation.  What
appears to be happening here is that you're putting a marker in the
instruction stream that says, "Spill and reload all callee-save
registers (that you've modified locally) around this marker."  Is that
the intent, with the register allocator automatically inserting the
necessary spill code?

With the tentative assumption that that's so, the placement of SjLjSetup
following the jump to mainMBB perplexes me.  This code won't be executed
until after a longjmp returns, so the callee-saved registers won't be
saved in advance and will be possibly overwritten between "now" and the
longjmp execution.  It seems to me that the location you specified in
your earlier comment is the correct place to force the state to be
saved.  However, if SjLjSetup is placed there, then on return from the
longjmp there doesn't seem to be anything that will cause the state to
be restored.  From our earlier discussions I got the impression you
wanted all the save/restore to be done in the setjmp code, but I don't
understand how what you have here accomplishes that.  No doubt I'm just
not seeing the whole picture.

In any event, this needs Extreme Commentary ;) to allow others to
understand and maintain the code... 

> +
> +  BuildMI(*thisMBB, MI, DL, TII->get(PPC::LI), restoreDstReg).addImm(1);
> +
> +  MIB = BuildMI(*thisMBB, MI, DL, TII->get(PPC::EH_SjLj_Setup))
> +          .addMBB(mainMBB);
> +  MIB = BuildMI(*thisMBB, MI, DL, TII->get(PPC::B)).addMBB(sinkMBB);
> +
> +  thisMBB->addSuccessor(mainMBB, /* weight */ 0);
> +  thisMBB->addSuccessor(sinkMBB, /* weight */ 1);
> +
> +  // mainMBB:
> +  //  mainDstReg = 0

The above comment belongs after the MFLR and IP store code...

> +  MIB = BuildMI(mainMBB, DL,
> +    TII->get(PPCSubTarget.isPPC64() ? PPC::MFLR8 : PPC::MFLR), LabelReg);
> +
> +  // Store IP
> +  if (PPCSubTarget.isPPC64()) {
> +    MIB = BuildMI(mainMBB, DL, TII->get(PPC::STD))
> +            .addReg(LabelReg)
> +            .addImm(LabelOffset / 4)
> +            .addReg(BufReg);
> +  } else {
> +    MIB = BuildMI(mainMBB, DL, TII->get(PPC::STW))
> +            .addReg(LabelReg)
> +            .addImm(LabelOffset)
> +            .addReg(BufReg);
> +  }
> +
> +  MIB.setMemRefs(MMOBegin, MMOEnd);

...here.

> +
> +  BuildMI(mainMBB, DL, TII->get(PPC::LI), mainDstReg).addImm(0);
> +  mainMBB->addSuccessor(sinkMBB);
> +
> +  // sinkMBB:
> +  BuildMI(*sinkMBB, sinkMBB->begin(), DL,
> +          TII->get(PPC::PHI), DstReg)
> +    .addReg(mainDstReg).addMBB(mainMBB)
> +    .addReg(restoreDstReg).addMBB(thisMBB);
> +
> +  MI->eraseFromParent();
> +  return sinkMBB;
> +}
> +
> +MachineBasicBlock *
> +PPCTargetLowering::emitEHSjLjLongJmp(MachineInstr *MI,
> +                                     MachineBasicBlock *MBB) const {
> +  DebugLoc DL = MI->getDebugLoc();
> +  const TargetInstrInfo *TII = getTargetMachine().getInstrInfo();
> +
> +  MachineFunction *MF = MBB->getParent();
> +  MachineRegisterInfo &MRI = MF->getRegInfo();
> +
> +  // Memory Reference
> +  MachineInstr::mmo_iterator MMOBegin = MI->memoperands_begin();
> +  MachineInstr::mmo_iterator MMOEnd = MI->memoperands_end();
> +
> +  MVT PVT = getPointerTy();
> +  assert((PVT == MVT::i64 || PVT == MVT::i32) &&
> +         "Invalid Pointer Size!");
> +
> +  const TargetRegisterClass *RC =
> +    (PVT == MVT::i64) ? &PPC::G8RCRegClass : &PPC::GPRCRegClass;
> +  unsigned Tmp = MRI.createVirtualRegister(RC);
> +  // Since FP is only updated here but NOT referenced, it's treated as GPR.

Sorry, I don't comprehend this comment either.  Can you elaborate?

> +  unsigned FP  = (PVT == MVT::i64) ? PPC::X31 : PPC::R31;
> +  unsigned SP  = (PVT == MVT::i64) ? PPC::X1 : PPC::R1;
> +
> +  MachineInstrBuilder MIB;
> +
> +  const int64_t LabelOffset = 1 * PVT.getStoreSize();
> +  const int64_t SPOffset    = 2 * PVT.getStoreSize();
> +  const int64_t TOCOffset   = 3 * PVT.getStoreSize();
> +
> +  unsigned BufReg = MI->getOperand(0).getReg();
> +
> +  // Reload FP (the jumped-to function may not have had a
> +  // frame pointer, and if so, then its r31 will be restored
> +  // as necessary).
> +  if (PVT == MVT::i64) {
> +    MIB = BuildMI(*MBB, MI, DL, TII->get(PPC::LD), FP)
> +            .addImm(0)
> +            .addReg(BufReg);
> +  } else {
> +    MIB = BuildMI(*MBB, MI, DL, TII->get(PPC::LWZ), FP)
> +            .addImm(0)
> +            .addReg(BufReg);
> +  }
> +  MIB.setMemRefs(MMOBegin, MMOEnd);
> +
> +  // Reload IP
> +  if (PVT == MVT::i64) {
> +    MIB = BuildMI(*MBB, MI, DL, TII->get(PPC::LD), Tmp)
> +            .addImm(LabelOffset / 4)
> +            .addReg(BufReg);
> +  } else {
> +    MIB = BuildMI(*MBB, MI, DL, TII->get(PPC::LWZ), Tmp)
> +            .addImm(LabelOffset)
> +            .addReg(BufReg);
> +  }
> +  MIB.setMemRefs(MMOBegin, MMOEnd);
> +
> +  // Reload SP
> +  if (PVT == MVT::i64) {
> +    MIB = BuildMI(*MBB, MI, DL, TII->get(PPC::LD), SP)
> +            .addImm(SPOffset / 4)
> +            .addReg(BufReg);
> +  } else {
> +    MIB = BuildMI(*MBB, MI, DL, TII->get(PPC::LWZ), SP)
> +            .addImm(SPOffset)
> +            .addReg(BufReg);
> +  }
> +  MIB.setMemRefs(MMOBegin, MMOEnd);
> +
> +  // FIXME: When we also support base pointers, that register must also be
> +  // restored here.

Once again, I don't understand.  I'm starting to feel pretty dumb. ;)
To what sort of base pointers are you referring, and what register is
this?

> +
> +  // Reload TOC
> +  if (PVT == MVT::i64 && PPCSubTarget.isSVR4ABI()) {
> +    MIB = BuildMI(*MBB, MI, DL, TII->get(PPC::LD), PPC::X2)
> +            .addImm(TOCOffset / 4)
> +            .addReg(BufReg);
> +
> +    MIB.setMemRefs(MMOBegin, MMOEnd);
> +  }
> +
> +  // Jump
> +  BuildMI(*MBB, MI, DL,
> +          TII->get(PVT == MVT::i64 ? PPC::MTCTR8 : PPC::MTCTR)).addReg(Tmp);
> +  BuildMI(*MBB, MI, DL, TII->get(PVT == MVT::i64 ? PPC::BCTR8 : PPC::BCTR));
> +
> +  MI->eraseFromParent();
> +  return MBB;
> +}
> +
>  MachineBasicBlock *
>  PPCTargetLowering::EmitInstrWithCustomInserter(MachineInstr *MI,
>                                                 MachineBasicBlock *BB) const {
> +  if (MI->getOpcode() == PPC::EH_SjLj_SetJmp32 ||
> +      MI->getOpcode() == PPC::EH_SjLj_SetJmp64) {
> +    return emitEHSjLjSetJmp(MI, BB);
> +  } else if (MI->getOpcode() == PPC::EH_SjLj_LongJmp32 ||
> +             MI->getOpcode() == PPC::EH_SjLj_LongJmp64) {
> +    return emitEHSjLjLongJmp(MI, BB);
> +  }
> +
>    const TargetInstrInfo *TII = getTargetMachine().getInstrInfo();
> 
>    // To "insert" these instructions we actually have to insert their
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h?rev=177666&r1=177665&r2=177666&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h Thu Mar 21 16:37:52 2013
> @@ -16,6 +16,7 @@
>  #define LLVM_TARGET_POWERPC_PPC32ISELLOWERING_H
> 
>  #include "PPC.h"
> +#include "PPCRegisterInfo.h"
>  #include "PPCSubtarget.h"
>  #include "llvm/CodeGen/SelectionDAG.h"
>  #include "llvm/Target/TargetLowering.h"
> @@ -119,6 +120,12 @@ namespace llvm {
>        /// are undefined.
>        MFCR,
> 
> +      // EH_SJLJ_SETJMP - SjLj exception handling setjmp.
> +      EH_SJLJ_SETJMP,
> +
> +      // EH_SJLJ_LONGJMP - SjLj exception handling longjmp.
> +      EH_SJLJ_LONGJMP,
> +
>        /// RESVEC = VCMP(LHS, RHS, OPC) - Represents one of the altivec VCMP*
>        /// instructions.  For lack of better number, we use the opcode number
>        /// encoding for the OPC field to identify the compare.  For example, 838
> @@ -321,6 +328,7 @@ namespace llvm {
> 
>    class PPCTargetLowering : public TargetLowering {
>      const PPCSubtarget &PPCSubTarget;
> +    const PPCRegisterInfo *PPCRegInfo;
> 
>    public:
>      explicit PPCTargetLowering(PPCTargetMachine &TM);
> @@ -395,6 +403,12 @@ namespace llvm {
>                                                  MachineBasicBlock *MBB,
>                                              bool is8bit, unsigned Opcode) const;
> 
> +    MachineBasicBlock *emitEHSjLjSetJmp(MachineInstr *MI,
> +                                        MachineBasicBlock *MBB) const;
> +
> +    MachineBasicBlock *emitEHSjLjLongJmp(MachineInstr *MI,
> +                                         MachineBasicBlock *MBB) const;
> +
>      ConstraintType getConstraintType(const std::string &Constraint) const;
> 
>      /// Examine constraint string and operand type and determine a weight value.
> @@ -608,6 +622,9 @@ namespace llvm {
>                       const SmallVectorImpl<ISD::InputArg> &Ins,
>                       DebugLoc dl, SelectionDAG &DAG,
>                       SmallVectorImpl<SDValue> &InVals) const;
> +
> +    SDValue lowerEH_SJLJ_SETJMP(SDValue Op, SelectionDAG &DAG) const;
> +    SDValue lowerEH_SJLJ_LONGJMP(SDValue Op, SelectionDAG &DAG) const;
>    };
>  }
> 
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td?rev=177666&r1=177665&r2=177666&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td Thu Mar 21 16:37:52 2013
> @@ -273,6 +273,19 @@ def MFCR8 : XFXForm_3<31, 19, (outs G8RC
>                       "mfcr $rT", SprMFCR>,
>                       PPC970_MicroCode, PPC970_Unit_CRU;
> 
> +let hasSideEffects = 1, isBarrier = 1, isCodeGenOnly = 1,
> +    usesCustomInserter = 1 in {
> +  def EH_SjLj_SetJmp64  : Pseudo<(outs GPRC:$dst), (ins memr:$buf),
> +                            "#EH_SJLJ_SETJMP64",
> +                            [(set GPRC:$dst, (PPCeh_sjlj_setjmp addr:$buf))]>,
> +                          Requires<[In64BitMode]>;
> +  let isTerminator = 1 in
> +  def EH_SjLj_LongJmp64 : Pseudo<(outs), (ins memr:$buf),
> +                            "#EH_SJLJ_LONGJMP64",
> +                            [(PPCeh_sjlj_longjmp addr:$buf)]>,
> +                          Requires<[In64BitMode]>;
> +}
> +
>  //===----------------------------------------------------------------------===//
>  // 64-bit SPR manipulation instrs.
> 
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCInstrFormats.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstrFormats.td?rev=177666&r1=177665&r2=177666&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCInstrFormats.td (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCInstrFormats.td Thu Mar 21 16:37:52 2013
> @@ -120,6 +120,18 @@ class BForm_1<bits<6> opcode, bits<5> bo
>    let CR = 0;
>  }
> 
> +class BForm_2<bits<6> opcode, bits<5> bo, bits<5> bi, bit aa, bit lk,
> +              dag OOL, dag IOL, string asmstr>
> +  : I<opcode, OOL, IOL, asmstr, BrB> {
> +  bits<14> BD;
> +
> +  let Inst{6-10}  = bo;
> +  let Inst{11-15} = bi;
> +  let Inst{16-29} = BD;
> +  let Inst{30}    = aa;
> +  let Inst{31}    = lk;
> +}
> +
>  // 1.7.4 D-Form
>  class DForm_base<bits<6> opcode, dag OOL, dag IOL, string asmstr,
>                   InstrItinClass itin, list<dag> pattern> 
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td?rev=177666&r1=177665&r2=177666&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td Thu Mar 21 16:37:52 2013
> @@ -158,6 +158,14 @@ def retflag       : SDNode<"PPCISD::RET_
>  def PPCtc_return : SDNode<"PPCISD::TC_RETURN", SDT_PPCTC_ret,
>                          [SDNPHasChain,  SDNPOptInGlue, SDNPVariadic]>;
> 
> +def PPCeh_sjlj_setjmp  : SDNode<"PPCISD::EH_SJLJ_SETJMP",
> +                                SDTypeProfile<1, 1, [SDTCisInt<0>,
> +                                                     SDTCisPtrTy<1>]>,
> +                                [SDNPHasChain, SDNPSideEffect]>;
> +def PPCeh_sjlj_longjmp : SDNode<"PPCISD::EH_SJLJ_LONGJMP",
> +                                SDTypeProfile<0, 1, [SDTCisPtrTy<0>]>,
> +                                [SDNPHasChain, SDNPSideEffect]>;
> +
>  def PPCvcmp       : SDNode<"PPCISD::VCMP" , SDT_PPCvcmp, []>;
>  def PPCvcmp_o     : SDNode<"PPCISD::VCMPo", SDT_PPCvcmp, [SDNPOutGlue]>;
> 
> @@ -391,6 +399,12 @@ def memrix : Operand<iPTR> {   // memri
>    let EncoderMethod = "getMemRIXEncoding";
>  }
> 
> +// A single-register address. This is used with the SjLj
> +// pseudo-instructions.
> +def memr : Operand<iPTR> {
> +  let MIOperandInfo = (ops ptr_rc:$ptrreg);
> +}
> +
>  // PowerPC Predicate operand.  20 = (0<<5)|20 = always, CR0 is a dummy reg
>  // that doesn't matter.
>  def pred : PredicateOperand<OtherVT, (ops imm, CRRC),
> @@ -404,6 +418,10 @@ def xaddr  : ComplexPattern<iPTR, 2, "Se
>  def xoaddr : ComplexPattern<iPTR, 2, "SelectAddrIdxOnly",[], []>;
>  def ixaddr : ComplexPattern<iPTR, 2, "SelectAddrImmShift", [], []>; // "std"
> 
> +// The address in a single register. This is used with the SjLj
> +// pseudo-instructions.
> +def addr   : ComplexPattern<iPTR, 1, "SelectAddr",[], []>;
> +
>  /// This is just the offset part of iaddr, used for preinc.
>  def iaddroff : ComplexPattern<iPTR, 1, "SelectAddrImmOffs", [], []>;
>  def xaddroff : ComplexPattern<iPTR, 1, "SelectAddrIdxOffs", [], []>;
> @@ -505,6 +523,14 @@ let isBranch = 1, isTerminator = 1, hasC
>    }
>  }
> 
> +// The direct BCL used by the SjLj setjmp code.
> +let isCall = 1, hasCtrlDep = 1, PPC970_Unit = 7 in {
> +  let Defs = [LR], Uses = [RM] in {
> +    def BCL  : BForm_2<16, 20, 31, 0, 1, (outs), (ins condbrtarget:$dst),
> +                       "bcl 20, 31, $dst">;
> +  }
> +}
> +

I don't think this should be so specific to the particular use.  Please
make the "20" and "31" into variables that will be bound to the BO and
BI fields so we have a single BCL instruction for general use.  Ulrich
has pointed out that we need a single form of each instruction for the
assembly parser and disassembler to work best, and the above would
encourage somebody needing a different form of BCL to create their own
version.  I'd rather just see the fully flexible version done up front.
We need it anyway for optimizing conditional branches that target calls
to become conditional calls.

>  // Darwin ABI Calls.
>  let isCall = 1, PPC970_Unit = 7, Defs = [LR] in {
>    // Convenient aliases for call instructions
> @@ -583,6 +609,23 @@ def TAILBA   : IForm<18, 0, 0, (outs), (
>                    "ba $dst", BrB,
>                    []>;
> 
> +let hasSideEffects = 1, isBarrier = 1, isCodeGenOnly = 1,
> +    usesCustomInserter = 1 in {
> +  def EH_SjLj_SetJmp32  : Pseudo<(outs GPRC:$dst), (ins memr:$buf),
> +                            "#EH_SJLJ_SETJMP32",
> +                            [(set GPRC:$dst, (PPCeh_sjlj_setjmp addr:$buf))]>,
> +                          Requires<[In32BitMode]>;
> +  let isTerminator = 1 in
> +  def EH_SjLj_LongJmp32 : Pseudo<(outs), (ins memr:$buf),
> +                            "#EH_SJLJ_LONGJMP32",
> +                            [(PPCeh_sjlj_longjmp addr:$buf)]>,
> +                          Requires<[In32BitMode]>;
> +}
> +
> +let isBranch = 1, isTerminator = 1, isCodeGenOnly = 1 in {
> +  def EH_SjLj_Setup : Pseudo<(outs), (ins directbrtarget:$dst),
> +                        "#EH_SjLj_Setup\t$dst", []>;
> +}
> 
>  // DCB* instructions.
>  def DCBA   : DCB_Form<758, 0, (outs), (ins memrr:$dst),
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp?rev=177666&r1=177665&r2=177666&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp Thu Mar 21 16:37:52 2013
> @@ -105,6 +105,18 @@ PPCRegisterInfo::getCallPreservedMask(Ca
>    return Subtarget.isPPC64() ? CSR_SVR464_RegMask : CSR_SVR432_RegMask;
>  }
> 
> +const uint32_t*
> +PPCRegisterInfo::getNoPreservedMask() const {
> +  // The naming here is inverted: The CSR_NoRegs_Altivec has the
> +  // Altivec registers masked so that they're not saved and restored around
> +  // instructions with this preserved mask.
> +
> +  if (!Subtarget.hasAltivec())
> +    return CSR_NoRegs_Altivec_RegMask;
> +
> +  return CSR_NoRegs_RegMask;
> +}
> +
>  BitVector PPCRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
>    BitVector Reserved(getNumRegs());
>    const PPCFrameLowering *PPCFI =
> 
> Modified: llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.h?rev=177666&r1=177665&r2=177666&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.h (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.h Thu Mar 21 16:37:52 2013
> @@ -44,6 +44,7 @@ public:
>    /// Code Generation virtual methods...
>    const uint16_t *getCalleeSavedRegs(const MachineFunction* MF = 0) const;
>    const uint32_t *getCallPreservedMask(CallingConv::ID CC) const;
> +  const uint32_t *getNoPreservedMask() const;
> 
>    BitVector getReservedRegs(const MachineFunction &MF) const;
> 
> 
> Added: llvm/trunk/test/CodeGen/PowerPC/sjlj.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/sjlj.ll?rev=177666&view=auto
> ==============================================================================
> --- llvm/trunk/test/CodeGen/PowerPC/sjlj.ll (added)
> +++ llvm/trunk/test/CodeGen/PowerPC/sjlj.ll Thu Mar 21 16:37:52 2013
> @@ -0,0 +1,108 @@
> +; RUN: llc < %s -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr7 | FileCheck %s
> +; RUN: llc < %s -mtriple=powerpc64-unknown-linux-gnu -mcpu=a2 | FileCheck -check-prefix=CHECK-NOAV %s
> +target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
> +target triple = "powerpc64-unknown-linux-gnu"
> +
> +%struct.__jmp_buf_tag = type { [64 x i64], i32, %struct.__sigset_t, [8 x i8] }
> +%struct.__sigset_t = type { [16 x i64] }
> +
> + at env_sigill = internal global [1 x %struct.__jmp_buf_tag] zeroinitializer, align 16
> +
> +define void @foo() #0 {
> +entry:
> +  call void @llvm.eh.sjlj.longjmp(i8* bitcast ([1 x %struct.__jmp_buf_tag]* @env_sigill to i8*))
> +  unreachable
> +
> +; CHECK: @foo
> +; CHECK: addis [[REG:[0-9]+]], 2, env_sigill at toc@ha
> +; CHECK: addi [[REG]], [[REG]], env_sigill at toc@l

This is a nitpick, but the fact that the two registers on the addi are
the same is an artifact of the current register allocation scheme and
might not always hold.  It would be safer to not assume they're the
same.

> +; CHECK: ld 31, 0([[REG]])
> +; CHECK: ld [[REG2:[0-9]+]], 8([[REG]])
> +; CHECK: ld 1, 16([[REG]])
> +; CHECK: mtctr [[REG2]]
> +; CHECK: ld 2, 24([[REG]])
> +; CHECK: bctr
> +
> +return:                                           ; No predecessors!
> +  ret void
> +}
> +
> +declare void @llvm.eh.sjlj.longjmp(i8*) #1
> +
> +define signext i32 @main() #0 {
> +entry:
> +  %retval = alloca i32, align 4
> +  store i32 0, i32* %retval
> +  %0 = call i8* @llvm.frameaddress(i32 0)
> +  store i8* %0, i8** bitcast ([1 x %struct.__jmp_buf_tag]* @env_sigill to i8**)
> +  %1 = call i8* @llvm.stacksave()
> +  store i8* %1, i8** getelementptr (i8** bitcast ([1 x %struct.__jmp_buf_tag]* @env_sigill to i8**), i32 2)
> +  %2 = call i32 @llvm.eh.sjlj.setjmp(i8* bitcast ([1 x %struct.__jmp_buf_tag]* @env_sigill to i8*))
> +  %tobool = icmp ne i32 %2, 0
> +  br i1 %tobool, label %if.then, label %if.else
> +
> +if.then:                                          ; preds = %entry
> +  store i32 1, i32* %retval
> +  br label %return
> +
> +if.else:                                          ; preds = %entry
> +  call void @foo()
> +  br label %if.end
> +
> +if.end:                                           ; preds = %if.else
> +  store i32 0, i32* %retval
> +  br label %return
> +
> +return:                                           ; preds = %if.end, %if.then
> +  %3 = load i32* %retval
> +  ret i32 %3
> +
> +; CHECK: @main
> +; CHECK: std
> +; CHECK: stfd
> +; CHECK: stvx
> +
> +; CHECK: addis [[REG:[0-9]+]], 2, env_sigill at toc@ha
> +; CHECK: std 31, env_sigill at toc@l([[REG]])
> +; CHECK: addi [[REG]], [[REG]], env_sigill at toc@l

Same nitpick here.

Thanks,
Bill

> +; CHECK: std [[REG]], [[OFF:[0-9]+]](31)                  # 8-byte Folded Spill
> +; CHECK: std 1, 16([[REG]])
> +; CHECK: std 2, 24([[REG]])
> +; CHECK: bcl 20, 31, .LBB1_1
> +; CHECK: li 3, 1
> +; CHECK: #EH_SjLj_Setup	.LBB1_1
> +; CHECK: b .LBB1_2
> +
> +; CHECK: .LBB1_1:
> +; CHECK: mflr [[REGL:[0-9]+]]
> +; CHECK: ld [[REG2:[0-9]+]], [[OFF]](31)                   # 8-byte Folded Reload
> +; CHECK: std [[REGL]], 8([[REG2]])
> +; CHECK: li 3, 0
> +
> +; CHECK: .LBB1_2:
> +
> +; CHECK: lfd
> +; CHECK: lvx
> +; CHECK: ld
> +; CHECK: blr
> +
> +; CHECK-NOAV: @main
> +; CHECK-NOAV-NOT: stvx
> +; CHECK-NOAV: bcl
> +; CHECK-NOAV: mflr
> +; CHECK-NOAV: bl foo
> +; CHECK-NOAV-NOT: lvx
> +; CHECK-NOAV: blr
> +}
> +
> +declare i8* @llvm.frameaddress(i32) #2
> +
> +declare i8* @llvm.stacksave() #3
> +
> +declare i32 @llvm.eh.sjlj.setjmp(i8*) #3
> +
> +attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-frame-pointer-elim-non-leaf"="true" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "unsafe-fp-math"="false" "use-soft-float"="false" }
> +attributes #1 = { noreturn nounwind }
> +attributes #2 = { nounwind readnone }
> +attributes #3 = { nounwind }
> +
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 




More information about the llvm-commits mailing list