<html><head><meta http-equiv="Content-Type" content="text/html charset=iso-8859-1"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Oct 23, 2013, at 12:32 PM, Manman Ren <<a href="mailto:manman.ren@gmail.com">manman.ren@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr">Hi David,<div><br></div><div>Can we use helper functions for emitUnitLoad|Store emitByteLoad|Store to reduce code duplication?</div><div><br></div><div>Thanks,</div><div>Manman</div></div></blockquote><div><br></div>Yes, that would be great, if it works.</div><div><br></div><div>The only other thing I noticed is that this patch has several places where it creates a scratch virtual register on every branch of a conditional. Those could be moved outside the conditionals.</div><div><br><blockquote type="cite"><div class="gmail_extra">
<br><br><div class="gmail_quote">On Tue, Oct 22, 2013 at 3:50 PM, David Peixotto <span dir="ltr"><<a href="mailto:dpeixott@codeaurora.org" target="_blank">dpeixott@codeaurora.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I've attached a patch that replaces the class with inline<br>
conditionals. Please help review this patch.<br>
<div class="im"><br>
-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted<br>
by The Linux Foundation<br>
<br>
<br>
<br>
> -----Original Message-----<br>
> From: Bob Wilson [mailto:<a href="mailto:bob.wilson@apple.com">bob.wilson@apple.com</a>]<br>
</div><div class="im">> Sent: Tuesday, October 22, 2013 9:15 AM<br>
> To: David Peixotto<br>
> Cc: <a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
> Subject: Re: [llvm] r192915 - Refactor lowering for COPY_STRUCT_BYVAL_I32<br>
><br>
</div>> Thanks. At least try it out. If it turns out to be unreadable with all<br>
<div><div class="h5">the<br>
> conditionals, we can reconsider.<br>
><br>
> On Oct 22, 2013, at 9:13 AM, David Peixotto <<a href="mailto:dpeixott@codeaurora.org">dpeixott@codeaurora.org</a>><br>
> wrote:<br>
><br>
> > Ok, I will make a patch to switch to using inline conditionals instead.<br>
> ><br>
> > -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,<br>
> > hosted by The Linux Foundation<br>
> ><br>
> ><br>
> ><br>
> >> -----Original Message-----<br>
> >> From: Bob Wilson [mailto:<a href="mailto:bob.wilson@apple.com">bob.wilson@apple.com</a>]<br>
> >> Sent: Monday, October 21, 2013 9:18 PM<br>
> >> To: David Peixotto<br>
> >> Cc: <a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a> LLVM<br>
> >> Subject: Re: [llvm] r192915 - Refactor lowering for<br>
> >> COPY_STRUCT_BYVAL_I32<br>
> >><br>
> >><br>
> >> On Oct 21, 2013, at 7:20 PM, David Peixotto <<a href="mailto:dpeixott@codeaurora.org">dpeixott@codeaurora.org</a>><br>
> >> wrote:<br>
> >><br>
> >>> Hi Bob,<br>
> >>><br>
> >>> I agree that a generic emitter would be useful, but I'm not sure I<br>
> >>> would get the time to work on such a project at this point.<br>
> >><br>
> >> In that case, I think it would be better to simplify this code to use<br>
> >> a<br>
> > more<br>
> >> direct approach with conditionals.<br>
> >><br>
> >>><br>
> >>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora<br>
> >>> Forum, hosted by The Linux Foundation<br>
> >>><br>
> >>><br>
> >>>> -----Original Message-----<br>
> >>>> From: Bob Wilson [mailto:<a href="mailto:bob.wilson@apple.com">bob.wilson@apple.com</a>]<br>
> >>>> Sent: Monday, October 21, 2013 12:12 PM<br>
> >>>> To: David Peixotto<br>
> >>>> Cc: <a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
> >>>> Subject: Re: [llvm] r192915 - Refactor lowering for<br>
> >>>> COPY_STRUCT_BYVAL_I32<br>
> >>>><br>
> >>>> It seems like it would make more sense to make the StructByvalEmitter a<br>
> >>>> generic "emitter" for ARM instructions. I suspect there are a<br>
> >>>> number of other<br>
> >>> places<br>
> >>>> in ARMISelLowering that could make good use of it. Would you be<br>
> >>>> willing<br>
> >>> to<br>
> >>>> investigate that?<br>
> >>>><br>
> >>>> As it stands, this really does seem like overkill for the struct<br>
> >>>> byval<br>
> >>> issue, but if<br>
> >>>> you could make it more generally useful, then the extra code would<br>
> >>>> be worthwhile.<br>
> >>>><br>
> >>>> On Oct 21, 2013, at 10:02 AM, David Peixotto<br>
> >>>> <<a href="mailto:dpeixott@codeaurora.org">dpeixott@codeaurora.org</a>><br>
> >>>> wrote:<br>
> >>>><br>
> >>>>> Hi Bob,<br>
> >>>>><br>
> >>>>> I think your criticism is valid here. I wasn't too happy with how<br>
> >>>>> much code I ended up writing for this change. When I started I<br>
> >>>>> thought the code size would be about equal after implementing the<br>
> >>>>> thumb1 lowering because I was getting rid of some code<br>
> >>>>> duplication, but the code size for abstracting the common parts<br>
> >>>>> was larger than I<br>
> > had<br>
> >> anticipated.<br>
> >>>>> There is no fundamental problem with implementing it with<br>
> >>>>> conditionals, I did it this way because I thought it would be<br>
> >>>>> clearer<br>
> >>> and<br>
> >>>> would be easier to write correctly.<br>
> >>>>><br>
> >>>>> I think the way it is now has the advantage that the lowering<br>
> >>>>> algorithm is clearly separated from the details of generating<br>
> >>>>> machine instructions for each sub-target. I think it would be<br>
> >>>>> easier to improve the algorithm with the way it is now. For<br>
> >>>>> example, we are always using byte stores to copy any leftover that<br>
> >>>>> does not fit into the "unit" size. So if we have a 31-byte struct<br>
> >>>>> and a target that supports neon we will generate a 16-byte store<br>
> >>>>> and<br>
> >>>>> 15 1-byte stores. We could improve this by generating fewer stores<br>
> >>>>> for the leftover (3x4-byte + 1x2byte + 1x1byte). I don't know if<br>
> >>>>> we actually care about this kind of change, but I believe it would<br>
> >>>>> be<br>
> >>> easier to<br>
> >>>> make now.<br>
> >>>>><br>
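<br>
A rough sketch of the store-count arithmetic in the paragraph above, assuming the leftover is split greedily into 4-, 2-, and 1-byte stores. The names LeftoverPlan, planLeftover, storesWithByteTail, and storesWithGreedyTail are hypothetical and appear neither in the patch nor in LLVM; the code only counts stores and does not build any machine instructions.<br>
<pre>
```cpp
// Hypothetical illustration only: counts the stores needed to copy a struct.
// It does not build any MachineInstr and is not part of the patch.
struct LeftoverPlan {
  unsigned Words;  // number of 4-byte stores
  unsigned Halves; // number of 2-byte stores
  unsigned Bytes;  // number of 1-byte stores
};

// Greedily split the leftover (the part that does not fill a whole unit)
// into the largest stores first.
constexpr LeftoverPlan planLeftover(unsigned Leftover) {
  return LeftoverPlan{Leftover / 4, (Leftover % 4) / 2, Leftover % 2};
}

// Scheme described in the mail: unit-sized stores, then one store per byte.
constexpr unsigned storesWithByteTail(unsigned StructSize, unsigned UnitSize) {
  return StructSize / UnitSize + StructSize % UnitSize;
}

// Suggested scheme: unit-sized stores, then a greedy 4/2/1 tail.
constexpr unsigned storesWithGreedyTail(unsigned StructSize,
                                        unsigned UnitSize) {
  return StructSize / UnitSize +
         planLeftover(StructSize % UnitSize).Words +
         planLeftover(StructSize % UnitSize).Halves +
         planLeftover(StructSize % UnitSize).Bytes;
}

// The 31-byte example from the mail, with a 16-byte NEON unit.
static_assert(planLeftover(15).Words == 3, "three 4-byte stores");
static_assert(planLeftover(15).Halves == 1, "one 2-byte store");
static_assert(planLeftover(15).Bytes == 1, "one 1-byte store");
static_assert(storesWithByteTail(31, 16) == 16, "1 unit + 15 byte stores");
static_assert(storesWithGreedyTail(31, 16) == 6, "1 unit + 5 tail stores");
```
</pre>
Under these assumptions the 31-byte struct goes from 16 stores to 6. Because the leftover is always smaller than the unit size, the tail stores in this sketch are never wider than the unit.<br>
<br>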
> >>>>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora<br>
> >>>>> Forum, hosted by The Linux Foundation<br>
> >>>>><br>
> >>>>><br>
> >>>>>> -----Original Message-----<br>
> >>>>>> From: Bob Wilson [mailto:<a href="mailto:bob.wilson@apple.com">bob.wilson@apple.com</a>]<br>
> >>>>>> Sent: Saturday, October 19, 2013 7:00 PM<br>
> >>>>>> To: David Peixotto<br>
> >>>>>> Cc: <a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
> >>>>>> Subject: Re: [llvm] r192915 - Refactor lowering for<br>
> >>>>>> COPY_STRUCT_BYVAL_I32<br>
> >>>>>><br>
> >>>>>> This is very nice and elegant, but it's an awful lot of code for<br>
> >>>>>> something<br>
> >>>>> that<br>
> >>>>>> isn't really that complicated. It seems like overkill to me.<br>
> >>>>>> Did you<br>
> >>>>> consider<br>
> >>>>>> implementing the Thumb1 support by just adding more<br>
> conditionals?<br>
> >>>>>> Is there a fundamental problem with that?<br>
> >>>>>><br>
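<br>
One way to picture the "more conditionals" alternative raised above is a single selection function instead of one subclass per subtarget. This is a minimal sketch that assumes only ARM and Thumb2 modes. The Mode and Opc enums and unitLoadOpcode are hypothetical stand-ins whose enumerator names are borrowed from the opcode names in the commit; this is not LLVM's API and not necessarily how a follow-up patch would look.<br>
<pre>
```cpp
// Hypothetical stand-ins; real code would use the ARM:: opcode enumerators.
enum class Mode { ARM, Thumb2 };
enum class Opc {
  None,
  LDR_POST_IMM, LDRH_POST, LDRB_POST_IMM, // ARM post-increment loads
  t2LDR_POST, t2LDRH_POST, t2LDRB_POST    // Thumb2 post-increment loads
};

// Pick the post-increment load for a unit of 4, 2, or 1 bytes using inline
// conditionals rather than a virtual call into a per-subtarget class.
constexpr Opc unitLoadOpcode(Mode M, unsigned UnitSize) {
  return M == Mode::Thumb2
             ? (UnitSize == 4   ? Opc::t2LDR_POST
                : UnitSize == 2 ? Opc::t2LDRH_POST
                : UnitSize == 1 ? Opc::t2LDRB_POST
                                : Opc::None)
             : (UnitSize == 4   ? Opc::LDR_POST_IMM
                : UnitSize == 2 ? Opc::LDRH_POST
                : UnitSize == 1 ? Opc::LDRB_POST_IMM
                                : Opc::None);
}

static_assert(unitLoadOpcode(Mode::ARM, 4) == Opc::LDR_POST_IMM,
              "ARM word load");
static_assert(unitLoadOpcode(Mode::Thumb2, 1) == Opc::t2LDRB_POST,
              "Thumb2 byte load");
static_assert(unitLoadOpcode(Mode::ARM, 16) == Opc::None,
              "unit sizes above 4 are left to the NEON path");
```
</pre>
In a sketch like this, supporting another subtarget would mean adding one more branch to the conditional, which is the trade-off against the subclass approach being discussed.<br>
<br>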
> >>>>>> On Oct 17, 2013, at 12:49 PM, David Peixotto<br>
> >>>>>> <<a href="mailto:dpeixott@codeaurora.org">dpeixott@codeaurora.org</a>><br>
> >>>>>> wrote:<br>
> >>>>>><br>
> >>>>>>> Author: dpeixott<br>
> >>>>>>> Date: Thu Oct 17 14:49:22 2013<br>
> >>>>>>> New Revision: 192915<br>
> >>>>>>><br>
> >>>>>>> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=192915&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=192915&view=rev</a><br>
> >>>>>>> Log:<br>
> >>>>>>> Refactor lowering for COPY_STRUCT_BYVAL_I32<br>
> >>>>>>><br>
> >>>>>>> This commit refactors the lowering of the<br>
> COPY_STRUCT_BYVAL_I32<br>
> >>>>>>> pseudo-instruction in the ARM backend. We introduce a new<br>
> helper<br>
> >>>>>>> class that encapsulates all of the operations needed during the<br>
> >>> lowering.<br>
> >>>>>>> The operations are implemented for each subtarget in different<br>
> >>>>>>> subclasses. Currently only arm and thumb2 subtargets are<br>
> supported.<br>
> >>>>>>><br>
> >>>>>>> This refactoring was done to easily implement support for thumb1<br>
> >>>>>>> subtargets. This initial patch does not add support for thumb1,<br>
> >>>>>>> but is only a refactoring. A follow on patch will implement the<br>
> >>>>>>> support for<br>
> >>>>>>> thumb1 subtargets.<br>
> >>>>>>><br>
> >>>>>>> No intended functionality change.<br>
> >>>>>>><br>
> >>>>>>> Modified:<br>
> >>>>>>> llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp<br>
> >>>>>>><br>
> >>>>>>> Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp<br>
> >>>>>>> URL:<br>
> >>>>>>> <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp?rev=192915&r1=192914&r2=192915&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp?rev=192915&r1=192914&r2=192915&view=diff</a><br>
> >>>>>>><br>
> >>>>>>> ==============================================================================<br>
> >>>>>>> --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp (original)<br>
> >>>>>>> +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Thu Oct 17 14:49:22 2013<br>
> >>>>>>> @@ -48,6 +48,7 @@<br>
> >>>>>>> #include "llvm/Support/MathExtras.h"<br>
> >>>>>>> #include "llvm/Support/raw_ostream.h"<br>
> >>>>>>> #include "llvm/Target/TargetOptions.h"<br>
> >>>>>>> +#include <utility><br>
> >>>>>>> using namespace llvm;<br>
> >>>>>>><br>
> >>>>>>> STATISTIC(NumTailCalls, "Number of tail calls");<br>
> >>>>>>> @@ -7245,8 +7246,430 @@ MachineBasicBlock *OtherSucc(MachineBasi<br>
> >>>>>>> llvm_unreachable("Expecting a BB with two successors!");<br>
> >>>>>>> }<br>
> >>>>>>><br>
> >>>>>>> -MachineBasicBlock *ARMTargetLowering::<br>
> >>>>>>> -EmitStructByval(MachineInstr *MI, MachineBasicBlock *BB) const<br>
> >>>>>>> {<br>
> >>>>>>> +namespace {<br>
> >>>>>>> +// This class is a helper for lowering the COPY_STRUCT_BYVAL_I32 instruction.<br>
</div></div>> >>>>>>> +// It defines the operations needed to lower the byval copy. We use a helper<br>
<div class="im">> >>>>>>> +// class because the opcodes and machine instructions are different for each<br>
> >>>>>>> +// subtarget, but the overall algorithm for the lowering is the same. The<br>
</div>> >>>>>>> +// implementation of each operation will be defined separately for arm, thumb1,<br>
<div class="im">> >>>>>>> +// and thumb2 targets by subclassing this base class. See<br>
> >>>>>>> +// ARMTargetLowering::EmitStructByval() for how these operations are used.<br>
> >>>>>>> +class TargetStructByvalEmitter {<br>
> >>>>>>> +public:<br>
> >>>>>>> + TargetStructByvalEmitter(const TargetInstrInfo *TII_,<br>
> >>>>>>> + MachineRegisterInfo &MRI_,<br>
> >>>>>>> + const TargetRegisterClass *TRC_)<br>
> >>>>>>> + : TII(TII_), MRI(MRI_), TRC(TRC_) {}<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a post-increment load of "unit" size. The unit size is based on the<br>
</div>> >>>>>>> + // alignment of the struct being copied (4, 2, or 1 bytes). Alignments higher<br>
<div class="im">> >>>>>>> + // than 4 are handled separately by using NEON instructions.<br>
</div>> >>>>>>> + //<br>
<div class="im">> >>>>>>> + // \param baseReg the register holding the address to load.<br>
> >>>>>>> + // \param baseOut the register to receive the incremented address.<br>
> >>>>>>> + // \returns the register holding the loaded value.<br>
> >>>>>>> + virtual unsigned emitUnitLoad(MachineBasicBlock *BB,<br>
> >>>>>>> + MachineInstr<br>
> >>>>>> *MI,<br>
> >>>>>>> + DebugLoc &dl, unsigned baseReg,<br>
> >>>>>>> + unsigned baseOut) = 0;<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a post-increment store of "unit" size. The unit size is based on the<br>
</div>> >>>>>>> + // alignment of the struct being copied (4, 2, or 1 bytes). Alignments higher<br>
<div class="im">> >>>>>>> + // than 4 are handled separately by using NEON instructions.<br>
</div>> >>>>>>> + //<br>
<div class="im">> >>>>>>> + // \param baseReg the register holding the address to store.<br>
> >>>>>>> + // \param storeReg the register holding the value to store.<br>
> >>>>>>> + // \param baseOut the register to receive the incremented address.<br>
> >>>>>>> + virtual void emitUnitStore(MachineBasicBlock *BB,<br>
</div><div class="im">> >>>>>>> + MachineInstr<br>
> >> *MI,<br>
> >>>>>>> + DebugLoc &dl, unsigned baseReg,<br>
> >>>>>>> + unsigned<br>
</div><div class="im">> >>>>> storeReg,<br>
> >>>>>>> + unsigned baseOut) = 0;<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a post-increment load of one byte.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param baseReg the register holding the address to load.<br>
> >>>>>>> + // \param baseOut the register to receive the incremented address.<br>
> >>>>>>> + // \returns the register holding the loaded value.<br>
> >>>>>>> + virtual unsigned emitByteLoad(MachineBasicBlock *BB,<br>
> >>>>>>> + MachineInstr<br>
> >>>>>> *MI,<br>
> >>>>>>> + DebugLoc &dl, unsigned baseReg,<br>
> >>>>>>> + unsigned baseOut) = 0;<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a post-increment store of one byte.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param baseReg the register holding the address to store.<br>
> >>>>>>> + // \param storeReg the register holding the value to store.<br>
> >>>>>>> + // \param baseOut the register to receive the incremented address.<br>
> >>>>>>> + virtual void emitByteStore(MachineBasicBlock *BB,<br>
</div><div class="im">> >>>>>>> + MachineInstr<br>
> >> *MI,<br>
> >>>>>>> + DebugLoc &dl, unsigned baseReg,<br>
> >>>>>>> + unsigned<br>
</div><div><div class="h5">> >>>>> storeReg,<br>
> >>>>>>> + unsigned baseOut) = 0;<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a load of a constant value.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param Constant the constant value to load.<br>
> >>>>>>> + // \returns the register holding the loaded value.<br>
> >>>>>>> + virtual unsigned emitConstantLoad(MachineBasicBlock *BB,<br>
> >>>>>> MachineInstr *MI,<br>
> >>>>>>> + DebugLoc &dl, unsigned<br>
> > Constant,<br>
> >>>>>>> + const DataLayout *DL) = 0;<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a subtract of a register minus immediate, with the immediate equal to<br>
> >>>>>>> + // the "unit" size. The unit size is based on the alignment of the struct<br>
> >>>>>>> + // being copied (16, 8, 4, 2, or 1 bytes).<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param InReg the register holding the initial value.<br>
> >>>>>>> + // \param OutReg the register to receive the subtracted value.<br>
> >>>>>>> + virtual void emitSubImm(MachineBasicBlock *BB, MachineInstr<br>
> >>>>>>> + *MI,<br>
> >>>>>> DebugLoc &dl,<br>
> >>>>>>> + unsigned InReg, unsigned OutReg) = 0;<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a branch based on a condition code of not equal.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param TargetBB the destination of the branch.<br>
> >>>>>>> + virtual void emitBranchNE(MachineBasicBlock *BB, MachineInstr<br>
> >> *MI,<br>
> >>>>>>> + DebugLoc &dl, MachineBasicBlock<br>
> >>>>>>> + *TargetBB) = 0;<br>
> >>>>>>> +<br>
> >>>>>>> + // Find the constant pool index for the given constant. This method is<br>
> >>>>>>> + // implemented in the base class because it is the same for all subtargets.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param LoopSize the constant value for which the index should be returned.<br>
> >>>>>>> + // \returns the constant pool index for the constant.<br>
> >>>>>>> + unsigned getConstantPoolIndex(MachineFunction *MF, const<br>
> >>>>>> DataLayout *DL,<br>
> >>>>>>> + unsigned LoopSize) {<br>
> >>>>>>> + MachineConstantPool *ConstantPool = MF->getConstantPool();<br>
> >>>>>>> + Type *Int32Ty = Type::getInt32Ty(MF->getFunction()->getContext());<br>
> >>>>>>> + const Constant *C = ConstantInt::get(Int32Ty, LoopSize);<br>
> >>>>>>> +<br>
> >>>>>>> + // MachineConstantPool wants an explicit alignment.<br>
> >>>>>>> + unsigned Align = DL->getPrefTypeAlignment(Int32Ty);<br>
> >>>>>>> + if (Align == 0)<br>
> >>>>>>> + Align = DL->getTypeAllocSize(C->getType());<br>
> >>>>>>> + return ConstantPool->getConstantPoolIndex(C, Align); }<br>
> >>>>>>> +<br>
> >>>>>>> + // Return the register class used by the subtarget.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \returns the target register class.<br>
> >>>>>>> + const TargetRegisterClass *getTRC() const { return TRC; }<br>
> >>>>>>> +<br>
> >>>>>>> + virtual ~TargetStructByvalEmitter() {}<br>
> >>>>>>> +<br>
> >>>>>>> +protected:<br>
> >>>>>>> + const TargetInstrInfo *TII;<br>
> >>>>>>> + MachineRegisterInfo &MRI;<br>
> >>>>>>> + const TargetRegisterClass *TRC; };<br>
> >>>>>>> +<br>
> >>>>>>> +class ARMStructByvalEmitter : public TargetStructByvalEmitter {<br>
> >>>>>>> +public:<br>
> >>>>>>> + ARMStructByvalEmitter(const TargetInstrInfo *TII,<br>
> >>>>>>> +MachineRegisterInfo<br>
> >>>>>> &MRI,<br>
> >>>>>>> + unsigned LoadStoreSize)<br>
> >>>>>>> + : TargetStructByvalEmitter(<br>
> >>>>>>> + TII, MRI, (const TargetRegisterClass<br>
> >>> *)&ARM::GPRRegClass),<br>
> >>>>>>> + UnitSize(LoadStoreSize),<br>
> >>>>>>> + UnitLdOpc(LoadStoreSize == 4<br>
> >>>>>>> + ? ARM::LDR_POST_IMM<br>
> >>>>>>> + : LoadStoreSize == 2<br>
> >>>>>>> + ? ARM::LDRH_POST<br>
> >>>>>>> + : LoadStoreSize == 1 ?<br>
> >>>>>>> + ARM::LDRB_POST_IMM<br>
> >>> :<br>
> >>>>> 0),<br>
> >>>>>>> + UnitStOpc(LoadStoreSize == 4<br>
> >>>>>>> + ? ARM::STR_POST_IMM<br>
> >>>>>>> + : LoadStoreSize == 2<br>
> >>>>>>> + ? ARM::STRH_POST<br>
> >>>>>>> + : LoadStoreSize == 1 ?<br>
> >>>>>>> +ARM::STRB_POST_IMM<br>
> >>>>>>> +: 0) {}<br>
> >>>>>>> +<br>
> >>>>>>> + unsigned emitUnitLoad(MachineBasicBlock *BB, MachineInstr<br>
> >>>>>>> + *MI,<br>
> >>>>>> DebugLoc &dl,<br>
</div></div><div class="im">> >>>>>>> + unsigned baseReg, unsigned baseOut) {<br>
> >>>>>>> + unsigned scratch = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(UnitLdOpc),<br>
> >>>>>> scratch).addReg(<br>
> >>>>>>> + baseOut,<br>
> >>>>>> RegState::Define).addReg(baseReg).addReg(0).addImm(UnitSize));<br>
> >>>>>>> + return scratch;<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + void emitUnitStore(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>>>> DebugLoc &dl,<br>
> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>
> >>>>>>> + unsigned<br>
> >>>>> baseOut) {<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(UnitStOpc),<br>
> >>>>>> baseOut).addReg(<br>
> >>>>>>> + storeReg).addReg(baseReg).addReg(0).addImm(UnitSize));<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + unsigned emitByteLoad(MachineBasicBlock *BB, MachineInstr<br>
</div>> >>>>>>> + *MI,<br>
> >>>>>> DebugLoc &dl,<br>
<div><div class="h5">> >>>>>>> + unsigned baseReg, unsigned baseOut) {<br>
> >>>>>>> + unsigned scratch = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl,<br>
> >>>>>>> + TII->get(ARM::LDRB_POST_IMM),<br>
> >>>>>> scratch)<br>
> >>>>>>> + .addReg(baseOut,<br>
> >>>>> RegState::Define).addReg(baseReg)<br>
> >>>>>>> + .addReg(0).addImm(1));<br>
> >>>>>>> + return scratch;<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + void emitByteStore(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>>>> DebugLoc &dl,<br>
> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>
> >>>>>>> + unsigned<br>
> >>>>> baseOut) {<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl,<br>
> >>>>>>> + TII->get(ARM::STRB_POST_IMM),<br>
> >>>>>> baseOut)<br>
> >>>>>>> +<br>
> >>>>>>> + .addReg(storeReg).addReg(baseReg).addReg(0).addImm(1));<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + unsigned emitConstantLoad(MachineBasicBlock *BB,<br>
> MachineInstr<br>
> >>>> *MI,<br>
> >>>>>>> + DebugLoc &dl, unsigned Constant,<br>
> >>>>>>> + const DataLayout *DL) {<br>
> >>>>>>> + unsigned constReg = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> + unsigned Idx = getConstantPoolIndex(BB->getParent(), DL,<br>
> >>>> Constant);<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(ARM::LDRcp)).addReg(<br>
> >>>>>>> + constReg, RegState::Define).addConstantPoolIndex(Idx).addImm(0));<br>
> >>>>>>> + return constReg;<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + void emitSubImm(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>> DebugLoc<br>
> >>>>>> &dl,<br>
> >>>>>>> + unsigned InReg, unsigned OutReg) {<br>
> >>>>>>> + MachineInstrBuilder MIB =<br>
> >>>>>>> + BuildMI(*BB, MI, dl, TII->get(ARM::SUBri), OutReg);<br>
> >>>>>>> +<br>
> >>>>>><br>
> >> AddDefaultCC(AddDefaultPred(MIB.addReg(InReg).addImm(UnitSize)));<br>
> >>>>>>> + MIB->getOperand(5).setReg(ARM::CPSR);<br>
> >>>>>>> + MIB->getOperand(5).setIsDef(true);<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + void emitBranchNE(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>>>> DebugLoc &dl,<br>
> >>>>>>> + MachineBasicBlock *TargetBB) {<br>
> >>>>>>> + BuildMI(*BB, MI, dl, TII->get(ARM::Bcc)).addMBB(TargetBB).addImm(ARMCC::NE)<br>
> >>>>>>> + .addReg(ARM::CPSR);<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> +private:<br>
> >>>>>>> + const unsigned UnitSize;<br>
> >>>>>>> + const unsigned UnitLdOpc;<br>
> >>>>>>> + const unsigned UnitStOpc;<br>
> >>>>>>> +};<br>
> >>>>>>> +<br>
> >>>>>>> +class Thumb2StructByvalEmitter : public<br>
</div></div>> >>>>>>> +TargetStructByvalEmitter {<br>
<div class="im">> >>>>>>> +public:<br>
> >>>>>>> + Thumb2StructByvalEmitter(const TargetInstrInfo *TII,<br>
> >>>>>> MachineRegisterInfo &MRI,<br>
> >>>>>>> + unsigned LoadStoreSize)<br>
> >>>>>>> + : TargetStructByvalEmitter(<br>
> >>>>>>> + TII, MRI, (const TargetRegisterClass<br>
> >>> *)&ARM::tGPRRegClass),<br>
> >>>>>>> + UnitSize(LoadStoreSize),<br>
> >>>>>>> + UnitLdOpc(LoadStoreSize == 4<br>
> >>>>>>> + ? ARM::t2LDR_POST<br>
> >>>>>>> + : LoadStoreSize == 2<br>
> >>>>>>> + ? ARM::t2LDRH_POST<br>
> >>>>>>> + : LoadStoreSize == 1 ?<br>
> >>>>>>> + ARM::t2LDRB_POST<br>
> > :<br>
> >>>>> 0),<br>
> >>>>>>> + UnitStOpc(LoadStoreSize == 4<br>
> >>>>>>> + ? ARM::t2STR_POST<br>
> >>>>>>> + : LoadStoreSize == 2<br>
> >>>>>>> + ? ARM::t2STRH_POST<br>
> >>>>>>> + : LoadStoreSize == 1 ?<br>
> >>>>>>> + ARM::t2STRB_POST<br>
> > :<br>
> >>>>>>> +0) {}<br>
> >>>>>>> +<br>
> >>>>>>> + unsigned emitUnitLoad(MachineBasicBlock *BB, MachineInstr<br>
</div>> >>>>>>> + *MI,<br>
> >>>>>> DebugLoc &dl,<br>
<div class="im">> >>>>>>> + unsigned baseReg, unsigned baseOut) {<br>
> >>>>>>> + unsigned scratch = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(UnitLdOpc),<br>
> >>>>>> scratch).addReg(<br>
> >>>>>>> + baseOut,<br>
> >> RegState::Define).addReg(baseReg).addImm(UnitSize));<br>
> >>>>>>> + return scratch;<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + void emitUnitStore(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>>>> DebugLoc &dl,<br>
> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>
> >>>>>>> + unsigned<br>
> >>>>> baseOut) {<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(UnitStOpc),<br>
> >>>>>>> + baseOut)<br>
> >>>>>>> +<br>
> >>>>>>> + .addReg(storeReg).addReg(baseReg).addImm(UnitSize));<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + unsigned emitByteLoad(MachineBasicBlock *BB, MachineInstr<br>
</div>> >>>>>>> + *MI,<br>
> >>>>>> DebugLoc &dl,<br>
<div><div class="h5">> >>>>>>> + unsigned baseReg, unsigned baseOut) {<br>
> >>>>>>> + unsigned scratch = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl,<br>
> >>>>>>> + TII->get(ARM::t2LDRB_POST),<br>
> >>>>>> scratch)<br>
> >>>>>>> + .addReg(baseOut,<br>
> >>>>> RegState::Define).addReg(baseReg)<br>
> >>>>>>> + .addImm(1));<br>
> >>>>>>> + return scratch;<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + void emitByteStore(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>>>> DebugLoc &dl,<br>
> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>
> >>>>>>> + unsigned<br>
> >>>>> baseOut) {<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl,<br>
> >>>>>>> + TII->get(ARM::t2STRB_POST),<br>
> >>>>>> baseOut)<br>
> >>>>>>> +<br>
> >>>>>>> + .addReg(storeReg).addReg(baseReg).addImm(1));<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + unsigned emitConstantLoad(MachineBasicBlock *BB,<br>
> MachineInstr<br>
> >>>> *MI,<br>
> >>>>>>> + DebugLoc &dl, unsigned Constant,<br>
> >>>>>>> + const DataLayout *DL) {<br>
> >>>>>>> + unsigned VConst = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> + unsigned Vtmp = VConst;<br>
> >>>>>>> + if ((Constant & 0xFFFF0000) != 0)<br>
> >>>>>>> + Vtmp = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> + AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::t2MOVi16),<br>
> Vtmp)<br>
> >>>>>>> + .addImm(Constant & 0xFFFF));<br>
> >>>>>>> +<br>
> >>>>>>> + if ((Constant & 0xFFFF0000) != 0)<br>
> >>>>>>> + AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::t2MOVTi16),<br>
> >>> VConst)<br>
> >>>>>>> + .addReg(Vtmp).addImm(Constant >> 16));<br>
> >>>>>>> + return VConst;<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + void emitSubImm(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>> DebugLoc<br>
> >>>>>> &dl,<br>
> >>>>>>> + unsigned InReg, unsigned OutReg) {<br>
> >>>>>>> + MachineInstrBuilder MIB =<br>
> >>>>>>> + BuildMI(*BB, MI, dl, TII->get(ARM::t2SUBri), OutReg);<br>
> >>>>>>> +<br>
> >>>>>><br>
> >> AddDefaultCC(AddDefaultPred(MIB.addReg(InReg).addImm(UnitSize)));<br>
> >>>>>>> + MIB->getOperand(5).setReg(ARM::CPSR);<br>
> >>>>>>> + MIB->getOperand(5).setIsDef(true);<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + void emitBranchNE(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>>>> DebugLoc &dl,<br>
> >>>>>>> + MachineBasicBlock *TargetBB) {<br>
> >>>>>>> + BuildMI(BB, dl, TII->get(ARM::t2Bcc)).addMBB(TargetBB).addImm(ARMCC::NE)<br>
> >>>>>>> + .addReg(ARM::CPSR);<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> +private:<br>
> >>>>>>> + const unsigned UnitSize;<br>
> >>>>>>> + const unsigned UnitLdOpc;<br>
> >>>>>>> + const unsigned UnitStOpc;<br>
> >>>>>>> +};<br>
> >>>>>>> +<br>
> >>>>>>> +// This class is a thin wrapper that delegates most of the work<br>
> >>>>>>> +to the correct // TargetStructByvalEmitter implementation. It<br>
> >>>>>>> +also handles the lowering for // targets that support neon<br>
> >>>>>>> +because the neon implementation is the same for all // targets<br>
</div></div>> >>>>>>> +that<br>
<div><div class="h5">> >> support it.<br>
> >>>>>>> +class StructByvalEmitter {<br>
> >>>>>>> +public:<br>
> >>>>>>> + StructByvalEmitter(unsigned LoadStoreSize, const ARMSubtarget<br>
> >>>>>> *Subtarget,<br>
> >>>>>>> + const TargetInstrInfo *TII_,<br>
> >>>>>>> + MachineRegisterInfo<br>
> >>>>> &MRI_,<br>
> >>>>>>> + const DataLayout *DL_)<br>
> >>>>>>> + : UnitSize(LoadStoreSize),<br>
> >>>>>>> + TargetEmitter(<br>
> >>>>>>> + Subtarget->isThumb2()<br>
> >>>>>>> + ? static_cast<TargetStructByvalEmitter *>(<br>
> >>>>>>> + new Thumb2StructByvalEmitter(TII_, MRI_,<br>
> >>>>>>> + LoadStoreSize))<br>
> >>>>>>> + : static_cast<TargetStructByvalEmitter *>(<br>
> >>>>>>> + new ARMStructByvalEmitter(TII_, MRI_,<br>
> >>>>>>> + LoadStoreSize))),<br>
> >>>>>>> + TII(TII_), MRI(MRI_), DL(DL_),<br>
> >>>>>>> + VecTRC(UnitSize == 16<br>
> >>>>>>> + ? (const TargetRegisterClass<br>
> > *)&ARM::DPairRegClass<br>
> >>>>>>> + : UnitSize == 8<br>
> >>>>>>> + ? (const TargetRegisterClass<br>
> >>>>> *)&ARM::DPRRegClass<br>
> >>>>>>> + : 0),<br>
> >>>>>>> + VecLdOpc(UnitSize == 16 ? ARM::VLD1q32wb_fixed<br>
> >>>>>>> + : UnitSize == 8 ?<br>
> >>>>>>> + ARM::VLD1d32wb_fixed<br>
> >>>>> : 0),<br>
> >>>>>>> + VecStOpc(UnitSize == 16 ? ARM::VST1q32wb_fixed<br>
> >>>>>>> + : UnitSize == 8 ?<br>
> >>>>>>> +ARM::VST1d32wb_fixed : 0) {}<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a post-increment load of "unit" size. The unit size<br>
</div></div>> >>>>>>> + is based on the // alignment of the struct being copied (16,<br>
> >>>>>>> + 8, 4, 2, or 1 bytes). Loads of 16 // or 8 bytes use NEON<br>
> >>>>>>> + instructions to load<br>
<div><div class="h5">> >>>>> the<br>
> >>>>>> value.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param baseReg the register holding the address to load.<br>
> >>>>>>> + // \param baseOut the register to receive the incremented address. If baseOut<br>
> >>>>>>> + // is 0 then a new register is created to hold the incremented address.<br>
> >>>>>>> + // \returns a pair of registers holding the loaded value and<br>
> >>>>>>> + the updated // address.<br>
> >>>>>>> + std::pair<unsigned, unsigned> emitUnitLoad(MachineBasicBlock<br>
> >> *BB,<br>
> >>>>>>> + MachineInstr *MI,<br>
> >>>>>>> + DebugLoc<br>
> >>>>> &dl,<br>
> >>>>>>> + unsigned baseReg,<br>
> >>>>>>> + unsigned baseOut =<br>
> >>>>>>> + 0)<br>
> > {<br>
> >>>>>>> + unsigned scratch = 0;<br>
> >>>>>>> + if (baseOut == 0)<br>
> >>>>>>> + baseOut = MRI.createVirtualRegister(TargetEmitter->getTRC());<br>
> >>>>>>> + if (UnitSize >= 8) { // neon<br>
> >>>>>>> + scratch = MRI.createVirtualRegister(VecTRC);<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(VecLdOpc),<br>
> >>>>>> scratch).addReg(<br>
> >>>>>>> + baseOut, RegState::Define).addReg(baseReg).addImm(0));<br>
> >>>>>>> + } else {<br>
> >>>>>>> + scratch = TargetEmitter->emitUnitLoad(BB, MI, dl,<br>
> >>>>>>> + baseReg,<br>
> >>>>> baseOut);<br>
> >>>>>>> + }<br>
> >>>>>>> + return std::make_pair(scratch, baseOut); }<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a post-increment store of "unit" size. The unit size<br>
</div></div>> >>>>>>> + is based on the // alignment of the struct being copied (16,<br>
> >>>>>>> + 8, 4, 2, or 1 bytes). Stores of // 16 or 8 bytes use NEON<br>
<div class="im">> >>>>>>> + instructions to<br>
> >>>>> store the<br>
> >>>>>> value.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param baseReg the register holding the address to store.<br>
> >>>>>>> + // \param storeReg the register holding the value to store.<br>
> >>>>>>> + // \param baseOut the register to receive the incremented address. If baseOut<br>
> >>>>>>> + // is 0 then a new register is created to hold the incremented address.<br>
> >>>>>>> + // \returns the register holding the updated address.<br>
> >>>>>>> + unsigned emitUnitStore(MachineBasicBlock *BB, MachineInstr<br>
</div>> >>>>>>> + *MI,<br>
> >>>>>> DebugLoc &dl,<br>
<div><div class="h5">> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>
> >>>>>>> + unsigned baseOut = 0) {<br>
> >>>>>>> + if (baseOut == 0)<br>
> >>>>>>> + baseOut = MRI.createVirtualRegister(TargetEmitter->getTRC());<br>
> >>>>>>> + if (UnitSize >= 8) { // neon<br>
> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(VecStOpc),<br>
> >>> baseOut)<br>
> >>>>>>> +<br>
> >>> .addReg(baseReg).addImm(0).addReg(storeReg));<br>
> >>>>>>> + } else {<br>
> >>>>>>> + TargetEmitter->emitUnitStore(BB, MI, dl, baseReg,<br>
> >>>>>>> + storeReg,<br>
> >>>>>> baseOut);<br>
> >>>>>>> + }<br>
> >>>>>>> + return baseOut;<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a post-increment load of one byte.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param baseReg the register holding the address to load.<br>
> >>>>>>> + // \returns a pair of registers holding the loaded value and<br>
> >>>>>>> + the updated // address.<br>
> >>>>>>> + std::pair<unsigned, unsigned> emitByteLoad(MachineBasicBlock<br>
> >> *BB,<br>
> >>>>>>> + MachineInstr *MI,<br>
> >>>>>>> + DebugLoc<br>
> >>>>> &dl,<br>
> >>>>>>> + unsigned baseReg) {<br>
> >>>>>>> + unsigned baseOut = MRI.createVirtualRegister(TargetEmitter->getTRC());<br>
> >>>>>>> + unsigned scratch =<br>
> >>>>>>> + TargetEmitter->emitByteLoad(BB, MI, dl, baseReg, baseOut);<br>
> >>>>>>> + return std::make_pair(scratch, baseOut); }<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a post-increment store of one byte.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param baseReg the register holding the address to store.<br>
> >>>>>>> + // \param storeReg the register holding the value to store.<br>
> >>>>>>> + // \returns the register holding the updated address.<br>
> >>>>>>> + unsigned emitByteStore(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>>>>> + DebugLoc &dl, unsigned baseReg, unsigned storeReg) {<br>
> >>>>>>> + unsigned baseOut = MRI.createVirtualRegister(TargetEmitter->getTRC());<br>
> >>>>>>> + TargetEmitter->emitByteStore(BB, MI, dl, baseReg, storeReg, baseOut);<br>
> >>>>>>> + return baseOut;<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a load of the constant LoopSize.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param LoopSize the constant to load.<br>
> >>>>>>> + // \returns the register holding the loaded constant.<br>
> >>>>>>> + unsigned emitConstantLoad(MachineBasicBlock *BB, MachineInstr *MI,<br>
> >>>>>>> + DebugLoc &dl, unsigned LoopSize) {<br>
> >>>>>>> + return TargetEmitter->emitConstantLoad(BB, MI, dl,<br>
</div></div>> >>>>>>> + LoopSize, DL); }<br>
<div><div class="h5">> >>>>>>> +<br>
> >>>>>>> + // Emit a subtract of a register minus immediate, with the immediate<br>
> >>>>>>> + // equal to the "unit" size. The unit size is based on the alignment<br>
> >>>>>>> + // of the struct being copied (16, 8, 4, 2, or 1 bytes).<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param InReg the register holding the initial value.<br>
> >>>>>>> + // \param OutReg the register to receive the subtracted value.<br>
> >>>>>>> + void emitSubImm(MachineBasicBlock *BB, MachineInstr *MI, DebugLoc &dl,<br>
> >>>>>>> + unsigned InReg, unsigned OutReg) {<br>
> >>>>>>> + TargetEmitter->emitSubImm(BB, MI, dl, InReg, OutReg);<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + // Emit a branch based on a condition code of not equal.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \param TargetBB the destination of the branch.<br>
> >>>>>>> + void emitBranchNE(MachineBasicBlock *BB, MachineInstr *MI, DebugLoc &dl,<br>
> >>>>>>> + MachineBasicBlock *TargetBB) {<br>
> >>>>>>> + TargetEmitter->emitBranchNE(BB, MI, dl, TargetBB);<br>
> >>>>>>> + }<br>
> >>>>>>> +<br>
> >>>>>>> + // Return the register class used by the subtarget.<br>
> >>>>>>> + //<br>
> >>>>>>> + // \returns the target register class.<br>
> >>>>>>> + const TargetRegisterClass *getTRC() const { return<br>
> >>>>>>> + TargetEmitter->getTRC(); }<br>
> >>>>>>> +<br>
> >>>>>>> +private:<br>
> >>>>>>> + const unsigned UnitSize;<br>
> >>>>>>> + OwningPtr<TargetStructByvalEmitter> TargetEmitter;<br>
> >>>>>>> + const TargetInstrInfo *TII;<br>
> >>>>>>> + MachineRegisterInfo &MRI;<br>
> >>>>>>> + const DataLayout *DL;<br>
> >>>>>>> +<br>
> >>>>>>> + const TargetRegisterClass *VecTRC;<br>
> >>>>>>> + const unsigned VecLdOpc;<br>
> >>>>>>> + const unsigned VecStOpc;<br>
> >>>>>>> +};<br>
> >>>>>>> +}<br>
> >>>>>>> +<br>
> >>>>>>> +MachineBasicBlock *<br>
> >>>>>>> +ARMTargetLowering::EmitStructByval(MachineInstr *MI,<br>
> >>>>>>> + MachineBasicBlock *BB) const<br>
</div></div>> >>>>>>> +{<br>
<div class="HOEnZb"><div class="h5">> >>>>>>> // This pseudo instruction has 3 operands: dst, src, size // We<br>
> >>>>>>> expand it to a loop if size > Subtarget-<br>
> >>>>>>> getMaxInlineSizeThreshold().<br>
> >>>>>>> // Otherwise, we will generate unrolled scalar copies.<br>
> >>>>>>> @@ -7261,23 +7684,13 @@ EmitStructByval(MachineInstr *MI,<br>
> >> Machin<br>
> >>>>>>> unsigned Align = MI->getOperand(3).getImm();<br>
> >>>>>>> DebugLoc dl = MI->getDebugLoc();<br>
> >>>>>>><br>
> >>>>>>> - bool isThumb2 = Subtarget->isThumb2();<br>
> >>>>>>> MachineFunction *MF = BB->getParent();<br>
> >>>>>>> MachineRegisterInfo &MRI = MF->getRegInfo();<br>
> >>>>>>> - unsigned ldrOpc, strOpc, UnitSize = 0;<br>
> >>>>>>> -<br>
> >>>>>>> - const TargetRegisterClass *TRC = isThumb2 ?<br>
> >>>>>>> - (const TargetRegisterClass*)&ARM::tGPRRegClass :<br>
> >>>>>>> - (const TargetRegisterClass*)&ARM::GPRRegClass;<br>
> >>>>>>> - const TargetRegisterClass *TRC_Vec = 0;<br>
> >>>>>>> + unsigned UnitSize = 0;<br>
> >>>>>>><br>
> >>>>>>> if (Align & 1) {<br>
> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDRB_POST : ARM::LDRB_POST_IMM;<br>
> >>>>>>> - strOpc = isThumb2 ? ARM::t2STRB_POST : ARM::STRB_POST_IMM;<br>
> >>>>>>> UnitSize = 1;<br>
> >>>>>>> } else if (Align & 2) {<br>
> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDRH_POST : ARM::LDRH_POST;<br>
> >>>>>>> - strOpc = isThumb2 ? ARM::t2STRH_POST : ARM::STRH_POST;<br>
> >>>>>>> UnitSize = 2;<br>
> >>>>>>> } else {<br>
> >>>>>>> // Check whether we can use NEON instructions.<br>
> >>>>>>> @@ -7285,27 +7698,18 @@ EmitStructByval(MachineInstr *MI,<br>
> >> Machin<br>
> >>>>>>> hasAttribute(AttributeSet::FunctionIndex,<br>
> >>>>>>> Attribute::NoImplicitFloat) &&<br>
> >>>>>>> Subtarget->hasNEON()) {<br>
> >>>>>>> - if ((Align % 16 == 0) && SizeVal >= 16) {<br>
> >>>>>>> - ldrOpc = ARM::VLD1q32wb_fixed;<br>
> >>>>>>> - strOpc = ARM::VST1q32wb_fixed;<br>
> >>>>>>> + if ((Align % 16 == 0) && SizeVal >= 16)<br>
> >>>>>>> UnitSize = 16;<br>
> >>>>>>> - TRC_Vec = (const TargetRegisterClass*)&ARM::DPairRegClass;<br>
> >>>>>>> - }<br>
> >>>>>>> - else if ((Align % 8 == 0) && SizeVal >= 8) {<br>
> >>>>>>> - ldrOpc = ARM::VLD1d32wb_fixed;<br>
> >>>>>>> - strOpc = ARM::VST1d32wb_fixed;<br>
> >>>>>>> + else if ((Align % 8 == 0) && SizeVal >= 8)<br>
> >>>>>>> UnitSize = 8;<br>
> >>>>>>> - TRC_Vec = (const TargetRegisterClass*)&ARM::DPRRegClass;<br>
> >>>>>>> - }<br>
> >>>>>>> }<br>
> >>>>>>> // Can't use NEON instructions.<br>
> >>>>>>> - if (UnitSize == 0) {<br>
> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDR_POST : ARM::LDR_POST_IMM;<br>
> >>>>>>> - strOpc = isThumb2 ? ARM::t2STR_POST : ARM::STR_POST_IMM;<br>
> >>>>>>> + if (UnitSize == 0)<br>
> >>>>>>> UnitSize = 4;<br>
> >>>>>>> - }<br>
> >>>>>>> }<br>
> >>>>>>><br>
> >>>>>>> + StructByvalEmitter ByvalEmitter(UnitSize, Subtarget, TII, MRI,<br>
> >>>>>>> + getDataLayout());<br>
> >>>>>>> unsigned BytesLeft = SizeVal % UnitSize;<br>
> >>>>>>> unsigned LoopSize = SizeVal - BytesLeft;<br>
> >>>>>>><br>
> >>>>>>> @@ -7316,67 +7720,22 @@ EmitStructByval(MachineInstr *MI,<br>
> >> Machin<br>
> >>>>>>> unsigned srcIn = src;<br>
> >>>>>>> unsigned destIn = dest;<br>
> >>>>>>> for (unsigned i = 0; i < LoopSize; i+=UnitSize) {<br>
> >>>>>>> - unsigned scratch = MRI.createVirtualRegister(UnitSize >= 8 ? TRC_Vec:TRC);<br>
> >>>>>>> - unsigned srcOut = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned destOut = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - if (UnitSize >= 8) {<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>
> >>>>>>> - TII->get(ldrOpc), scratch)<br>
> >>>>>>> - .addReg(srcOut, RegState::Define).addReg(srcIn).addImm(0));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc), destOut)<br>
> >>>>>>> - .addReg(destIn).addImm(0).addReg(scratch));<br>
> >>>>>>> - } else if (isThumb2) {<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>
> >>>>>>> - TII->get(ldrOpc), scratch)<br>
> >>>>>>> - .addReg(srcOut, RegState::Define).addReg(srcIn).addImm(UnitSize));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc), destOut)<br>
> >>>>>>> - .addReg(scratch).addReg(destIn)<br>
> >>>>>>> - .addImm(UnitSize));<br>
> >>>>>>> - } else {<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>
> >>>>>>> - TII->get(ldrOpc), scratch)<br>
> >>>>>>> - .addReg(srcOut, RegState::Define).addReg(srcIn).addReg(0)<br>
> >>>>>>> - .addImm(UnitSize));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc), destOut)<br>
> >>>>>>> - .addReg(scratch).addReg(destIn)<br>
> >>>>>>> - .addReg(0).addImm(UnitSize));<br>
> >>>>>>> - }<br>
> >>>>>>> - srcIn = srcOut;<br>
> >>>>>>> - destIn = destOut;<br>
> >>>>>>> + std::pair<unsigned, unsigned> res =<br>
> >>>>>>> + ByvalEmitter.emitUnitLoad(BB, MI, dl, srcIn);<br>
> >>>>>>> + unsigned scratch = res.first;<br>
> >>>>>>> + srcIn = res.second;<br>
> >>>>>>> + destIn = ByvalEmitter.emitUnitStore(BB, MI, dl, destIn,<br>
> >>>>>>> + scratch);<br>
> >>>>>>> }<br>
> >>>>>>><br>
> >>>>>>> // Handle the leftover bytes with LDRB and STRB.<br>
> >>>>>>> // [scratch, srcOut] = LDRB_POST(srcIn, 1)<br>
> >>>>>>> // [destOut] = STRB_POST(scratch, destIn, 1)<br>
> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDRB_POST : ARM::LDRB_POST_IMM;<br>
> >>>>>>> - strOpc = isThumb2 ? ARM::t2STRB_POST : ARM::STRB_POST_IMM;<br>
> >>>>>>> for (unsigned i = 0; i < BytesLeft; i++) {<br>
> >>>>>>> - unsigned scratch = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned srcOut = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned destOut = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - if (isThumb2) {<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>
> >>>>>>> - TII->get(ldrOpc),scratch)<br>
> >>>>>>> - .addReg(srcOut, RegState::Define).addReg(srcIn).addImm(1));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc), destOut)<br>
> >>>>>>> - .addReg(scratch).addReg(destIn)<br>
> >>>>>>> - .addImm(1));<br>
> >>>>>>> - } else {<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>
> >>>>>>> - TII->get(ldrOpc),scratch)<br>
> >>>>>>> - .addReg(srcOut, RegState::Define).addReg(srcIn)<br>
> >>>>>>> - .addReg(0).addImm(1));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc), destOut)<br>
> >>>>>>> - .addReg(scratch).addReg(destIn)<br>
> >>>>>>> - .addReg(0).addImm(1));<br>
> >>>>>>> - }<br>
> >>>>>>> - srcIn = srcOut;<br>
> >>>>>>> - destIn = destOut;<br>
> >>>>>>> + std::pair<unsigned, unsigned> res =<br>
> >>>>>>> + ByvalEmitter.emitByteLoad(BB, MI, dl, srcIn);<br>
> >>>>>>> + unsigned scratch = res.first;<br>
> >>>>>>> + srcIn = res.second;<br>
> >>>>>>> + destIn = ByvalEmitter.emitByteStore(BB, MI, dl, destIn,<br>
> >>>>>>> + scratch);<br>
> >>>>>>> }<br>
> >>>>>>> MI->eraseFromParent(); // The instruction is gone now.<br>
> >>>>>>> return BB;<br>
> >>>>>>> @@ -7414,34 +7773,7 @@ EmitStructByval(MachineInstr *MI,<br>
> Machin<br>
> >>>>>>> exitMBB->transferSuccessorsAndUpdatePHIs(BB);<br>
> >>>>>>><br>
> >>>>>>> // Load an immediate to varEnd.<br>
> >>>>>>> - unsigned varEnd = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - if (isThumb2) {<br>
> >>>>>>> - unsigned VReg1 = varEnd;<br>
> >>>>>>> - if ((LoopSize & 0xFFFF0000) != 0)<br>
> >>>>>>> - VReg1 = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::t2MOVi16), VReg1)<br>
> >>>>>>> - .addImm(LoopSize & 0xFFFF));<br>
> >>>>>>> -<br>
> >>>>>>> - if ((LoopSize & 0xFFFF0000) != 0)<br>
> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::t2MOVTi16), varEnd)<br>
> >>>>>>> - .addReg(VReg1)<br>
> >>>>>>> - .addImm(LoopSize >> 16));<br>
> >>>>>>> - } else {<br>
> >>>>>>> - MachineConstantPool *ConstantPool = MF->getConstantPool();<br>
> >>>>>>> - Type *Int32Ty = Type::getInt32Ty(MF->getFunction()->getContext());<br>
> >>>>>>> - const Constant *C = ConstantInt::get(Int32Ty, LoopSize);<br>
> >>>>>>> -<br>
> >>>>>>> - // MachineConstantPool wants an explicit alignment.<br>
> >>>>>>> - unsigned Align = getDataLayout()->getPrefTypeAlignment(Int32Ty);<br>
> >>>>>>> - if (Align == 0)<br>
> >>>>>>> - Align = getDataLayout()->getTypeAllocSize(C->getType());<br>
> >>>>>>> - unsigned Idx = ConstantPool->getConstantPoolIndex(C, Align);<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::LDRcp))<br>
> >>>>>>> - .addReg(varEnd, RegState::Define)<br>
> >>>>>>> - .addConstantPoolIndex(Idx)<br>
> >>>>>>> - .addImm(0));<br>
> >>>>>>> - }<br>
> >>>>>>> + unsigned varEnd = ByvalEmitter.emitConstantLoad(BB, MI, dl,<br>
> >>>>>>> + LoopSize);<br>
> >>>>>>> BB->addSuccessor(loopMBB);<br>
> >>>>>>><br>
> >>>>>>> // Generate the loop body:<br>
> >>>>>>> @@ -7450,12 +7782,12 @@ EmitStructByval(MachineInstr *MI,<br>
> >> Machin<br>
> >>>>>>> // destPhi = PHI(destLoop, dst)<br>
> >>>>>>> MachineBasicBlock *entryBB = BB;<br>
> >>>>>>> BB = loopMBB;<br>
> >>>>>>> - unsigned varLoop = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned varPhi = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned srcLoop = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned srcPhi = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned destLoop = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned destPhi = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> + unsigned varLoop =<br>
> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>
> >>>>>>> + unsigned varPhi =<br>
> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>
> >>>>>>> + unsigned srcLoop =<br>
> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>
> >>>>>>> + unsigned srcPhi =<br>
> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>
> >>>>>>> + unsigned destLoop =<br>
> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>
> >>>>>>> + unsigned destPhi =<br>
> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>
> >>>>>>><br>
> >>>>>>> BuildMI(*BB, BB->begin(), dl, TII->get(ARM::PHI), varPhi)<br>
> >>>>>>> .addReg(varLoop).addMBB(loopMBB)<br>
> >>>>>>> @@ -7469,39 +7801,16 @@ EmitStructByval(MachineInstr *MI, Machin<br>
> >>>>>>><br>
> >>>>>>> // [scratch, srcLoop] = LDR_POST(srcPhi, UnitSize)<br>
> >>>>>>> // [destLoop] = STR_POST(scratch, destPhi, UnitSiz)<br>
> >>>>>>> - unsigned scratch = MRI.createVirtualRegister(UnitSize >= 8 ? TRC_Vec:TRC);<br>
> >>>>>>> - if (UnitSize >= 8) {<br>
> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ldrOpc), scratch)<br>
> >>>>>>> - .addReg(srcLoop, RegState::Define).addReg(srcPhi).addImm(0));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(strOpc), destLoop)<br>
> >>>>>>> - .addReg(destPhi).addImm(0).addReg(scratch));<br>
> >>>>>>> - } else if (isThumb2) {<br>
> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ldrOpc), scratch)<br>
> >>>>>>> - .addReg(srcLoop, RegState::Define).addReg(srcPhi).addImm(UnitSize));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(strOpc), destLoop)<br>
> >>>>>>> - .addReg(scratch).addReg(destPhi)<br>
> >>>>>>> - .addImm(UnitSize));<br>
> >>>>>>> - } else {<br>
> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ldrOpc), scratch)<br>
> >>>>>>> - .addReg(srcLoop, RegState::Define).addReg(srcPhi).addReg(0)<br>
> >>>>>>> - .addImm(UnitSize));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(strOpc), destLoop)<br>
> >>>>>>> - .addReg(scratch).addReg(destPhi)<br>
> >>>>>>> - .addReg(0).addImm(UnitSize));<br>
> >>>>>>> + {<br>
> >>>>>>> + std::pair<unsigned, unsigned> res =<br>
> >>>>>>> + ByvalEmitter.emitUnitLoad(BB, BB->end(), dl, srcPhi, srcLoop);<br>
> >>>>>>> + unsigned scratch = res.first;<br>
> >>>>>>> + ByvalEmitter.emitUnitStore(BB, BB->end(), dl, destPhi,<br>
> >>>>>>> + scratch, destLoop);<br>
> >>>>>>> }<br>
> >>>>>>><br>
> >>>>>>> // Decrement loop variable by UnitSize.<br>
> >>>>>>> - MachineInstrBuilder MIB = BuildMI(BB, dl,<br>
> >>>>>>> - TII->get(isThumb2 ? ARM::t2SUBri : ARM::SUBri), varLoop);<br>
> >>>>>>> - AddDefaultCC(AddDefaultPred(MIB.addReg(varPhi).addImm(UnitSize)));<br>
> >>>>>>> - MIB->getOperand(5).setReg(ARM::CPSR);<br>
> >>>>>>> - MIB->getOperand(5).setIsDef(true);<br>
> >>>>>>> -<br>
> >>>>>>> - BuildMI(BB, dl, TII->get(isThumb2 ? ARM::t2Bcc : ARM::Bcc))<br>
> >>>>>>> - .addMBB(loopMBB).addImm(ARMCC::NE).addReg(ARM::CPSR);<br>
> >>>>>>> + ByvalEmitter.emitSubImm(BB, BB->end(), dl, varPhi, varLoop);<br>
> >>>>>>> + ByvalEmitter.emitBranchNE(BB, BB->end(), dl, loopMBB);<br>
> >>>>>>><br>
> >>>>>>> // loopMBB can loop back to loopMBB or fall through to exitMBB.<br>
> >>>>>>> BB->addSuccessor(loopMBB);<br>
> >>>>>>> @@ -7510,36 +7819,17 @@ EmitStructByval(MachineInstr *MI, Machin<br>
> >>>>>>> // Add epilogue to handle BytesLeft.<br>
> >>>>>>> BB = exitMBB;<br>
> >>>>>>> MachineInstr *StartOfExit = exitMBB->begin();<br>
> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDRB_POST : ARM::LDRB_POST_IMM;<br>
> >>>>>>> - strOpc = isThumb2 ? ARM::t2STRB_POST : ARM::STRB_POST_IMM;<br>
> >>>>>>><br>
> >>>>>>> // [scratch, srcOut] = LDRB_POST(srcLoop, 1)<br>
> >>>>>>> // [destOut] = STRB_POST(scratch, destLoop, 1)<br>
> >>>>>>> unsigned srcIn = srcLoop;<br>
> >>>>>>> unsigned destIn = destLoop;<br>
> >>>>>>> for (unsigned i = 0; i < BytesLeft; i++) {<br>
> >>>>>>> - unsigned scratch = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned srcOut = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - unsigned destOut = MRI.createVirtualRegister(TRC);<br>
> >>>>>>> - if (isThumb2) {<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, StartOfExit, dl,<br>
> >>>>>>> - TII->get(ldrOpc),scratch)<br>
> >>>>>>> - .addReg(srcOut, RegState::Define).addReg(srcIn).addImm(1));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, StartOfExit, dl, TII->get(strOpc), destOut)<br>
> >>>>>>> - .addReg(scratch).addReg(destIn)<br>
> >>>>>>> - .addImm(1));<br>
> >>>>>>> - } else {<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, StartOfExit, dl,<br>
> >>>>>>> - TII->get(ldrOpc),scratch)<br>
> >>>>>>> - .addReg(srcOut, RegState::Define).addReg(srcIn).addReg(0).addImm(1));<br>
> >>>>>>> -<br>
> >>>>>>> - AddDefaultPred(BuildMI(*BB, StartOfExit, dl, TII->get(strOpc), destOut)<br>
> >>>>>>> - .addReg(scratch).addReg(destIn)<br>
> >>>>>>> - .addReg(0).addImm(1));<br>
> >>>>>>> - }<br>
> >>>>>>> - srcIn = srcOut;<br>
> >>>>>>> - destIn = destOut;<br>
> >>>>>>> + std::pair<unsigned, unsigned> res =<br>
> >>>>>>> + ByvalEmitter.emitByteLoad(BB, StartOfExit, dl, srcIn);<br>
> >>>>>>> + unsigned scratch = res.first;<br>
> >>>>>>> + srcIn = res.second;<br>
> >>>>>>> + destIn = ByvalEmitter.emitByteStore(BB, StartOfExit, dl,<br>
> >>>>>>> + destIn, scratch);<br>
> >>>>>>> }<br>
> >>>>>>><br>
> >>>>>>> MI->eraseFromParent(); // The instruction is gone now.<br>
> >>>>>>><br>
> >>>>>>><br>
> >>>>>>> _______________________________________________<br>
> >>>>>>> llvm-commits mailing list<br>
> >>>>>>> <a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>
> >>>>>>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
> >>>>><br>
> >>>>><br>
> >>><br>
> >>><br>
> ><br>
> ><br>
<br>
</div></div><br>
<br></blockquote></div><br></div>
</blockquote></div><br></body></html>