<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Yes, I would prefer to just create the scratch register in one place. I’m not convinced that we can’t clean it up some more, but I’d say you should go ahead and commit this. It’s a good start.<div><br><div><div>On Oct 23, 2013, at 2:35 PM, David Peixotto <<a href="mailto:dpeixott@codeaurora.org">dpeixott@codeaurora.org</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div lang="EN-US" link="blue" vlink="purple" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div class="WordSection1" style="page: WordSection1;"><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">Hi, Manman and Bob. Thanks for the review.<o:p></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);"> </span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">Manman,<o:p></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">This patch is undoing the earlier refactoring that avoided code duplication. 
Bob was unhappy with the amount of source code it produced when I refactored to avoid the duplication and I produced this latest patch to revert the refactoring but keep support for thumb1.<o:p></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);"> </span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">Bob,<o:p></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">I had purposely defined the scratch on each branch to limit the scope of the variable and keep the definition close to the use. I can hoist it out of the conditionals if you think that is better.<o:p></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);"> </span></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">--<span class="Apple-converted-space"> </span></span><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);">Qualcomm Innovation Center, Inc. 
is a member of Code Aurora Forum, hosted by The Linux Foundation<o:p></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);"> </span></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);"> </span></div><div style="border-style: none none none solid; border-left-color: blue; border-left-width: 1.5pt; padding: 0in 0in 0in 4pt;"><div><div style="border-style: solid none none; border-top-color: rgb(181, 196, 223); border-top-width: 1pt; padding: 3pt 0in 0in;"><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><b><span style="font-size: 10pt; font-family: Tahoma, sans-serif;">From:</span></b><span style="font-size: 10pt; font-family: Tahoma, sans-serif;"><span class="Apple-converted-space"> </span>Bob Wilson [<a href="mailto:bob.wilson@apple.com">mailto:bob.wilson@apple.com</a>]<span class="Apple-converted-space"> </span><br><b>Sent:</b><span class="Apple-converted-space"> </span>Wednesday, October 23, 2013 12:47 PM<br><b>To:</b><span class="Apple-converted-space"> </span>Manman Ren<br><b>Cc:</b><span class="Apple-converted-space"> </span>David Peixotto; <a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br><b>Subject:</b><span class="Apple-converted-space"> </span>Re: [llvm] r192915 - Refactor lowering for COPY_STRUCT_BYVAL_I32<o:p></o:p></span></div></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><o:p> </o:p></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><o:p> </o:p></div><div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">On Oct 23, 2013, at 12:32 PM, 
Manman Ren <<a href="mailto:manman.ren@gmail.com" style="color: purple; text-decoration: underline;">manman.ren@gmail.com</a>> wrote:<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><br><br><o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">Hi David,<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><o:p> </o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">Can we use helper functions for emitUnitLoad|Store emitByteLoad|Store to reduce code duplication?<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><o:p> </o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">Thanks,<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">Manman<o:p></o:p></div></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><o:p> </o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">Yes, that would be great, if it works.<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><o:p> </o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">The only other thing I noticed is that this patch has several places where it creates a scratch virtual register on every branch of a conditional. 
Those could be moved outside the conditionals.<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><br><br><o:p></o:p></div><div><p class="MsoNormal" style="margin: 0in 0in 12pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><o:p> </o:p></p><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">On Tue, Oct 22, 2013 at 3:50 PM, David Peixotto <<a href="mailto:dpeixott@codeaurora.org" target="_blank" style="color: purple; text-decoration: underline;">dpeixott@codeaurora.org</a>> wrote:<o:p></o:p></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">I've attached a patch that removes the class in place of inline<br>conditionals. Please help to review this patch.<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;"><br>-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted<br>by The Linux Foundation<br><br><br><br>> -----Original Message-----<br>> From: Bob Wilson [mailto:<a href="mailto:bob.wilson@apple.com" style="color: purple; text-decoration: underline;">bob.wilson@apple.com</a>]<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> Sent: Tuesday, October 22, 2013 9:15 AM<br>> To: David Peixotto<br>> Cc:<span class="Apple-converted-space"> </span><a href="mailto:llvm-commits@cs.uiuc.edu" style="color: purple; text-decoration: underline;">llvm-commits@cs.uiuc.edu</a><br>> Subject: Re: [llvm] r192915 - Refactor lowering for COPY_STRUCT_BYVAL_I32<br>><o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> Thanks. At least try it out.. 
If it turns out to be unreadable with all<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">the<br>> conditionals, we can reconsider.<br>><br>> On Oct 22, 2013, at 9:13 AM, David Peixotto <<a href="mailto:dpeixott@codeaurora.org" style="color: purple; text-decoration: underline;">dpeixott@codeaurora.org</a>><br>> wrote:<br>><br>> > Ok, I will make a patch to switch to using inline conditionals instead.<br>> ><br>> > -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,<br>> > hosted by The Linux Foundation<br>> ><br>> ><br>> ><br>> >> -----Original Message-----<br>> >> From: Bob Wilson [mailto:<a href="mailto:bob.wilson@apple.com" style="color: purple; text-decoration: underline;">bob.wilson@apple.com</a>]<br>> >> Sent: Monday, October 21, 2013 9:18 PM<br>> >> To: David Peixotto<br>> >> Cc:<span class="Apple-converted-space"> </span><a href="mailto:llvm-commits@cs.uiuc.edu" style="color: purple; text-decoration: underline;">llvm-commits@cs.uiuc.edu</a><span class="Apple-converted-space"> </span>LLVM<br>> >> Subject: Re: [llvm] r192915 - Refactor lowering for<br>> >> COPY_STRUCT_BYVAL_I32<br>> >><br>> >><br>> >> On Oct 21, 2013, at 7:20 PM, David Peixotto <<a href="mailto:dpeixott@codeaurora.org" style="color: purple; text-decoration: underline;">dpeixott@codeaurora.org</a>><br>> >> wrote:<br>> >><br>> >>> Hi Bob,<br>> >>><br>> >>> I agree that a generic emitter would be useful, but I'm not sure I<br>> >>> would get the time to work on such a project at this point.<br>> >><br>> >> In that case, I think it would be better to simplify this code to use<br>> >> a<br>> > more<br>> >> direct approach with conditionals.<br>> >><br>> >>><br>> >>> -- Qualcomm Innovation Center, Inc. 
is a member of Code Aurora<br>> >>> Forum, hosted by The Linux Foundation<br>> >>><br>> >>><br>> >>>> -----Original Message-----<br>> >>>> From: Bob Wilson [mailto:<a href="mailto:bob.wilson@apple.com" style="color: purple; text-decoration: underline;">bob.wilson@apple.com</a>]<br>> >>>> Sent: Monday, October 21, 2013 12:12 PM<br>> >>>> To: David Peixotto<br>> >>>> Cc:<span class="Apple-converted-space"> </span><a href="mailto:llvm-commits@cs.uiuc.edu" style="color: purple; text-decoration: underline;">llvm-commits@cs.uiuc.edu</a><br>> >>>> Subject: Re: [llvm] r192915 - Refactor lowering for<br>> >>>> COPY_STRUCT_BYVAL_I32<br>> >>>><br>> >>>> It seems like it make more sense to make the StructByvalEmitter a<br>> >>>> generic "emitter" for ARM instructions. I suspect there are a<br>> >>>> number of other<br>> >>> places<br>> >>>> in ARMISelLowering that could make good use of it. Would you be<br>> >>>> willing<br>> >>> to<br>> >>>> investigate that?<br>> >>>><br>> >>>> As it stands, this really does seem like overkill for the struct<br>> >>>> byval<br>> >>> issue, but if<br>> >>>> you could make it more generally useful, then the extra code would<br>> >>>> be worthwhile.<br>> >>>><br>> >>>> On Oct 21, 2013, at 10:02 AM, David Peixotto<br>> >>>> <<a href="mailto:dpeixott@codeaurora.org" style="color: purple; text-decoration: underline;">dpeixott@codeaurora.org</a>><br>> >>>> wrote:<br>> >>>><br>> >>>>> Hi Bob,<br>> >>>>><br>> >>>>> I think your criticism is valid here. I wasn't too happy with how<br>> >>>>> much code I ended up writing for this change. 
When I started I<br>> >>>>> thought the code size would be about equal after implementing the<br>> >>>>> thumb1 lowering because I was getting rid of some code<br>> >>>>> duplication, but the code size for abstracting the common parts<br>> >>>>> was larger than I<br>> > had<br>> >> anticipated.<br>> >>>>> There is no fundamental problem with implementing it with<br>> >>>>> conditionals, I did it this way because I thought it would be<br>> >>>>> clearer<br>> >>> and<br>> >>>> would be easier to write correctly.<br>> >>>>><br>> >>>>> I think the way it is now has the advantage that the lowering<br>> >>>>> algorithm is clearly separated from the details of generating<br>> >>>>> machine instructions for each sub-target. I think it would be<br>> >>>>> easier to improve the algorithm with the way it is now. For<br>> >>>>> example, we are always using byte stores to copy any leftover that<br>> >>>>> does not fit into the "unit" size. So if we have a 31-byte struct<br>> >>>>> and a target that supports neon we will generate a 16-byte store<br>> >>>>> and<br>> >>>>> 15 1-byte stores. We could improve this by generating fewer stores<br>> >>>>> for the leftover (3x4-byte + 1x2byte + 1x1byte). I don't know if<br>> >>>>> we actually care about this kind of change, but I believe it would<br>> >>>>> be<br>> >>> easier to<br>> >>>> make now.<br>> >>>>><br>> >>>>> -- Qualcomm Innovation Center, Inc. 
is a member of Code Aurora<br>> >>>>> Forum, hosted by The Linux Foundation<br>> >>>>><br>> >>>>><br>> >>>>>> -----Original Message-----<br>> >>>>>> From: Bob Wilson [mailto:<a href="mailto:bob.wilson@apple.com" style="color: purple; text-decoration: underline;">bob.wilson@apple.com</a>]<br>> >>>>>> Sent: Saturday, October 19, 2013 7:00 PM<br>> >>>>>> To: David Peixotto<br>> >>>>>> Cc:<span class="Apple-converted-space"> </span><a href="mailto:llvm-commits@cs.uiuc.edu" style="color: purple; text-decoration: underline;">llvm-commits@cs.uiuc.edu</a><br>> >>>>>> Subject: Re: [llvm] r192915 - Refactor lowering for<br>> >>>>>> COPY_STRUCT_BYVAL_I32<br>> >>>>>><br>> >>>>>> This is very nice and elegant, but it's an awful lot of code for<br>> >>>>>> something<br>> >>>>> that<br>> >>>>>> isn't really that complicated. It seems like overkill to me.<br>> >>>>>> Did you<br>> >>>>> consider<br>> >>>>>> implementing the Thumb1 support by just adding more<br>> conditionals?<br>> >>>>>> Is there a fundamental problem with that?<br>> >>>>>><br>> >>>>>> On Oct 17, 2013, at 12:49 PM, David Peixotto<br>> >>>>>> <<a href="mailto:dpeixott@codeaurora.org" style="color: purple; text-decoration: underline;">dpeixott@codeaurora.org</a>><br>> >>>>>> wrote:<br>> >>>>>><br>> >>>>>>> Author: dpeixott<br>> >>>>>>> Date: Thu Oct 17 14:49:22 2013<br>> >>>>>>> New Revision: 192915<br>> >>>>>>><br>> >>>>>>> URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project?rev=192915&view=rev" target="_blank" style="color: purple; text-decoration: underline;">http://llvm.org/viewvc/llvm-project?rev=192915&view=rev</a><br>> >>>>>>> Log:<br>> >>>>>>> Refactor lowering for COPY_STRUCT_BYVAL_I32<br>> >>>>>>><br>> >>>>>>> This commit refactors the lowering of the<br>> COPY_STRUCT_BYVAL_I32<br>> >>>>>>> pseudo-instruction in the ARM backend. 
We introduce a new<br>> helper<br>> >>>>>>> class that encapsulates all of the operations needed during the<br>> >>> lowering.<br>> >>>>>>> The operations are implemented for each subtarget in different<br>> >>>>>>> subclasses. Currently only arm and thumb2 subtargets are<br>> supported.<br>> >>>>>>><br>> >>>>>>> This refactoring was done to easily implement support for thumb1<br>> >>>>>>> subtargets. This initial patch does not add support for thumb1,<br>> >>>>>>> but is only a refactoring. A follow on patch will implement the<br>> >>>>>>> support for<br>> >>>>>>> thumb1 subtargets.<br>> >>>>>>><br>> >>>>>>> No intended functionality change.<br>> >>>>>>><br>> >>>>>>> Modified:<br>> >>>>>>> llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp<br>> >>>>>>><br>> >>>>>>> Modified: llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp<br>> >>>>>>> URL:<br>> >>>>>>><span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-" target="_blank" style="color: purple; text-decoration: underline;">http://llvm.org/viewvc/llvm-</a><br>> >>>> project/llvm/trunk/lib/Target/ARM/ARMISe<br>> >>>>>>> lL owering.cpp?rev=192915&r1=192914&r2=192915&view=diff<br>> >>>>>>><br>> >>>>>><br>> >>>><br>> >><br>> ==========================================================<br>> >>>>>> ============<br>> >>>>>>> ========<br>> >>>>>>> --- llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp (original)<br>> >>>>>>> +++ llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp Thu Oct 17<br>> >>>>>>> +++ 14:49:22<br>> >>>>>>> +++ 2013<br>> >>>>>>> @@ -48,6 +48,7 @@<br>> >>>>>>> #include "llvm/Support/MathExtras.h"<br>> >>>>>>> #include "llvm/Support/raw_ostream.h"<br>> >>>>>>> #include "llvm/Target/TargetOptions.h"<br>> >>>>>>> +#include <utility><br>> >>>>>>> using namespace llvm;<br>> >>>>>>><br>> >>>>>>> STATISTIC(NumTailCalls, "Number of tail calls"); @@ -7245,8<br>> >>>>>>> +7246,430 @@ MachineBasicBlock *OtherSucc(MachineBasi<br>> >>>>>>> llvm_unreachable("Expecting a BB with two successors!"); }<br>> 
>>>>>>><br>> >>>>>>> -MachineBasicBlock *ARMTargetLowering::<br>> >>>>>>> -EmitStructByval(MachineInstr *MI, MachineBasicBlock *BB) const<br>> >>>>>>> {<br>> >>>>>>> +namespace {<br>> >>>>>>> +// This class is a helper for lowering the<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> +COPY_STRUCT_BYVAL_I32<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>> instruction.<br>> >>>>>>> +// It defines the operations needed to lower the byval copy. We<br>> >>>>>>> +use a helper // class because the opcodes and machine<br>> >>>>>>> +instructions are different for each // subtarget, but the<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> +overall algorithm for the lowering is the same. The //<br>> >>>>>>> +implementation of each operation will be defined separately for<br>> >>>>>>> +arm, thumb1, // and<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> +thumb2 targets by subclassing this base class. See //<br>> >>>>> ARMTargetLowering::EmitStructByval()<br>> >>>>>> for how these operations are used.<br>> >>>>>>> +class TargetStructByvalEmitter {<br>> >>>>>>> +public:<br>> >>>>>>> + TargetStructByvalEmitter(const TargetInstrInfo *TII_,<br>> >>>>>>> + MachineRegisterInfo &MRI_,<br>> >>>>>>> + const TargetRegisterClass *TRC_)<br>> >>>>>>> + : TII(TII_), MRI(MRI_), TRC(TRC_) {}<br>> >>>>>>> +<br>> >>>>>>> + // Emit a post-increment load of "unit" size. 
The unit size<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + is based on the // alignment of the struct being copied (4,<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + 2, or<br>> >>>>>>> + 1 bytes). Alignments higher // than 4 are handled separately<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + by using<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>> NEON instructions.<br>> >>>>>>> + //<br>> >>>>>>> + // \param baseReg the register holding the address to load.<br>> >>>>>>> + // \param baseOut the register to recieve the incremented<br>> >> address.<br>> >>>>>>> + // \returns the register holding the loaded value.<br>> >>>>>>> + virtual unsigned emitUnitLoad(MachineBasicBlock *BB,<br>> >>>>>>> + MachineInstr<br>> >>>>>> *MI,<br>> >>>>>>> + DebugLoc &dl, unsigned baseReg,<br>> >>>>>>> + unsigned baseOut) = 0;<br>> >>>>>>> +<br>> >>>>>>> + // Emit a post-increment store of "unit" size. The unit size<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + is based on the // alignment of the struct being copied (4,<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + 2, or<br>> >>>>>>> + 1 bytes). 
Alignments higher // than 4 are handled separately<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + by using<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>> NEON instructions.<br>> >>>>>>> + //<br>> >>>>>>> + // \param baseReg the register holding the address to store.<br>> >>>>>>> + // \param storeReg the register holding the value to store.<br>> >>>>>>> + // \param baseOut the register to recieve the incremented<br>> >> address.<br>> >>>>>>> + virtual void emitUnitStore(MachineBasicBlock *BB,<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + MachineInstr<br>> >> *MI,<br>> >>>>>>> + DebugLoc &dl, unsigned baseReg,<br>> >>>>>>> + unsigned<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>> storeReg,<br>> >>>>>>> + unsigned baseOut) = 0;<br>> >>>>>>> +<br>> >>>>>>> + // Emit a post-increment load of one byte.<br>> >>>>>>> + //<br>> >>>>>>> + // \param baseReg the register holding the address to load.<br>> >>>>>>> + // \param baseOut the register to recieve the incremented<br>> >> address.<br>> >>>>>>> + // \returns the register holding the loaded value.<br>> >>>>>>> + virtual unsigned emitByteLoad(MachineBasicBlock *BB,<br>> >>>>>>> + MachineInstr<br>> >>>>>> *MI,<br>> >>>>>>> + DebugLoc &dl, unsigned baseReg,<br>> >>>>>>> + unsigned baseOut) = 0;<br>> >>>>>>> +<br>> >>>>>>> + // Emit a post-increment store of one byte.<br>> >>>>>>> + //<br>> >>>>>>> + // \param baseReg the register holding the address to store.<br>> >>>>>>> + // \param storeReg the register holding the value to store.<br>> >>>>>>> + // \param baseOut the register to recieve the incremented<br>> >> address.<br>> >>>>>>> + virtual void emitByteStore(MachineBasicBlock 
*BB,<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + MachineInstr<br>> >> *MI,<br>> >>>>>>> + DebugLoc &dl, unsigned baseReg,<br>> >>>>>>> + unsigned<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>> storeReg,<br>> >>>>>>> + unsigned baseOut) = 0;<br>> >>>>>>> +<br>> >>>>>>> + // Emit a load of a constant value.<br>> >>>>>>> + //<br>> >>>>>>> + // \param Constant the register holding the address to store.<br>> >>>>>>> + // \returns the register holding the loaded value.<br>> >>>>>>> + virtual unsigned emitConstantLoad(MachineBasicBlock *BB,<br>> >>>>>> MachineInstr *MI,<br>> >>>>>>> + DebugLoc &dl, unsigned<br>> > Constant,<br>> >>>>>>> + const DataLayout *DL) = 0;<br>> >>>>>>> +<br>> >>>>>>> + // Emit a subtract of a register minus immediate, with the<br>> >>>>>>> + immediate equal to // the "unit" size. The unit size is based<br>> >>>>>>> + on the alignment of the struct // being copied (16, 8, 4, 2,<br>> >>>>>>> + or<br>> >>>>>>> + 1<br>> >>>>> bytes).<br>> >>>>>>> + //<br>> >>>>>>> + // \param InReg the register holding the initial value.<br>> >>>>>>> + // \param OutReg the register to recieve the subtracted value.<br>> >>>>>>> + virtual void emitSubImm(MachineBasicBlock *BB, MachineInstr<br>> >>>>>>> + *MI,<br>> >>>>>> DebugLoc &dl,<br>> >>>>>>> + unsigned InReg, unsigned OutReg) = 0;<br>> >>>>>>> +<br>> >>>>>>> + // Emit a branch based on a condition code of not equal.<br>> >>>>>>> + //<br>> >>>>>>> + // \param TargetBB the destination of the branch.<br>> >>>>>>> + virtual void emitBranchNE(MachineBasicBlock *BB, MachineInstr<br>> >> *MI,<br>> >>>>>>> + DebugLoc &dl, MachineBasicBlock<br>> >>>>>>> + *TargetBB) = 0;<br>> >>>>>>> +<br>> >>>>>>> + // Find the constant pool index for the given constant. 
This<br>> >>>>>>> + method is // implemented in the base class because it is the<br>> >>>>>>> + same for all<br>> >>>>>> subtargets.<br>> >>>>>>> + //<br>> >>>>>>> + // \param LoopSize the constant value for which the index<br>> >>>>>>> + should be<br>> >>>>>> returned.<br>> >>>>>>> + // \returns the constant pool index for the constant.<br>> >>>>>>> + unsigned getConstantPoolIndex(MachineFunction *MF, const<br>> >>>>>> DataLayout *DL,<br>> >>>>>>> + unsigned LoopSize) {<br>> >>>>>>> + MachineConstantPool *ConstantPool = MF->getConstantPool();<br>> >>>>>>> + Type *Int32Ty = Type::getInt32Ty(MF->getFunction()-<br>> >>>>> getContext());<br>> >>>>>>> + const Constant *C = ConstantInt::get(Int32Ty, LoopSize);<br>> >>>>>>> +<br>> >>>>>>> + // MachineConstantPool wants an explicit alignment.<br>> >>>>>>> + unsigned Align = DL->getPrefTypeAlignment(Int32Ty);<br>> >>>>>>> + if (Align == 0)<br>> >>>>>>> + Align = DL->getTypeAllocSize(C->getType());<br>> >>>>>>> + return ConstantPool->getConstantPoolIndex(C, Align); }<br>> >>>>>>> +<br>> >>>>>>> + // Return the register class used by the subtarget.<br>> >>>>>>> + //<br>> >>>>>>> + // \returns the target register class.<br>> >>>>>>> + const TargetRegisterClass *getTRC() const { return TRC; }<br>> >>>>>>> +<br>> >>>>>>> + virtual ~TargetStructByvalEmitter() {};<br>> >>>>>>> +<br>> >>>>>>> +protected:<br>> >>>>>>> + const TargetInstrInfo *TII;<br>> >>>>>>> + MachineRegisterInfo &MRI;<br>> >>>>>>> + const TargetRegisterClass *TRC; };<br>> >>>>>>> +<br>> >>>>>>> +class ARMStructByvalEmitter : public TargetStructByvalEmitter {<br>> >>>>>>> +public:<br>> >>>>>>> + ARMStructByvalEmitter(const TargetInstrInfo *TII,<br>> >>>>>>> +MachineRegisterInfo<br>> >>>>>> &MRI,<br>> >>>>>>> + unsigned LoadStoreSize)<br>> >>>>>>> + : TargetStructByvalEmitter(<br>> >>>>>>> + TII, MRI, (const TargetRegisterClass<br>> >>> *)&ARM::GPRRegClass),<br>> >>>>>>> + UnitSize(LoadStoreSize),<br>> >>>>>>> + UnitLdOpc(LoadStoreSize == 4<br>> >>>>>>> + ? 
ARM::LDR_POST_IMM<br>> >>>>>>> + : LoadStoreSize == 2<br>> >>>>>>> + ? ARM::LDRH_POST<br>> >>>>>>> + : LoadStoreSize == 1 ?<br>> >>>>>>> + ARM::LDRB_POST_IMM<br>> >>> :<br>> >>>>> 0),<br>> >>>>>>> + UnitStOpc(LoadStoreSize == 4<br>> >>>>>>> + ? ARM::STR_POST_IMM<br>> >>>>>>> + : LoadStoreSize == 2<br>> >>>>>>> + ? ARM::STRH_POST<br>> >>>>>>> + : LoadStoreSize == 1 ?<br>> >>>>>>> +ARM::STRB_POST_IMM<br>> >>>>>>> +: 0) {}<br>> >>>>>>> +<br>> >>>>>>> + unsigned emitUnitLoad(MachineBasicBlock *BB, MachineInstr<br>> >>>>>>> + *MI,<br>> >>>>>> DebugLoc &dl,<o:p></o:p></div></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + unsigned baseReg, unsigned baseOut) {<br>> >>>>>>> + unsigned scratch = MRI.createVirtualRegister(TRC);<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(UnitLdOpc),<br>> >>>>>> scratch).addReg(<br>> >>>>>>> + baseOut,<br>> >>>>>> RegState::Define).addReg(baseReg).addReg(0).addImm(UnitSize));<br>> >>>>>>> + return scratch;<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + void emitUnitStore(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>>>> DebugLoc &dl,<br>> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>> >>>>>>> + unsigned<br>> >>>>> baseOut) {<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(UnitStOpc),<br>> >>>>>> baseOut).addReg(<br>> >>>>>>> + storeReg).addReg(baseReg).addReg(0).addImm(UnitSize));<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + unsigned emitByteLoad(MachineBasicBlock *BB, MachineInstr<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + *MI,<br>> >>>>>> DebugLoc &dl,<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + unsigned baseReg, unsigned baseOut) {<br>> >>>>>>> + unsigned scratch = MRI.createVirtualRegister(TRC);<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl,<br>> >>>>>>> + 
TII->get(ARM::LDRB_POST_IMM),<br>> >>>>>> scratch)<br>> >>>>>>> + .addReg(baseOut,<br>> >>>>> RegState::Define).addReg(baseReg)<br>> >>>>>>> + .addReg(0).addImm(1));<br>> >>>>>>> + return scratch;<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + void emitByteStore(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>>>> DebugLoc &dl,<br>> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>> >>>>>>> + unsigned<br>> >>>>> baseOut) {<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl,<br>> >>>>>>> + TII->get(ARM::STRB_POST_IMM),<br>> >>>>>> baseOut)<br>> >>>>>>> +<br>> >>>>>>> + .addReg(storeReg).addReg(baseReg).addReg(0).addImm(1));<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + unsigned emitConstantLoad(MachineBasicBlock *BB,<br>> MachineInstr<br>> >>>> *MI,<br>> >>>>>>> + DebugLoc &dl, unsigned Constant,<br>> >>>>>>> + const DataLayout *DL) {<br>> >>>>>>> + unsigned constReg = MRI.createVirtualRegister(TRC);<br>> >>>>>>> + unsigned Idx = getConstantPoolIndex(BB->getParent(), DL,<br>> >>>> Constant);<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII-<br>> >>> get(ARM::LDRcp)).addReg(<br>> >>>>>>> + constReg,<br>> >>>>> RegState::Define).addConstantPoolIndex(Idx).addImm(0));<br>> >>>>>>> + return constReg;<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + void emitSubImm(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>> DebugLoc<br>> >>>>>> &dl,<br>> >>>>>>> + unsigned InReg, unsigned OutReg) {<br>> >>>>>>> + MachineInstrBuilder MIB =<br>> >>>>>>> + BuildMI(*BB, MI, dl, TII->get(ARM::SUBri), OutReg);<br>> >>>>>>> +<br>> >>>>>><br>> >> AddDefaultCC(AddDefaultPred(MIB.addReg(InReg).addImm(UnitSize)));<br>> >>>>>>> + MIB->getOperand(5).setReg(ARM::CPSR);<br>> >>>>>>> + MIB->getOperand(5).setIsDef(true);<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + void emitBranchNE(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>>>> DebugLoc &dl,<br>> >>>>>>> + MachineBasicBlock *TargetBB) {<br>> >>>>>>> + BuildMI(*BB, MI, dl, TII-<br>> >>>>>>> 
get(ARM::Bcc)).addMBB(TargetBB).addImm(ARMCC::NE)<br>> >>>>>>> + .addReg(ARM::CPSR);<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> +private:<br>> >>>>>>> + const unsigned UnitSize;<br>> >>>>>>> + const unsigned UnitLdOpc;<br>> >>>>>>> + const unsigned UnitStOpc;<br>> >>>>>>> +};<br>> >>>>>>> +<br>> >>>>>>> +class Thumb2StructByvalEmitter : public<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> +TargetStructByvalEmitter {<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> +public:<br>> >>>>>>> + Thumb2StructByvalEmitter(const TargetInstrInfo *TII,<br>> >>>>>> MachineRegisterInfo &MRI,<br>> >>>>>>> + unsigned LoadStoreSize)<br>> >>>>>>> + : TargetStructByvalEmitter(<br>> >>>>>>> + TII, MRI, (const TargetRegisterClass<br>> >>> *)&ARM::tGPRRegClass),<br>> >>>>>>> + UnitSize(LoadStoreSize),<br>> >>>>>>> + UnitLdOpc(LoadStoreSize == 4<br>> >>>>>>> + ? ARM::t2LDR_POST<br>> >>>>>>> + : LoadStoreSize == 2<br>> >>>>>>> + ? ARM::t2LDRH_POST<br>> >>>>>>> + : LoadStoreSize == 1 ?<br>> >>>>>>> + ARM::t2LDRB_POST<br>> > :<br>> >>>>> 0),<br>> >>>>>>> + UnitStOpc(LoadStoreSize == 4<br>> >>>>>>> + ? ARM::t2STR_POST<br>> >>>>>>> + : LoadStoreSize == 2<br>> >>>>>>> + ? 
ARM::t2STRH_POST<br>> >>>>>>> + : LoadStoreSize == 1 ?<br>> >>>>>>> + ARM::t2STRB_POST<br>> > :<br>> >>>>>>> +0) {}<br>> >>>>>>> +<br>> >>>>>>> + unsigned emitUnitLoad(MachineBasicBlock *BB, MachineInstr<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + *MI,<br>> >>>>>> DebugLoc &dl,<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + unsigned baseReg, unsigned baseOut) {<br>> >>>>>>> + unsigned scratch = MRI.createVirtualRegister(TRC);<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(UnitLdOpc),<br>> >>>>>> scratch).addReg(<br>> >>>>>>> + baseOut,<br>> >> RegState::Define).addReg(baseReg).addImm(UnitSize));<br>> >>>>>>> + return scratch;<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + void emitUnitStore(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>>>> DebugLoc &dl,<br>> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>> >>>>>>> + unsigned<br>> >>>>> baseOut) {<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(UnitStOpc),<br>> >>>>>>> + baseOut)<br>> >>>>>>> +<br>> >>>>>>> + .addReg(storeReg).addReg(baseReg).addImm(UnitSize));<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + unsigned emitByteLoad(MachineBasicBlock *BB, MachineInstr<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + *MI,<br>> >>>>>> DebugLoc &dl,<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + unsigned baseReg, unsigned baseOut) {<br>> >>>>>>> + unsigned scratch = MRI.createVirtualRegister(TRC);<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl,<br>> >>>>>>> + TII->get(ARM::t2LDRB_POST),<br>> >>>>>> scratch)<br>> >>>>>>> + .addReg(baseOut,<br>> >>>>> RegState::Define).addReg(baseReg)<br>> >>>>>>> + .addImm(1));<br>> >>>>>>> + return scratch;<br>> >>>>>>> + }<br>> 
>>>>>>> +<br>> >>>>>>> + void emitByteStore(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>>>> DebugLoc &dl,<br>> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>> >>>>>>> + unsigned<br>> >>>>> baseOut) {<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl,<br>> >>>>>>> + TII->get(ARM::t2STRB_POST),<br>> >>>>>> baseOut)<br>> >>>>>>> +<br>> >>>>>>> + .addReg(storeReg).addReg(baseReg).addImm(1));<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + unsigned emitConstantLoad(MachineBasicBlock *BB,<br>> MachineInstr<br>> >>>> *MI,<br>> >>>>>>> + DebugLoc &dl, unsigned Constant,<br>> >>>>>>> + const DataLayout *DL) {<br>> >>>>>>> + unsigned VConst = MRI.createVirtualRegister(TRC);<br>> >>>>>>> + unsigned Vtmp = VConst;<br>> >>>>>>> + if ((Constant & 0xFFFF0000) != 0)<br>> >>>>>>> + Vtmp = MRI.createVirtualRegister(TRC);<br>> >>>>>>> + AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::t2MOVi16),<br>> Vtmp)<br>> >>>>>>> + .addImm(Constant & 0xFFFF));<br>> >>>>>>> +<br>> >>>>>>> + if ((Constant & 0xFFFF0000) != 0)<br>> >>>>>>> + AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::t2MOVTi16),<br>> >>> VConst)<br>> >>>>>>> + .addReg(Vtmp).addImm(Constant >> 16));<br>> >>>>>>> + return VConst;<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + void emitSubImm(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>> DebugLoc<br>> >>>>>> &dl,<br>> >>>>>>> + unsigned InReg, unsigned OutReg) {<br>> >>>>>>> + MachineInstrBuilder MIB =<br>> >>>>>>> + BuildMI(*BB, MI, dl, TII->get(ARM::t2SUBri), OutReg);<br>> >>>>>>> +<br>> >>>>>><br>> >> AddDefaultCC(AddDefaultPred(MIB.addReg(InReg).addImm(UnitSize)));<br>> >>>>>>> + MIB->getOperand(5).setReg(ARM::CPSR);<br>> >>>>>>> + MIB->getOperand(5).setIsDef(true);<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + void emitBranchNE(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>>>> DebugLoc &dl,<br>> >>>>>>> + MachineBasicBlock *TargetBB) {<br>> >>>>>>> + BuildMI(BB, dl, TII-<br>> >>>>>>> get(ARM::t2Bcc)).addMBB(TargetBB).addImm(ARMCC::NE)<br>> >>>>>>> + 
.addReg(ARM::CPSR);<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> +private:<br>> >>>>>>> + const unsigned UnitSize;<br>> >>>>>>> + const unsigned UnitLdOpc;<br>> >>>>>>> + const unsigned UnitStOpc;<br>> >>>>>>> +};<br>> >>>>>>> +<br>> >>>>>>> +// This class is a thin wrapper that delegates most of the work<br>> >>>>>>> +to the correct // TargetStructByvalEmitter implementation. It<br>> >>>>>>> +also handles the lowering for // targets that support neon<br>> >>>>>>> +because the neon implementation is the same for all // targets<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> +that<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >> support it.<br>> >>>>>>> +class StructByvalEmitter {<br>> >>>>>>> +public:<br>> >>>>>>> + StructByvalEmitter(unsigned LoadStoreSize, const ARMSubtarget<br>> >>>>>> *Subtarget,<br>> >>>>>>> + const TargetInstrInfo *TII_,<br>> >>>>>>> + MachineRegisterInfo<br>> >>>>> &MRI_,<br>> >>>>>>> + const DataLayout *DL_)<br>> >>>>>>> + : UnitSize(LoadStoreSize),<br>> >>>>>>> + TargetEmitter(<br>> >>>>>>> + Subtarget->isThumb2()<br>> >>>>>>> + ? static_cast<TargetStructByvalEmitter *>(<br>> >>>>>>> + new Thumb2StructByvalEmitter(TII_, MRI_,<br>> >>>>>>> + LoadStoreSize))<br>> >>>>>>> + : static_cast<TargetStructByvalEmitter *>(<br>> >>>>>>> + new ARMStructByvalEmitter(TII_, MRI_,<br>> >>>>>>> + LoadStoreSize))),<br>> >>>>>>> + TII(TII_), MRI(MRI_), DL(DL_),<br>> >>>>>>> + VecTRC(UnitSize == 16<br>> >>>>>>> + ? (const TargetRegisterClass<br>> > *)&ARM::DPairRegClass<br>> >>>>>>> + : UnitSize == 8<br>> >>>>>>> + ? (const TargetRegisterClass<br>> >>>>> *)&ARM::DPRRegClass<br>> >>>>>>> + : 0),<br>> >>>>>>> + VecLdOpc(UnitSize == 16 ? ARM::VLD1q32wb_fixed<br>> >>>>>>> + : UnitSize == 8 ?<br>> >>>>>>> + ARM::VLD1d32wb_fixed<br>> >>>>> : 0),<br>> >>>>>>> + VecStOpc(UnitSize == 16 ? 
ARM::VST1q32wb_fixed<br>> >>>>>>> + : UnitSize == 8 ?<br>> >>>>>>> +ARM::VST1d32wb_fixed : 0) {}<br>> >>>>>>> +<br>> >>>>>>> + // Emit a post-increment load of "unit" size. The unit size<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + is based on the // alignment of the struct being copied (16,<br>> >>>>>>> + 8, 4, 2, or 1 bytes). Loads of 16 // or 8 bytes use NEON<br>> >>>>>>> + instructions to load<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>> the<br>> >>>>>> value.<br>> >>>>>>> + //<br>> >>>>>>> + // \param baseReg the register holding the address to load.<br>> >>>>>>> + // \param baseOut the register to recieve the incremented<br>> >> address.<br>> >>>>>>> + If baseOut // is 0 then a new register is created to hold the<br>> >>>>> incremented<br>> >>>>>> address.<br>> >>>>>>> + // \returns a pair of registers holding the loaded value and<br>> >>>>>>> + the updated // address.<br>> >>>>>>> + std::pair<unsigned, unsigned> emitUnitLoad(MachineBasicBlock<br>> >> *BB,<br>> >>>>>>> + MachineInstr *MI,<br>> >>>>>>> + DebugLoc<br>> >>>>> &dl,<br>> >>>>>>> + unsigned baseReg,<br>> >>>>>>> + unsigned baseOut =<br>> >>>>>>> + 0)<br>> > {<br>> >>>>>>> + unsigned scratch = 0;<br>> >>>>>>> + if (baseOut == 0)<br>> >>>>>>> + baseOut = MRI.createVirtualRegister(TargetEmitter-<br>> >getTRC());<br>> >>>>>>> + if (UnitSize >= 8) { // neon<br>> >>>>>>> + scratch = MRI.createVirtualRegister(VecTRC);<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(VecLdOpc),<br>> >>>>>> scratch).addReg(<br>> >>>>>>> + baseOut, RegState::Define).addReg(baseReg).addImm(0));<br>> >>>>>>> + } else {<br>> >>>>>>> + scratch = TargetEmitter->emitUnitLoad(BB, MI, dl,<br>> >>>>>>> + baseReg,<br>> >>>>> baseOut);<br>> >>>>>>> + }<br>> >>>>>>> + return std::make_pair(scratch, baseOut); }<br>> >>>>>>> +<br>> >>>>>>> + // Emit a post-increment 
store of "unit" size. The unit size<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + is based on the // alignment of the struct being copied (16,<br>> >>>>>>> + 8, 4, 2, or 1 bytes). Stores of // 16 or 8 bytes use NEON<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + instructions to<br>> >>>>> store the<br>> >>>>>> value.<br>> >>>>>>> + //<br>> >>>>>>> + // \param baseReg the register holding the address to store.<br>> >>>>>>> + // \param storeReg the register holding the value to store.<br>> >>>>>>> + // \param baseOut the register to recieve the incremented<br>> >> address.<br>> >>>>>>> + If baseOut // is 0 then a new register is created to hold the<br>> >>>>> incremented<br>> >>>>>> address.<br>> >>>>>>> + // \returns the register holding the updated address.<br>> >>>>>>> + unsigned emitUnitStore(MachineBasicBlock *BB, MachineInstr<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + *MI,<br>> >>>>>> DebugLoc &dl,<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + unsigned baseReg, unsigned storeReg,<br>> >>>>>>> + unsigned baseOut = 0) {<br>> >>>>>>> + if (baseOut == 0)<br>> >>>>>>> + baseOut = MRI.createVirtualRegister(TargetEmitter-<br>> >getTRC());<br>> >>>>>>> + if (UnitSize >= 8) { // neon<br>> >>>>>>> + AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(VecStOpc),<br>> >>> baseOut)<br>> >>>>>>> +<br>> >>> .addReg(baseReg).addImm(0).addReg(storeReg));<br>> >>>>>>> + } else {<br>> >>>>>>> + TargetEmitter->emitUnitStore(BB, MI, dl, baseReg,<br>> >>>>>>> + storeReg,<br>> >>>>>> baseOut);<br>> >>>>>>> + }<br>> >>>>>>> + return baseOut;<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + // Emit a post-increment load of one byte.<br>> >>>>>>> + //<br>> >>>>>>> + // 
\param baseReg the register holding the address to load.<br>> >>>>>>> + // \returns a pair of registers holding the loaded value and<br>> >>>>>>> + the updated // address.<br>> >>>>>>> + std::pair<unsigned, unsigned> emitByteLoad(MachineBasicBlock<br>> >> *BB,<br>> >>>>>>> + MachineInstr *MI,<br>> >>>>>>> + DebugLoc<br>> >>>>> &dl,<br>> >>>>>>> + unsigned baseReg) {<br>> >>>>>>> + unsigned baseOut = MRI.createVirtualRegister(TargetEmitter-<br>> >>>>>>> getTRC());<br>> >>>>>>> + unsigned scratch =<br>> >>>>>>> + TargetEmitter->emitByteLoad(BB, MI, dl, baseReg,<br>baseOut);<br>> >>>>>>> + return std::make_pair(scratch, baseOut); }<br>> >>>>>>> +<br>> >>>>>>> + // Emit a post-increment store of one byte.<br>> >>>>>>> + //<br>> >>>>>>> + // \param baseReg the register holding the address to store.<br>> >>>>>>> + // \param storeReg the register holding the value to store.<br>> >>>>>>> + // \returns the register holding the updated address.<br>> >>>>>>> + unsigned emitByteStore(MachineBasicBlock *BB, MachineInstr<br>> >> *MI,<br>> >>>>>> DebugLoc &dl,<br>> >>>>>>> + unsigned baseReg, unsigned storeReg) {<br>> >>>>>>> + unsigned baseOut = MRI.createVirtualRegister(TargetEmitter-<br>> >>>>>>> getTRC());<br>> >>>>>>> + TargetEmitter->emitByteStore(BB, MI, dl, baseReg, storeReg,<br>> >>>>> baseOut);<br>> >>>>>>> + return baseOut;<br>> >>>>>>> + }<br>> >>>>>>> +<br>> >>>>>>> + // Emit a load of the constant LoopSize.<br>> >>>>>>> + //<br>> >>>>>>> + // \param LoopSize the constant to load.<br>> >>>>>>> + // \returns the register holding the loaded constant.<br>> >>>>>>> + unsigned emitConstantLoad(MachineBasicBlock *BB,<br>> MachineInstr<br>> >>>> *MI,<br>> >>>>>>> + DebugLoc &dl, unsigned LoopSize) {<br>> >>>>>>> + return TargetEmitter->emitConstantLoad(BB, MI, dl,<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> + LoopSize, DL); }<o:p></o:p></div><div><div style="margin: 0in 0in 0.0001pt; 
font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> +<br>> >>>>>>> + // Emit a subtract of a register minus immediate, with the<br>> >>>>>>> + immediate equal to // the "unit" size. The unit size is based<br>> >>>>>>> + on the alignment of the struct // being copied (16, 8, 4, 2,<br>> >>>>>>> + or<br>> >>>>>>> + 1<br>> >>>>> bytes).<br>> >>>>>>> + //<br>> >>>>>>> + // \param InReg the register holding the initial value.<br>> >>>>>>> + // \param OutReg the register to recieve the subtracted value.<br>> >>>>>>> + void emitSubImm(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>> DebugLoc<br>> >>>>>> &dl,<br>> >>>>>>> + unsigned InReg, unsigned OutReg) {<br>> >>>>>>> + TargetEmitter->emitSubImm(BB, MI, dl, InReg, OutReg); }<br>> >>>>>>> +<br>> >>>>>>> + // Emit a branch based on a condition code of not equal.<br>> >>>>>>> + //<br>> >>>>>>> + // \param TargetBB the destination of the branch.<br>> >>>>>>> + void emitBranchNE(MachineBasicBlock *BB, MachineInstr *MI,<br>> >>>>>> DebugLoc &dl,<br>> >>>>>>> + MachineBasicBlock *TargetBB) {<br>> >>>>>>> + TargetEmitter->emitBranchNE(BB, MI, dl, TargetBB); }<br>> >>>>>>> +<br>> >>>>>>> + // Return the register class used by the subtarget.<br>> >>>>>>> + //<br>> >>>>>>> + // \returns the target register class.<br>> >>>>>>> + const TargetRegisterClass *getTRC() const { return<br>> >>>>>>> + TargetEmitter->getTRC(); }<br>> >>>>>>> +<br>> >>>>>>> +private:<br>> >>>>>>> + const unsigned UnitSize;<br>> >>>>>>> + OwningPtr<TargetStructByvalEmitter> TargetEmitter;<br>> >>>>>>> + const TargetInstrInfo *TII;<br>> >>>>>>> + MachineRegisterInfo &MRI;<br>> >>>>>>> + const DataLayout *DL;<br>> >>>>>>> +<br>> >>>>>>> + const TargetRegisterClass *VecTRC;<br>> >>>>>>> + const unsigned VecLdOpc;<br>> >>>>>>> + const unsigned VecStOpc;<br>> >>>>>>> +};<br>> >>>>>>> +}<br>> >>>>>>> +<br>> >>>>>>> +MachineBasicBlock *<br>> >>>>>>> +ARMTargetLowering::EmitStructByval(MachineInstr *MI,<br>> >>>>>>> + MachineBasicBlock *BB) 
const<o:p></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> +{<o:p></o:p></div><div><p class="MsoNormal" style="margin: 0in 0in 12pt; font-size: 12pt; font-family: 'Times New Roman', serif;">> >>>>>>> // This pseudo instruction has 3 operands: dst, src, size // We<br>> >>>>>>> expand it to a loop if size > Subtarget-<br>> >>>>>>> getMaxInlineSizeThreshold().<br>> >>>>>>> // Otherwise, we will generate unrolled scalar copies.<br>> >>>>>>> @@ -7261,23 +7684,13 @@ EmitStructByval(MachineInstr *MI,<br>> >> Machin<br>> >>>>>>> unsigned Align = MI->getOperand(3).getImm(); DebugLoc dl =<br>> >>>>>>> MI->getDebugLoc();<br>> >>>>>>><br>> >>>>>>> - bool isThumb2 = Subtarget->isThumb2(); MachineFunction *MF<br>> =<br>> >>>>>>> BB->getParent(); MachineRegisterInfo &MRI = MF->getRegInfo();<br>> >>>>>>> - unsigned ldrOpc, strOpc, UnitSize = 0;<br>> >>>>>>> -<br>> >>>>>>> - const TargetRegisterClass *TRC = isThumb2 ?<br>> >>>>>>> - (const TargetRegisterClass*)&ARM::tGPRRegClass :<br>> >>>>>>> - (const TargetRegisterClass*)&ARM::GPRRegClass;<br>> >>>>>>> - const TargetRegisterClass *TRC_Vec = 0;<br>> >>>>>>> + unsigned UnitSize = 0;<br>> >>>>>>><br>> >>>>>>> if (Align & 1) {<br>> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDRB_POST :<br>> ARM::LDRB_POST_IMM;<br>> >>>>>>> - strOpc = isThumb2 ? ARM::t2STRB_POST :<br>> ARM::STRB_POST_IMM;<br>> >>>>>>> UnitSize = 1;<br>> >>>>>>> } else if (Align & 2) {<br>> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDRH_POST : ARM::LDRH_POST;<br>> >>>>>>> - strOpc = isThumb2 ? 
ARM::t2STRH_POST : ARM::STRH_POST;<br>> >>>>>>> UnitSize = 2;<br>> >>>>>>> } else {<br>> >>>>>>> // Check whether we can use NEON instructions.<br>> >>>>>>> @@ -7285,27 +7698,18 @@ EmitStructByval(MachineInstr *MI,<br>> >> Machin<br>> >>>>>>> hasAttribute(AttributeSet::FunctionIndex,<br>> >>>>>>> Attribute::NoImplicitFloat) &&<br>> >>>>>>> Subtarget->hasNEON()) {<br>> >>>>>>> - if ((Align % 16 == 0) && SizeVal >= 16) {<br>> >>>>>>> - ldrOpc = ARM::VLD1q32wb_fixed;<br>> >>>>>>> - strOpc = ARM::VST1q32wb_fixed;<br>> >>>>>>> + if ((Align % 16 == 0) && SizeVal >= 16)<br>> >>>>>>> UnitSize = 16;<br>> >>>>>>> - TRC_Vec = (const<br>TargetRegisterClass*)&ARM::DPairRegClass;<br>> >>>>>>> - }<br>> >>>>>>> - else if ((Align % 8 == 0) && SizeVal >= 8) {<br>> >>>>>>> - ldrOpc = ARM::VLD1d32wb_fixed;<br>> >>>>>>> - strOpc = ARM::VST1d32wb_fixed;<br>> >>>>>>> + else if ((Align % 8 == 0) && SizeVal >= 8)<br>> >>>>>>> UnitSize = 8;<br>> >>>>>>> - TRC_Vec = (const TargetRegisterClass*)&ARM::DPRRegClass;<br>> >>>>>>> - }<br>> >>>>>>> }<br>> >>>>>>> // Can't use NEON instructions.<br>> >>>>>>> - if (UnitSize == 0) {<br>> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDR_POST :<br>> ARM::LDR_POST_IMM;<br>> >>>>>>> - strOpc = isThumb2 ? 
ARM::t2STR_POST :<br>> ARM::STR_POST_IMM;<br>> >>>>>>> + if (UnitSize == 0)<br>> >>>>>>> UnitSize = 4;<br>> >>>>>>> - }<br>> >>>>>>> }<br>> >>>>>>><br>> >>>>>>> + StructByvalEmitter ByvalEmitter(UnitSize, Subtarget, TII, MRI,<br>> >>>>>>> + getDataLayout());<br>> >>>>>>> unsigned BytesLeft = SizeVal % UnitSize; unsigned LoopSize =<br>> >>>>>>> SizeVal - BytesLeft;<br>> >>>>>>><br>> >>>>>>> @@ -7316,67 +7720,22 @@ EmitStructByval(MachineInstr *MI,<br>> >> Machin<br>> >>>>>>> unsigned srcIn = src;<br>> >>>>>>> unsigned destIn = dest;<br>> >>>>>>> for (unsigned i = 0; i < LoopSize; i+=UnitSize) {<br>> >>>>>>> - unsigned scratch = MRI.createVirtualRegister(UnitSize >= 8<br>?<br>> >>>>>> TRC_Vec:TRC);<br>> >>>>>>> - unsigned srcOut = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned destOut = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - if (UnitSize >= 8) {<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>> >>>>>>> - TII->get(ldrOpc), scratch)<br>> >>>>>>> - .addReg(srcOut,<br>> > RegState::Define).addReg(srcIn).addImm(0));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc),<br>> >>> destOut)<br>> >>>>>>> - .addReg(destIn).addImm(0).addReg(scratch));<br>> >>>>>>> - } else if (isThumb2) {<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>> >>>>>>> - TII->get(ldrOpc), scratch)<br>> >>>>>>> - .addReg(srcOut,<br>> >>>>> RegState::Define).addReg(srcIn).addImm(UnitSize));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc),<br>> >>> destOut)<br>> >>>>>>> - .addReg(scratch).addReg(destIn)<br>> >>>>>>> - .addImm(UnitSize));<br>> >>>>>>> - } else {<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>> >>>>>>> - TII->get(ldrOpc), scratch)<br>> >>>>>>> - .addReg(srcOut,<br>RegState::Define).addReg(srcIn).addReg(0)<br>> >>>>>>> - .addImm(UnitSize));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc),<br>> >>> destOut)<br>> >>>>>>> - .addReg(scratch).addReg(destIn)<br>> 
>>>>>>> - .addReg(0).addImm(UnitSize));<br>> >>>>>>> - }<br>> >>>>>>> - srcIn = srcOut;<br>> >>>>>>> - destIn = destOut;<br>> >>>>>>> + std::pair<unsigned, unsigned> res =<br>> >>>>>>> + ByvalEmitter.emitUnitLoad(BB, MI, dl, srcIn);<br>> >>>>>>> + unsigned scratch = res.first;<br>> >>>>>>> + srcIn = res.second;<br>> >>>>>>> + destIn = ByvalEmitter.emitUnitStore(BB, MI, dl, destIn,<br>> >>>>>>> + scratch);<br>> >>>>>>> }<br>> >>>>>>><br>> >>>>>>> // Handle the leftover bytes with LDRB and STRB.<br>> >>>>>>> // [scratch, srcOut] = LDRB_POST(srcIn, 1) // [destOut] =<br>> >>>>>>> STRB_POST(scratch, destIn, 1)<br>> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDRB_POST :<br>> ARM::LDRB_POST_IMM;<br>> >>>>>>> - strOpc = isThumb2 ? ARM::t2STRB_POST :<br>> ARM::STRB_POST_IMM;<br>> >>>>>>> for (unsigned i = 0; i < BytesLeft; i++) {<br>> >>>>>>> - unsigned scratch = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned srcOut = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned destOut = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - if (isThumb2) {<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>> >>>>>>> - TII->get(ldrOpc),scratch)<br>> >>>>>>> - .addReg(srcOut,<br>> > RegState::Define).addReg(srcIn).addImm(1));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc),<br>> >>> destOut)<br>> >>>>>>> - .addReg(scratch).addReg(destIn)<br>> >>>>>>> - .addImm(1));<br>> >>>>>>> - } else {<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl,<br>> >>>>>>> - TII->get(ldrOpc),scratch)<br>> >>>>>>> - .addReg(srcOut, RegState::Define).addReg(srcIn)<br>> >>>>>>> - .addReg(0).addImm(1));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, MI, dl, TII->get(strOpc),<br>> >>> destOut)<br>> >>>>>>> - .addReg(scratch).addReg(destIn)<br>> >>>>>>> - .addReg(0).addImm(1));<br>> >>>>>>> - }<br>> >>>>>>> - srcIn = srcOut;<br>> >>>>>>> - destIn = destOut;<br>> >>>>>>> + std::pair<unsigned, unsigned> res =<br>> >>>>>>> + ByvalEmitter.emitByteLoad(BB, MI, dl, 
srcIn);<br>> >>>>>>> + unsigned scratch = res.first;<br>> >>>>>>> + srcIn = res.second;<br>> >>>>>>> + destIn = ByvalEmitter.emitByteStore(BB, MI, dl, destIn,<br>> >>>>>>> + scratch);<br>> >>>>>>> }<br>> >>>>>>> MI->eraseFromParent(); // The instruction is gone now.<br>> >>>>>>> return BB;<br>> >>>>>>> @@ -7414,34 +7773,7 @@ EmitStructByval(MachineInstr *MI,<br>> Machin<br>> >>>>>>> exitMBB->transferSuccessorsAndUpdatePHIs(BB);<br>> >>>>>>><br>> >>>>>>> // Load an immediate to varEnd.<br>> >>>>>>> - unsigned varEnd = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - if (isThumb2) {<br>> >>>>>>> - unsigned VReg1 = varEnd;<br>> >>>>>>> - if ((LoopSize & 0xFFFF0000) != 0)<br>> >>>>>>> - VReg1 = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::t2MOVi16),<br>> VReg1)<br>> >>>>>>> - .addImm(LoopSize & 0xFFFF));<br>> >>>>>>> -<br>> >>>>>>> - if ((LoopSize & 0xFFFF0000) != 0)<br>> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::t2MOVTi16),<br>> >>> varEnd)<br>> >>>>>>> - .addReg(VReg1)<br>> >>>>>>> - .addImm(LoopSize >> 16));<br>> >>>>>>> - } else {<br>> >>>>>>> - MachineConstantPool *ConstantPool = MF->getConstantPool();<br>> >>>>>>> - Type *Int32Ty =<br>> >>> Type::getInt32Ty(MF->getFunction()->getContext());<br>> >>>>>>> - const Constant *C = ConstantInt::get(Int32Ty, LoopSize);<br>> >>>>>>> -<br>> >>>>>>> - // MachineConstantPool wants an explicit alignment.<br>> >>>>>>> - unsigned Align = getDataLayout()-<br>> >>> getPrefTypeAlignment(Int32Ty);<br>> >>>>>>> - if (Align == 0)<br>> >>>>>>> - Align = getDataLayout()->getTypeAllocSize(C->getType());<br>> >>>>>>> - unsigned Idx = ConstantPool->getConstantPoolIndex(C, Align);<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::LDRcp))<br>> >>>>>>> - .addReg(varEnd, RegState::Define)<br>> >>>>>>> - .addConstantPoolIndex(Idx)<br>> >>>>>>> - .addImm(0));<br>> >>>>>>> - }<br>> >>>>>>> + unsigned varEnd = ByvalEmitter.emitConstantLoad(BB, MI, dl,<br>> >>>>>>> 
+ LoopSize);<br>> >>>>>>> BB->addSuccessor(loopMBB);<br>> >>>>>>><br>> >>>>>>> // Generate the loop body:<br>> >>>>>>> @@ -7450,12 +7782,12 @@ EmitStructByval(MachineInstr *MI,<br>> >> Machin<br>> >>>>>>> // destPhi = PHI(destLoop, dst)<br>> >>>>>>> MachineBasicBlock *entryBB = BB; BB = loopMBB;<br>> >>>>>>> - unsigned varLoop = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned varPhi = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned srcLoop = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned srcPhi = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned destLoop = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned destPhi = MRI.createVirtualRegister(TRC);<br>> >>>>>>> + unsigned varLoop =<br>> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>> >>>>>>> + unsigned varPhi =<br>> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>> >>>>>>> + unsigned srcLoop =<br>> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>> >>>>>>> + unsigned srcPhi =<br>> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>> >>>>>>> + unsigned destLoop =<br>> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>> >>>>>>> + unsigned destPhi =<br>> >>>>>>> + MRI.createVirtualRegister(ByvalEmitter.getTRC());<br>> >>>>>>><br>> >>>>>>> BuildMI(*BB, BB->begin(), dl, TII->get(ARM::PHI), varPhi)<br>> >>>>>>> .addReg(varLoop).addMBB(loopMBB) @@ -7469,39 +7801,16 @@<br>> >>>>>>> EmitStructByval(MachineInstr *MI, Machin<br>> >>>>>>><br>> >>>>>>> // [scratch, srcLoop] = LDR_POST(srcPhi, UnitSize)<br>> >>>>>>> // [destLoop] = STR_POST(scratch, destPhi, UnitSiz)<br>> >>>>>>> - unsigned scratch = MRI.createVirtualRegister(UnitSize >= 8 ?<br>> >>>>>>> TRC_Vec:TRC);<br>> >>>>>>> - if (UnitSize >= 8) {<br>> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ldrOpc), scratch)<br>> >>>>>>> - .addReg(srcLoop,<br>> RegState::Define).addReg(srcPhi).addImm(0));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, 
TII->get(strOpc), destLoop)<br>> >>>>>>> - .addReg(destPhi).addImm(0).addReg(scratch));<br>> >>>>>>> - } else if (isThumb2) {<br>> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ldrOpc), scratch)<br>> >>>>>>> - .addReg(srcLoop,<br>> >>>>>> RegState::Define).addReg(srcPhi).addImm(UnitSize));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(strOpc), destLoop)<br>> >>>>>>> - .addReg(scratch).addReg(destPhi)<br>> >>>>>>> - .addImm(UnitSize));<br>> >>>>>>> - } else {<br>> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(ldrOpc), scratch)<br>> >>>>>>> - .addReg(srcLoop, RegState::Define).addReg(srcPhi).addReg(0)<br>> >>>>>>> - .addImm(UnitSize));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(BB, dl, TII->get(strOpc), destLoop)<br>> >>>>>>> - .addReg(scratch).addReg(destPhi)<br>> >>>>>>> - .addReg(0).addImm(UnitSize));<br>> >>>>>>> + {<br>> >>>>>>> + std::pair<unsigned, unsigned> res =<br>> >>>>>>> + ByvalEmitter.emitUnitLoad(BB, BB->end(), dl, srcPhi,<br>> >>> srcLoop);<br>> >>>>>>> + unsigned scratch = res.first;<br>> >>>>>>> + ByvalEmitter.emitUnitStore(BB, BB->end(), dl, destPhi,<br>> >>>>>>> + scratch, destLoop);<br>> >>>>>>> }<br>> >>>>>>><br>> >>>>>>> // Decrement loop variable by UnitSize.<br>> >>>>>>> - MachineInstrBuilder MIB = BuildMI(BB, dl,<br>> >>>>>>> - TII->get(isThumb2 ? ARM::t2SUBri : ARM::SUBri), varLoop);<br>> >>>>>>> -<br>> >>>>>>><br>> >>>><br>> AddDefaultCC(AddDefaultPred(MIB.addReg(varPhi).addImm(UnitSize)));<br>> >>>>>>> - MIB->getOperand(5).setReg(ARM::CPSR);<br>> >>>>>>> - MIB->getOperand(5).setIsDef(true);<br>> >>>>>>> -<br>> >>>>>>> - BuildMI(BB, dl, TII->get(isThumb2 ? 
ARM::t2Bcc : ARM::Bcc))<br>> >>>>>>> - .addMBB(loopMBB).addImm(ARMCC::NE).addReg(ARM::CPSR);<br>> >>>>>>> + ByvalEmitter.emitSubImm(BB, BB->end(), dl, varPhi, varLoop);<br>> >>>>>>> + ByvalEmitter.emitBranchNE(BB, BB->end(), dl, loopMBB);<br>> >>>>>>><br>> >>>>>>> // loopMBB can loop back to loopMBB or fall through to exitMBB.<br>> >>>>>>> BB->addSuccessor(loopMBB);<br>> >>>>>>> @@ -7510,36 +7819,17 @@ EmitStructByval(MachineInstr *MI,<br>> >> Machin<br>> >>>> //<br>> >>>>>>> Add epilogue to handle BytesLeft.<br>> >>>>>>> BB = exitMBB;<br>> >>>>>>> MachineInstr *StartOfExit = exitMBB->begin();<br>> >>>>>>> - ldrOpc = isThumb2 ? ARM::t2LDRB_POST :<br>> ARM::LDRB_POST_IMM;<br>> >>>>>>> - strOpc = isThumb2 ? ARM::t2STRB_POST :<br>> ARM::STRB_POST_IMM;<br>> >>>>>>><br>> >>>>>>> // [scratch, srcOut] = LDRB_POST(srcLoop, 1)<br>> >>>>>>> // [destOut] = STRB_POST(scratch, destLoop, 1)<br>> >>>>>>> unsigned srcIn = srcLoop;<br>> >>>>>>> unsigned destIn = destLoop;<br>> >>>>>>> for (unsigned i = 0; i < BytesLeft; i++) {<br>> >>>>>>> - unsigned scratch = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned srcOut = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - unsigned destOut = MRI.createVirtualRegister(TRC);<br>> >>>>>>> - if (isThumb2) {<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, StartOfExit, dl,<br>> >>>>>>> - TII->get(ldrOpc),scratch)<br>> >>>>>>> - .addReg(srcOut,<br>RegState::Define).addReg(srcIn).addImm(1));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, StartOfExit, dl,<br>> > TII->get(strOpc),<br>> >>>>>> destOut)<br>> >>>>>>> - .addReg(scratch).addReg(destIn)<br>> >>>>>>> - .addImm(1));<br>> >>>>>>> - } else {<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, StartOfExit, dl,<br>> >>>>>>> - TII->get(ldrOpc),scratch)<br>> >>>>>>> - .addReg(srcOut,<br>> >>>>>> RegState::Define).addReg(srcIn).addReg(0).addImm(1));<br>> >>>>>>> -<br>> >>>>>>> - AddDefaultPred(BuildMI(*BB, StartOfExit, dl,<br>> > TII->get(strOpc),<br>> >>>>>> destOut)<br>> >>>>>>> - 
.addReg(scratch).addReg(destIn)<br>> >>>>>>> - .addReg(0).addImm(1));<br>> >>>>>>> - }<br>> >>>>>>> - srcIn = srcOut;<br>> >>>>>>> - destIn = destOut;<br>> >>>>>>> + std::pair<unsigned, unsigned> res =<br>> >>>>>>> + ByvalEmitter.emitByteLoad(BB, StartOfExit, dl, srcIn);<br>> >>>>>>> + unsigned scratch = res.first;<br>> >>>>>>> + srcIn = res.second;<br>> >>>>>>> + destIn = ByvalEmitter.emitByteStore(BB, StartOfExit, dl,<br>> >>>>>>> + destIn, scratch);<br>> >>>>>>> }<br>> >>>>>>><br>> >>>>>>> MI->eraseFromParent(); // The instruction is gone now.<br>> >>>>>>><br>> >>>>>>><br>> >>>>>>> _______________________________________________<br>> >>>>>>> llvm-commits mailing list<br>> >>>>>>><span class="Apple-converted-space"> </span><a href="mailto:llvm-commits@cs.uiuc.edu" style="color: purple; text-decoration: underline;">llvm-commits@cs.uiuc.edu</a><br>> >>>>>>><span class="Apple-converted-space"> </span><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank" style="color: purple; text-decoration: underline;">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>> >>>>><br>> >>>>><br>> >>><br>> >>><br>> ><br>> ><o:p></o:p></p></div></div></div></div></div></div></div></blockquote></div><br></div></body></html>