<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Turned that into an assert: <span style="font-family: Menlo; font-size: 11px;" class="">Committed revision 244366.</span><div><br class=""></div><div>Cheers,</div><div>-Q.<br class=""><blockquote type="cite" class=""><div class="">On Aug 7, 2015, at 3:34 PM, Quentin Colombet via llvm-commits &lt;<a href="mailto:llvm-commits@lists.llvm.org" class="">llvm-commits@lists.llvm.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><meta http-equiv="Content-Type" content="text/html; charset=utf-8" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Haha, actually I cannot produce a test case!<div class=""><br class=""></div><div class="">Looking closer at the code, the <span style="font-family: Helvetica, sans-serif; font-size: 9pt;" class="">IsValidLdStrOpc check should be an assert. 
We cannot form a pair with the R + R variant, and the Opc fed to this function is guarded to be a load or a store.</span></div><div class=""><span style="font-family: Helvetica, sans-serif; font-size: 9pt;" class=""><br class=""></span></div><div class=""><font face="Helvetica, sans-serif" class=""><span style="font-size: 9pt;" class="">I</span>’<span style="font-size: 9pt;" class="">ll update the code to reflect that, but the good news is that there is no bug :).</span></font></div><div class=""><font face="Helvetica, sans-serif" class=""><br class=""></font></div><div class=""><font face="Helvetica, sans-serif" class="">Thanks again for checking.</font></div><div class=""><font face="Helvetica, sans-serif" class="">-Quentin</font></div><div class=""><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Aug 7, 2015, at 1:32 PM, Chad Rosier &lt;<a href="mailto:mcrosier@codeaurora.org" class="">mcrosier@codeaurora.org</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="WordSection1" style="page: WordSection1; font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class=""><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);" class="">Cool! Thanks for the quick response! 
Let me know if I can be of assistance..<o:p class=""></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class=""><span style="font-size: 11pt; font-family: Calibri, sans-serif; color: rgb(31, 73, 125);" class=""> </span></div><div class=""><div style="border-style: solid none none; border-top-color: rgb(225, 225, 225); border-top-width: 1pt; padding: 3pt 0in 0in;" class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class=""><b class=""><span style="font-size: 11pt; font-family: Calibri, sans-serif;" class="">From:</span></b><span style="font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span class="Apple-converted-space"> </span>Quentin Colombet [<a href="mailto:qcolombet@apple.com" class="">mailto:qcolombet@apple.com</a>]<span class="Apple-converted-space"> </span><br class=""><b class="">Sent:</b><span class="Apple-converted-space"> </span>Friday, August 07, 2015 4:30 PM<br class=""><b class="">To:</b><span class="Apple-converted-space"> </span><a href="mailto:mcrosier@codeaurora.org" class="">mcrosier@codeaurora.org</a><br class=""><b class="">Cc:</b><span class="Apple-converted-space"> </span><a href="mailto:llvm-commits@lists.llvm.org" class="">llvm-commits@lists.llvm.org</a><br class=""><b class="">Subject:</b><span class="Apple-converted-space"> </span>Re: [llvm] r231527 - [AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R + LD[U]RSW.<o:p class=""></o:p></span></div></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class=""><o:p class=""> </o:p></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class="">Hi Chad,<o:p class=""></o:p></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div 
style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class="">Nice catch.<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class="">There is indeed a potential bug here.<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class="">Looking if I can produce a test case to expose the problem.<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class=""><o:p class=""> </o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class="">Thanks,<o:p class=""></o:p></div></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class="">-Quentin<o:p class=""></o:p></div></div><div class=""><div class=""><blockquote style="margin-top: 5pt; margin-bottom: 5pt;" class="" type="cite"><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class="">On Aug 7, 2015, at 12:12 PM, Chad Rosier <<a href="mailto:mcrosier@codeaurora.org" style="color: purple; text-decoration: underline;" class="">mcrosier@codeaurora.org</a>> wrote:<o:p class=""></o:p></div></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class=""><o:p class=""> </o:p></div><div class=""><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif;" class=""><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class="">+</span><a href="mailto:llvm-commits@lists.llvm.org" style="color: purple; 
text-decoration: underline;" class=""><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class="">llvm-commits@lists.llvm.org</span></a><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class=""><br class=""><br class="">Hi Quentin,<br class="">I was eyeballing this change and I'm concerned we're not properly tracking<br class="">what registers are being clobbered and used.<br class=""><br class="">Inline comments below. Please let me know if I've missed something..<br class=""><br class=""> Chad<br class=""><br class="">-----Original Message-----<br class="">From:<span class="apple-converted-space"> </span></span><a href="mailto:llvm-commits-bounces@cs.uiuc.edu" style="color: purple; text-decoration: underline;" class=""><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class="">llvm-commits-bounces@cs.uiuc.edu</span></a><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class=""><br class="">[</span><a href="mailto:llvm-commits-bounces@cs.uiuc.edu" style="color: purple; text-decoration: underline;" class=""><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class="">mailto:llvm-commits-bounces@cs.uiuc.edu</span></a><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class="">] On Behalf Of Quentin Colombet<br class="">Sent: Friday, March 06, 2015 5:42 PM<br class="">To:<span class="apple-converted-space"> </span></span><a href="mailto:llvm-commits@cs.uiuc.edu" style="color: purple; text-decoration: underline;" class=""><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class="">llvm-commits@cs.uiuc.edu</span></a><span style="font-size: 9pt; font-family: Helvetica, sans-serif;" class=""><br class="">Subject: [llvm] r231527 - [AArch64][LoadStoreOptimizer] Generate LDP + SXTW<br class="">instead of LD[U]R + LD[U]RSW.<br class=""><br class="">Author: qcolombet<br class="">Date: Fri Mar 6 16:42:10 2015<br class="">New Revision: 231527<br class=""><br 
class="">URL:<span class="Apple-converted-space"> </span><a href="http://llvm.org/viewvc/llvm-project?rev=231527&view=rev" style="color: purple; text-decoration: underline;" class="">http://llvm.org/viewvc/llvm-project?rev=231527&view=rev</a><br class="">Log:<br class="">[AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R +<br class="">LD[U]RSW.<br class="">Teach the load store optimizer how to sign extend a result of a load pair<br class="">when it helps creating more pairs.<br class="">The rational is that loads are more expensive than sign extensions, so if we<br class="">gather some in one instruction this is better!<br class=""><br class=""><<a href="rdar://problem/20072968" style="color: purple; text-decoration: underline;" class="">rdar://problem/20072968</a>><br class=""><br class="">Modified:<br class=""> llvm/trunk/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp<br class=""> llvm/trunk/test/CodeGen/AArch64/arm64-ldp.ll<br class=""><br class="">Modified: llvm/trunk/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp<br class="">URL:<br class=""><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64Loa" style="color: purple; text-decoration: underline;" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64Loa</a><br class="">dStoreOptimizer.cpp?rev=231527&r1=231526&r2=231527&view=diff<br class="">============================================================================<br class="">==<br class="">--- llvm/trunk/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp (original)<br class="">+++ llvm/trunk/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp Fri Mar<br class="">+++ 6 16:42:10 2015<br class="">@@ -63,16 +63,24 @@ struct AArch64LoadStoreOpt : public Mach<br class=""> // If a matching instruction is found, MergeForward is set to true if the<br class=""> // merge is to remove the first instruction and replace the second with<br class=""> // a pair-wise insn, and false if the reverse is 
true.<br class="">+ // \p SExtIdx[out] gives the index of the result of the load pair<span class="apple-converted-space"> </span><br class="">+ that // must be extended. The value of SExtIdx assumes that the<span class="apple-converted-space"> </span><br class="">+ paired load // produces the value in this order: (I, returned<span class="apple-converted-space"> </span><br class="">+ iterator), i.e., // -1 means no value has to be extended, 0 means I,<span class="apple-converted-space"> </span><br class="">+ and 1 means the // returned iterator.<br class=""> MachineBasicBlock::iterator findMatchingInsn(MachineBasicBlock::iterator<br class="">I,<br class="">- bool &MergeForward,<br class="">+ bool &MergeForward, int<span class="apple-converted-space"> </span><br class="">+ &SExtIdx,<br class=""> unsigned Limit);<br class=""> // Merge the two instructions indicated into a single pair-wise<br class="">instruction.<br class=""> // If MergeForward is true, erase the first instruction and fold its<br class=""> // operation into the second. If false, the reverse. 
Return the<br class="">instruction<br class=""> // following the first instruction (which may change during processing).<br class="">+ // \p SExtIdx index of the result that must be extended for a paired<br class="">load.<br class="">+ // -1 means none, 0 means I, and 1 means Paired.<br class=""> MachineBasicBlock::iterator<br class=""> mergePairedInsns(MachineBasicBlock::iterator I,<br class="">- MachineBasicBlock::iterator Paired, bool MergeForward);<br class="">+ MachineBasicBlock::iterator Paired, bool MergeForward,<br class="">+ int SExtIdx);<br class=""><br class=""> // Scan the instruction list to find a base register update that can<br class=""> // be combined with the current instruction (a load or store) using @@<br class="">-181,6 +189,43 @@ int AArch64LoadStoreOpt::getMemSize(Mach<br class=""> }<br class="">}<br class=""><br class="">+static unsigned getMatchingNonSExtOpcode(unsigned Opc,<br class="">+ bool *IsValidLdStrOpc =<br class="">+nullptr) {<br class="">+ if (IsValidLdStrOpc)<br class="">+ *IsValidLdStrOpc = true;<br class="">+ switch (Opc) {<br class="">+ default:<br class="">+ if (IsValidLdStrOpc)<br class="">+ *IsValidLdStrOpc = false;<br class="">+ return UINT_MAX;<br class="">+ case AArch64::STRDui:<br class="">+ case AArch64::STURDi:<br class="">+ case AArch64::STRQui:<br class="">+ case AArch64::STURQi:<br class="">+ case AArch64::STRWui:<br class="">+ case AArch64::STURWi:<br class="">+ case AArch64::STRXui:<br class="">+ case AArch64::STURXi:<br class="">+ case AArch64::LDRDui:<br class="">+ case AArch64::LDURDi:<br class="">+ case AArch64::LDRQui:<br class="">+ case AArch64::LDURQi:<br class="">+ case AArch64::LDRWui:<br class="">+ case AArch64::LDURWi:<br class="">+ case AArch64::LDRXui:<br class="">+ case AArch64::LDURXi:<br class="">+ case AArch64::STRSui:<br class="">+ case AArch64::STURSi:<br class="">+ case AArch64::LDRSui:<br class="">+ case AArch64::LDURSi:<br class="">+ return Opc;<br class="">+ case AArch64::LDRSWui:<br 
class="">+ return AArch64::LDRWui;<br class="">+ case AArch64::LDURSWi:<br class="">+ return AArch64::LDURWi;<br class="">+ }<br class="">+}<br class="">+<br class="">static unsigned getMatchingPairOpcode(unsigned Opc) {<br class=""> switch (Opc) {<br class=""> default:<br class="">@@ -282,7 +327,7 @@ static unsigned getPostIndexedOpcode(uns<br class="">MachineBasicBlock::iterator<br class="">AArch64LoadStoreOpt::mergePairedInsns(MachineBasicBlock::iterator I,<br class=""> MachineBasicBlock::iterator Paired,<br class="">- bool MergeForward) {<br class="">+ bool MergeForward, int SExtIdx) {<br class=""> MachineBasicBlock::iterator NextI = I;<br class=""> ++NextI;<br class=""> // If NextI is the second of the two instructions to be merged, we need<br class="">@@ -292,11 +337,13 @@ AArch64LoadStoreOpt::mergePairedInsns(Ma<br class=""> if (NextI == Paired)<br class=""> ++NextI;<br class=""><br class="">- bool IsUnscaled = isUnscaledLdst(I->getOpcode());<br class="">+ unsigned Opc =<br class="">+ SExtIdx == -1 ? I->getOpcode() :<span class="apple-converted-space"> </span><br class="">+ getMatchingNonSExtOpcode(I->getOpcode());<br class="">+ bool IsUnscaled = isUnscaledLdst(Opc);<br class=""> int OffsetStride =<br class=""> IsUnscaled && EnableAArch64UnscaledMemOp ? getMemSize(I) : 1;<br class=""><br class="">- unsigned NewOpc = getMatchingPairOpcode(I->getOpcode());<br class="">+ unsigned NewOpc = getMatchingPairOpcode(Opc);<br class=""> // Insert our new paired instruction after whichever of the paired<br class=""> // instructions MergeForward indicates.<br class=""> MachineBasicBlock::iterator InsertionPoint = MergeForward ? 
Paired : I;<br class="">@@ -311,6 +358,11 @@ AArch64LoadStoreOpt::mergePairedInsns(Ma<br class=""> Paired->getOperand(2).getImm() + OffsetStride) {<br class=""> RtMI = Paired;<br class=""> Rt2MI = I;<br class="">+ // Here we swapped the assumption made for SExtIdx.<br class="">+ // I.e., we turn ldp I, Paired into ldp Paired, I.<br class="">+ // Update the index accordingly.<br class="">+ if (SExtIdx != -1)<br class="">+ SExtIdx = (SExtIdx + 1) % 2;<br class=""> } else {<br class=""> RtMI = I;<br class=""> Rt2MI = Paired;<br class="">@@ -337,8 +389,47 @@ AArch64LoadStoreOpt::mergePairedInsns(Ma<br class=""> DEBUG(dbgs() << " ");<br class=""> DEBUG(Paired->print(dbgs()));<br class=""> DEBUG(dbgs() << " with instruction:\n ");<br class="">- DEBUG(((MachineInstr *)MIB)->print(dbgs()));<br class="">- DEBUG(dbgs() << "\n");<br class="">+<br class="">+ if (SExtIdx != -1) {<br class="">+ // Generate the sign extension for the proper result of the ldp.<br class="">+ // I.e., with X1, that would be:<br class="">+ // %W1<def> = KILL %W1, %X1<imp-def><br class="">+ // %X1<def> = SBFMXri %X1<kill>, 0, 31<br class="">+ MachineOperand &DstMO = MIB->getOperand(SExtIdx);<br class="">+ // Right now, DstMO has the extended register, since it comes from an<br class="">+ // extended opcode.<br class="">+ unsigned DstRegX = DstMO.getReg();<br class="">+ // Get the W variant of that register.<br class="">+ unsigned DstRegW = TRI->getSubReg(DstRegX, AArch64::sub_32);<br class="">+ // Update the result of LDP to use the W instead of the X variant.<br class="">+ DstMO.setReg(DstRegW);<br class="">+ DEBUG(((MachineInstr *)MIB)->print(dbgs()));<br class="">+ DEBUG(dbgs() << "\n");<br class="">+ // Make the machine verifier happy by providing a definition for<br class="">+ // the X register.<br class="">+ // Insert this definition right after the generated LDP, i.e., before<br class="">+ // InsertionPoint.<br class="">+ MachineInstrBuilder MIBKill =<br class="">+ BuildMI(*I->getParent(), 
InsertionPoint, I->getDebugLoc(),<br class="">+ TII->get(TargetOpcode::KILL), DstRegW)<br class="">+ .addReg(DstRegW)<br class="">+ .addReg(DstRegX, RegState::Define);<br class="">+ MIBKill->getOperand(2).setImplicit();<br class="">+ // Create the sign extension.<br class="">+ MachineInstrBuilder MIBSXTW =<br class="">+ BuildMI(*I->getParent(), InsertionPoint, I->getDebugLoc(),<br class="">+ TII->get(AArch64::SBFMXri), DstRegX)<br class="">+ .addReg(DstRegX)<br class="">+ .addImm(0)<br class="">+ .addImm(31);<br class="">+ (void)MIBSXTW;<br class="">+ DEBUG(dbgs() << " Extend operand:\n ");<br class="">+ DEBUG(((MachineInstr *)MIBSXTW)->print(dbgs()));<br class="">+ DEBUG(dbgs() << "\n");<br class="">+ } else {<br class="">+ DEBUG(((MachineInstr *)MIB)->print(dbgs()));<br class="">+ DEBUG(dbgs() << "\n");<br class="">+ }<br class=""><br class=""> // Erase the old instructions.<br class=""> I->eraseFromParent();<br class="">@@ -396,7 +487,8 @@ static int alignTo(int Num, int PowOf2) /// be combined<br class="">with the current instruction into a load/store pair.<br class="">MachineBasicBlock::iterator<br class="">AArch64LoadStoreOpt::findMatchingInsn(MachineBasicBlock::iterator I,<br class="">- bool &MergeForward, unsigned Limit) {<br class="">+ bool &MergeForward, int &SExtIdx,<br class="">+ unsigned Limit) {<br class=""> MachineBasicBlock::iterator E = I->getParent()->end();<br class=""> MachineBasicBlock::iterator MBBI = I;<br class=""> MachineInstr *FirstMI = I;<br class="">@@ -436,7 +528,19 @@ AArch64LoadStoreOpt::findMatchingInsn(Ma<br class=""> // Now that we know this is a real instruction, count it.<br class=""> ++Count;<br class=""><br class="">- if (Opc == MI->getOpcode() && MI->getOperand(2).isImm()) {<br class="">+ bool CanMergeOpc = Opc == MI->getOpcode();<br class="">+ SExtIdx = -1;<br class="">+ if (!CanMergeOpc) {<br class="">+ bool IsValidLdStrOpc;<br class="">+ unsigned NonSExtOpc = getMatchingNonSExtOpcode(Opc,<br class="">&IsValidLdStrOpc);<br 
class="">+ if (!IsValidLdStrOpc)<br class=""><br class="">I believe we need to add the code below on the continue path to ensure the<br class="">register defs/uses and memory operations are tracked correctly.<br class=""><br class=""> trackRegDefsUses(MI, ModifiedRegs, UsedRegs, TRI);<br class=""> if (MI->mayLoadOrStore())<br class=""> MemInsns.push_back(MI);<br class=""><br class="">Alternatively, we could let the control flow fall through to the logic at<br class="">the end of the for loop, which already handles this tracking, by negating the<br class="">condition. Something like:<br class=""><br class=""> if (IsValidLdStrOpc) {<br class=""> // Opc will be the first instruction in the pair.<br class=""> SExtIdx = NonSExtOpc == (unsigned)Opc ? 1 : 0;<br class=""> CanMergeOpc = NonSExtOpc ==<br class="">getMatchingNonSExtOpcode(MI->getOpcode());<br class=""> }<br class=""><br class="">Please let me know your thoughts, Quentin. Again, this is just something I<br class="">saw in passing, and I haven't proven it's actually a problem.<br class=""><br class="">Chad<br class=""><br class="">+ continue;<br class="">+ // Opc will be the first instruction in the pair.<br class="">+ SExtIdx = NonSExtOpc == (unsigned)Opc ? 
1 : 0;<br class="">+ CanMergeOpc = NonSExtOpc ==<br class="">getMatchingNonSExtOpcode(MI->getOpcode());<br class="">+ }<br class="">+<br class="">+ if (CanMergeOpc && MI->getOperand(2).isImm()) {<br class=""> // If we've found another instruction with the same opcode, check to<br class="">see<br class=""> // if the base and offset are compatible with our starting<br class="">instruction.<br class=""> // These instructions all have scaled immediate operands, so we just<br class="">@@ -823,13 +927,14 @@ bool AArch64LoadStoreOpt::optimizeBlock(<br class=""> }<br class=""> // Look ahead up to ScanLimit instructions for a pairable<br class="">instruction.<br class=""> bool MergeForward = false;<br class="">+ int SExtIdx = -1;<br class=""> MachineBasicBlock::iterator Paired =<br class="">- findMatchingInsn(MBBI, MergeForward, ScanLimit);<br class="">+ findMatchingInsn(MBBI, MergeForward, SExtIdx, ScanLimit);<br class=""> if (Paired != E) {<br class=""> // Merge the loads into a pair. Keeping the iterator straight is a<br class=""> // pain, so we let the merge routine tell us what the next<br class="">instruction<br class=""> // is after it's done mucking about.<br class="">- MBBI = mergePairedInsns(MBBI, Paired, MergeForward);<br class="">+ MBBI = mergePairedInsns(MBBI, Paired, MergeForward, SExtIdx);<br class=""><br class=""> Modified = true;<br class=""> ++NumPairCreated;<br class=""><br class="">Modified: llvm/trunk/test/CodeGen/AArch64/arm64-ldp.ll<br class="">URL:<br class=""><a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-ld" style="color: purple; text-decoration: underline;" class="">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/arm64-ld</a><br class="">p.ll?rev=231527&r1=231526&r2=231527&view=diff<br class="">============================================================================<br class="">==<br class="">--- llvm/trunk/test/CodeGen/AArch64/arm64-ldp.ll (original)<br class="">+++ 
llvm/trunk/test/CodeGen/AArch64/arm64-ldp.ll Fri Mar 6 16:42:10<br class="">+++ 2015<br class="">@@ -24,6 +24,33 @@ define i64 @ldp_sext_int(i32* %p) nounwi<br class=""> ret i64 %add<br class="">}<br class=""><br class="">+; CHECK-LABEL: ldp_half_sext_res0_int:<br class="">+; CHECK: ldp w[[DST1:[0-9]+]], w[[DST2:[0-9]+]], [x0]<br class="">+; CHECK: sxtw x[[DST1]], w[[DST1]]<br class="">+define i64 @ldp_half_sext_res0_int(i32* %p) nounwind {<br class="">+ %tmp = load i32, i32* %p, align 4<br class="">+ %add.ptr = getelementptr inbounds i32, i32* %p, i64 1<br class="">+ %tmp1 = load i32, i32* %add.ptr, align 4<br class="">+ %sexttmp = sext i32 %tmp to i64<br class="">+ %sexttmp1 = zext i32 %tmp1 to i64<br class="">+ %add = add nsw i64 %sexttmp1, %sexttmp<br class="">+ ret i64 %add<br class="">+}<br class="">+<br class="">+; CHECK-LABEL: ldp_half_sext_res1_int:<br class="">+; CHECK: ldp w[[DST1:[0-9]+]], w[[DST2:[0-9]+]], [x0]<br class="">+; CHECK: sxtw x[[DST2]], w[[DST2]]<br class="">+define i64 @ldp_half_sext_res1_int(i32* %p) nounwind {<br class="">+ %tmp = load i32, i32* %p, align 4<br class="">+ %add.ptr = getelementptr inbounds i32, i32* %p, i64 1<br class="">+ %tmp1 = load i32, i32* %add.ptr, align 4<br class="">+ %sexttmp = zext i32 %tmp to i64<br class="">+ %sexttmp1 = sext i32 %tmp1 to i64<br class="">+ %add = add nsw i64 %sexttmp1, %sexttmp<br class="">+ ret i64 %add<br class="">+}<br class="">+<br class="">+<br class="">; CHECK: ldp_long<br class="">; CHECK: ldp<br class="">define i64 @ldp_long(i64* %p) nounwind { @@ -83,6 +110,39 @@ define i64<br class="">@ldur_sext_int(i32* %a) nounw<br class=""> ret i64 %tmp3<br class="">}<br class=""><br class="">+define i64 @ldur_half_sext_int_res0(i32* %a) nounwind { ; LDUR_CHK:<span class="apple-converted-space"> </span><br class="">+ldur_half_sext_int_res0<br class="">+; LDUR_CHK: ldp w[[DST1:[0-9]+]], w[[DST2:[0-9]+]], [x0, #-8]<br class="">+; LDUR_CHK: sxtw x[[DST1]], w[[DST1]]<br class="">+; LDUR_CHK-NEXT: add 
x{{[0-9]+}}, x[[DST2]], x[[DST1]]<br class="">+; LDUR_CHK-NEXT: ret<br class="">+ %p1 = getelementptr inbounds i32, i32* %a, i32 -1<br class="">+ %tmp1 = load i32, i32* %p1, align 2<br class="">+ %p2 = getelementptr inbounds i32, i32* %a, i32 -2<br class="">+ %tmp2 = load i32, i32* %p2, align 2<br class="">+ %sexttmp1 = zext i32 %tmp1 to i64<br class="">+ %sexttmp2 = sext i32 %tmp2 to i64<br class="">+ %tmp3 = add i64 %sexttmp1, %sexttmp2<br class="">+ ret i64 %tmp3<br class="">+}<br class="">+<br class="">+define i64 @ldur_half_sext_int_res1(i32* %a) nounwind { ; LDUR_CHK:<span class="apple-converted-space"> </span><br class="">+ldur_half_sext_int_res1<br class="">+; LDUR_CHK: ldp w[[DST1:[0-9]+]], w[[DST2:[0-9]+]], [x0, #-8]<br class="">+; LDUR_CHK: sxtw x[[DST2]], w[[DST2]]<br class="">+; LDUR_CHK-NEXT: add x{{[0-9]+}}, x[[DST2]], x[[DST1]]<br class="">+; LDUR_CHK-NEXT: ret<br class="">+ %p1 = getelementptr inbounds i32, i32* %a, i32 -1<br class="">+ %tmp1 = load i32, i32* %p1, align 2<br class="">+ %p2 = getelementptr inbounds i32, i32* %a, i32 -2<br class="">+ %tmp2 = load i32, i32* %p2, align 2<br class="">+ %sexttmp1 = sext i32 %tmp1 to i64<br class="">+ %sexttmp2 = zext i32 %tmp2 to i64<br class="">+ %tmp3 = add i64 %sexttmp1, %sexttmp2<br class="">+ ret i64 %tmp3<br class="">+}<br class="">+<br class="">+<br class="">define i64 @ldur_long(i64* %a) nounwind ssp { ; LDUR_CHK: ldur_long<br class="">; LDUR_CHK: ldp [[DST1:x[0-9]+]], [[DST2:x[0-9]+]], [x0, #-16]<br class="">@@ -152,6 +212,40 @@ define i64 @pairUpBarelyInSext(i32* %a)<br class=""> %tmp3 = add i64 %sexttmp1, %sexttmp2<br class=""> ret i64 %tmp3<br class="">}<br class="">+<br class="">+define i64 @pairUpBarelyInHalfSextRes0(i32* %a) nounwind ssp { ;<br class="">+LDUR_CHK: pairUpBarelyInHalfSextRes0 ; LDUR_CHK-NOT: ldur<br class="">+; LDUR_CHK: ldp w[[DST1:[0-9]+]], w[[DST2:[0-9]+]], [x0, #-256]<br class="">+; LDUR_CHK: sxtw x[[DST1]], w[[DST1]]<br class="">+; LDUR_CHK-NEXT: add x{{[0-9]+}}, 
x[[DST2]], x[[DST1]]<br class="">+; LDUR_CHK-NEXT: ret<br class="">+ %p1 = getelementptr inbounds i32, i32* %a, i64 -63<br class="">+ %tmp1 = load i32, i32* %p1, align 2<br class="">+ %p2 = getelementptr inbounds i32, i32* %a, i64 -64<br class="">+ %tmp2 = load i32, i32* %p2, align 2<br class="">+ %sexttmp1 = zext i32 %tmp1 to i64<br class="">+ %sexttmp2 = sext i32 %tmp2 to i64<br class="">+ %tmp3 = add i64 %sexttmp1, %sexttmp2<br class="">+ ret i64 %tmp3<br class="">+}<br class="">+<br class="">+define i64 @pairUpBarelyInHalfSextRes1(i32* %a) nounwind ssp { ;<br class="">+LDUR_CHK: pairUpBarelyInHalfSextRes1 ; LDUR_CHK-NOT: ldur<br class="">+; LDUR_CHK: ldp w[[DST1:[0-9]+]], w[[DST2:[0-9]+]], [x0, #-256]<br class="">+; LDUR_CHK: sxtw x[[DST2]], w[[DST2]]<br class="">+; LDUR_CHK-NEXT: add x{{[0-9]+}}, x[[DST2]], x[[DST1]]<br class="">+; LDUR_CHK-NEXT: ret<br class="">+ %p1 = getelementptr inbounds i32, i32* %a, i64 -63<br class="">+ %tmp1 = load i32, i32* %p1, align 2<br class="">+ %p2 = getelementptr inbounds i32, i32* %a, i64 -64<br class="">+ %tmp2 = load i32, i32* %p2, align 2<br class="">+ %sexttmp1 = sext i32 %tmp1 to i64<br class="">+ %sexttmp2 = zext i32 %tmp2 to i64<br class="">+ %tmp3 = add i64 %sexttmp1, %sexttmp2<br class="">+ ret i64 %tmp3<br class="">+}<br class=""><br class="">define i64 @pairUpBarelyOut(i64* %a) nounwind ssp { ; LDUR_CHK:<br class="">pairUpBarelyOut<br class=""><br class=""><br class="">_______________________________________________<br class="">llvm-commits mailing list<br class=""><a href="mailto:llvm-commits@cs.uiuc.edu" style="color: purple; text-decoration: underline;" class="">llvm-commits@cs.uiuc.edu</a><br class=""><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" style="color: purple; text-decoration: underline;" class="">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a></span></div></div></blockquote></div></div></div></div></blockquote></div><br 
class=""></div></div></div></blockquote></div><br class=""></body></html>