<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><br><div><div>On Jun 27, 2012, at 4:30 PM, Manman Ren wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><br></div>I updated the patch, please review and provide feedback.<div><div><br></div><div>Update peephole optimization for X86 and ARM:</div><div>1> Interface: added SrcReg2 to analyzeCompare and optimizeCompare.</div><div> AnalyzeCompare to analyzeCompare ...</div><div><br></div><div>2> ARM: Clean up, no functional change.</div><div>Replaced a loop with existing interfaces: modifiesRegister and readsRegister.</div><div>Factored out code into inline functions and simplified the code.</div><div><br></div><div>3> X86: added peephole optimization to remove cmp instruction</div><div>It will optimize the following:</div><div> sub r1, r3</div><div> cmp r3, r1 or cmp r1, r3</div><div> bge L1</div><div>TO</div><div> sub r1, r3</div><div> bge L1 or ble L1</div><div>If the branch instruction can use flag from "sub", then we can eliminate</div><div>the "cmp" instruction.</div><div><br></div><div>For this optimization, we need to first check the condition code of</div><div>SET|CMOV|Jcc and then update the condition code.</div><div><br></div><div><div>This patch implemented 4 helper functions:</div><div>getSwappedConditionForCMov</div><div>getSwappedConditionForBranch</div><div>getSwappedConditionForSET</div><div>isRedundantFlagInstr</div></div><div><br></div><div>I can't think of a way to easily update the condition code of an instruction</div><div>by adding flags in td file or data in MCInstrDesc, since the condition code</div><div>is hard-coded in the opcode.</div><div><br></div><div>From what I know, there are 3 pairs of equivalent falg instructions for ARM:</div><div>CMP vs SUB, CMN vs ADD, TST vs AND</div><div>2 pairs for X86:</div><div>CMP vs SUB, TST vs AND</div><div>If there are more pairs, or there is a better way to implement isRedundantFlagInstr, please let me know.</div><div><br></div><div>Comments are welcome.</div></div><div><br></div><div>Thanks,</div><div>Manman</div><div><br></div><div></div></div><span><peephole.patch></span><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><br><div><div>On Jun 2, 2012, at 3:55 PM, Chandler Carruth wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><p><br>
On Fri, Jun 1, 2012 at 12:49 PM, Manman Ren <<a href="mailto:mren@apple.com" target="_blank">mren@apple.com</a>> wrote:<br>
><br>
> Author: mren<br>
> Date: Fri Jun 1 14:49:33 2012<br>
> New Revision: 157831<br>
><br>
> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=157831&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=157831&view=rev</a><br>
> Log:<br>
> X86: peephole optimization to remove cmp instruction<br></p><p>I'm frustrated that you waited a total of 3 hours for review on the final version of this patch before committing it, w/o getting an LGTM. Eric even made comments on that thread, which he has had to repeat here. Please try to give the code review process a chance.<br>
</p><p>This patch doesn't feel ready for commit:</p><p>1) From my original comments: I still think the properties of instructions you're currently hard coding in C++ need to be modeled in a more principled manner. Maybe this isn't possible, I'm not a backend expert, but in code review all you said was that it didn't exist today. The point of my comment was that if these properties are not currently available as a fundamental part of the instruction, you should add them rather than writing one-off functions like this.</p><p>2) You haven't followed the coding conventions or addressed any style issues with the patch. Nor have you added doxygen comments for the APIs you've adding. I'll try to mention a few of the style issues below, but I'm just pointing out a few. It would be good to go over this and try to be consistent.</p><p>3) I still think that this is a bit of a hack to handle a single case of a more general problem. There are numerous instructions in x86 that set some subset of EFLAGS. We should have a general pass that can both re-use existing instructions' flags, and even switch instructions which have equivalent instructions (functionally) that additionally set flags. Certainly, 'cmp x, y' sets flags *differently* from binary operations other than 'sub x, y', but you're already handling cases like x and y being reversed, so it doesn't seem like a huge leap here.</p><p>4) Also, what about removing 'test' as well as 'cmp'?</p><p>><br>
> This patch will optimize the following:<br>
> sub r1, r3<br>
> cmp r3, r1 or cmp r1, r3<br>
> bge L1<br>
> TO<br>
> sub r1, r3<br>
> bge L1 or ble L1<br>
><br>
> If the branch instruction can use flag from "sub", then we can eliminate<br>
> the "cmp" instruction.<br>
><br>
> Modified:<br>
> llvm/trunk/lib/Target/X86/X86InstrArithmetic.td<br>
> llvm/trunk/lib/Target/X86/X86InstrInfo.cpp<br>
> llvm/trunk/lib/Target/X86/X86InstrInfo.h<br>
> llvm/trunk/test/CodeGen/X86/jump_sign.ll<br>
><br>
> Modified: llvm/trunk/lib/Target/X86/X86InstrArithmetic.td<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrArithmetic.td?rev=157831&r1=157830&r2=157831&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrArithmetic.td?rev=157831&r1=157830&r2=157831&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/lib/Target/X86/X86InstrArithmetic.td (original)<br>
> +++ llvm/trunk/lib/Target/X86/X86InstrArithmetic.td Fri Jun 1 14:49:33 2012<br>
> @@ -1143,7 +1143,9 @@<br>
> 0, 0>;<br>
> }<br>
><br>
> +let isCompare = 1 in {<br>
> defm CMP : ArithBinOp_F<0x38, 0x3A, 0x3C, "cmp", MRM7r, MRM7m, X86cmp, 0, 0>;<br>
> +}<br>
><br>
><br>
> //===----------------------------------------------------------------------===//<br>
><br>
> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=157831&r1=157830&r2=157831&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=157831&r1=157830&r2=157831&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)<br>
> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Fri Jun 1 14:49:33 2012<br>
> @@ -33,6 +33,7 @@<br>
> #include "llvm/Support/ErrorHandling.h"<br>
> #include "llvm/Support/raw_ostream.h"<br>
> #include "llvm/Target/TargetOptions.h"<br>
> +#include "llvm/ADT/Statistic.h"<br>
> #include <limits><br>
><br>
> #define GET_INSTRINFO_CTOR<br>
> @@ -40,6 +41,8 @@<br>
><br>
> using namespace llvm;<br>
><br>
> +STATISTIC(NumCmpsRemoved, "Number of Cmps removed due to an earlier Sub");<br>
> +<br>
> static cl::opt<bool><br>
> NoFusing("disable-spill-fusing",<br>
> cl::desc("Disable fusing of spill code into instructions"));<br>
> @@ -2728,6 +2731,326 @@<br>
> }<br>
><br>
> bool X86InstrInfo::<br>
> +AnalyzeCompare(const MachineInstr *MI, unsigned &SrcReg, int &CmpMask,<br>
> + int &CmpValue) const {<br></p><p>'analyzeCompare'<br>
<br>
><br>
> + switch (MI->getOpcode()) {<br>
> + default: break;<br>
> + case X86::CMP64ri32:<br>
> + case X86::CMP64ri8:<br>
> + case X86::CMP32ri:<br>
> + case X86::CMP32ri8:<br>
> + case X86::CMP16ri:<br>
> + case X86::CMP16ri8:<br>
> + case X86::CMP8ri:<br>
> + SrcReg = MI->getOperand(0).getReg();<br>
> + CmpMask = ~0;<br>
> + CmpValue = MI->getOperand(1).getImm();<br>
> + return true;<br>
> + case X86::CMP64rr:<br>
> + case X86::CMP32rr:<br>
> + case X86::CMP16rr:<br>
> + case X86::CMP8rr:<br>
> + SrcReg = MI->getOperand(0).getReg();<br>
> + CmpMask = ~0;<br>
> + CmpValue = 0;<br>
> + return true;<br>
> + }<br>
> +<br>
> + return false;<br>
> +}<br>
> +<br>
> +// This function updates condition code for SET Opcodes.<br>
> +// The input condition code can be E,NE,L,LE,G,GE,B,BE,A,AE and<br>
> +// the updated condition code will be E,NE,G,GE,L,LE,A,AE,B,BE.<br>
> +// This is to convert from a > b to b < a, a >= b to b <= a etc.<br>
> +static unsigned UpdateSETCondToOptimizeCmp(unsigned SETOpc) {<br></p><p>Update... -> update...<br>
<br>
><br>
> + switch (SETOpc) {<br>
> + default: return 0;<br>
> + case X86::SETEr: return X86::SETEr;<br>
> + case X86::SETEm: return X86::SETEm;<br>
> + case X86::SETNEr: return X86::SETNEr;<br>
> + case X86::SETNEm: return X86::SETNEm;<br>
> + case X86::SETLr: return X86::SETGr;<br>
> + case X86::SETLm: return X86::SETGm;<br>
> + case X86::SETLEr: return X86::SETGEr;<br>
> + case X86::SETLEm: return X86::SETGEm;<br>
> + case X86::SETGr: return X86::SETLr;<br>
> + case X86::SETGm: return X86::SETLm;<br>
> + case X86::SETGEr: return X86::SETLEr;<br>
> + case X86::SETGEm: return X86::SETLEm;<br>
> + case X86::SETBr: return X86::SETAr;<br>
> + case X86::SETBm: return X86::SETAm;<br>
> + case X86::SETBEr: return X86::SETAEr;<br>
> + case X86::SETBEm: return X86::SETAEm;<br>
> + case X86::SETAr: return X86::SETBr;<br>
> + case X86::SETAm: return X86::SETBm;<br>
> + case X86::SETAEr: return X86::SETBEr;<br>
> + case X86::SETAEm: return X86::SETBEm;<br>
> + }<br>
> +}<br>
> +<br>
> +// This function updates condition code for Branch Opcodes.<br>
> +// The input condition code can be E,NE,L,LE,G,GE,B,BE,A,AE and<br>
> +// the updated condition code will be E,NE,G,GE,L,LE,A,AE,B,BE.<br>
> +// This is to convert from a > b to b < a, a >= b to b <= a etc.<br>
> +static unsigned UpdateBranchCondToOptimizeCmp(unsigned BranchOpc) {<br></p><p>Update -> update<br>
<br>
><br>
> + switch (BranchOpc) {<br>
> + default: return 0;<br>
> + case X86::JE_4: return X86::JE_4;<br>
> + case X86::JNE_4: return X86::JNE_4;<br>
> + case X86::JL_4: return X86::JG_4;<br>
> + case X86::JLE_4: return X86::JGE_4;<br>
> + case X86::JG_4: return X86::JL_4;<br>
> + case X86::JGE_4: return X86::JLE_4;<br>
> + case X86::JB_4: return X86::JA_4;<br>
> + case X86::JBE_4: return X86::JAE_4;<br>
> + case X86::JA_4: return X86::JB_4;<br>
> + case X86::JAE_4: return X86::JBE_4;<br>
> + }<br>
> +}<br>
> +<br>
> +// This function updates condition code for CMOV Opcodes.<br>
> +// The input condition code can be E,NE,L,LE,G,GE,B,BE,A,AE and<br>
> +// the updated condition code will be E,NE,G,GE,L,LE,A,AE,B,BE.<br>
> +// This is to convert from a > b to b < a, a >= b to b <= a etc.<br>
> +static unsigned UpdateCMovCondToOptimizeCmp(unsigned CMovOpc) {<br>
> + switch (CMovOpc) {<br>
> + default: return 0;<br>
> + case X86::CMOVE16rm: return X86::CMOVE16rm;<br>
> + case X86::CMOVE16rr: return X86::CMOVE16rr;<br>
> + case X86::CMOVE32rm: return X86::CMOVE32rm;<br>
> + case X86::CMOVE32rr: return X86::CMOVE32rr;<br>
> + case X86::CMOVE64rm: return X86::CMOVE64rm;<br>
> + case X86::CMOVE64rr: return X86::CMOVE64rr;<br>
> + case X86::CMOVNE16rm: return X86::CMOVNE16rm;<br>
> + case X86::CMOVNE16rr: return X86::CMOVNE16rr;<br>
> + case X86::CMOVNE32rm: return X86::CMOVNE32rm;<br>
> + case X86::CMOVNE32rr: return X86::CMOVNE32rr;<br>
> + case X86::CMOVNE64rm: return X86::CMOVNE64rm;<br>
> + case X86::CMOVNE64rr: return X86::CMOVNE64rr;<br>
> +<br>
> + case X86::CMOVL16rm: return X86::CMOVG16rm;<br>
> + case X86::CMOVL16rr: return X86::CMOVG16rr;<br>
> + case X86::CMOVL32rm: return X86::CMOVG32rm;<br>
> + case X86::CMOVL32rr: return X86::CMOVG32rr;<br>
> + case X86::CMOVL64rm: return X86::CMOVG64rm;<br>
> + case X86::CMOVL64rr: return X86::CMOVG64rr;<br>
> + case X86::CMOVLE16rm: return X86::CMOVGE16rm;<br>
> + case X86::CMOVLE16rr: return X86::CMOVGE16rr;<br>
> + case X86::CMOVLE32rm: return X86::CMOVGE32rm;<br>
> + case X86::CMOVLE32rr: return X86::CMOVGE32rr;<br>
> + case X86::CMOVLE64rm: return X86::CMOVGE64rm;<br>
> + case X86::CMOVLE64rr: return X86::CMOVGE64rr;<br>
> +<br>
> + case X86::CMOVG16rm: return X86::CMOVL16rm;<br>
> + case X86::CMOVG16rr: return X86::CMOVL16rr;<br>
> + case X86::CMOVG32rm: return X86::CMOVL32rm;<br>
> + case X86::CMOVG32rr: return X86::CMOVL32rr;<br>
> + case X86::CMOVG64rm: return X86::CMOVL64rm;<br>
> + case X86::CMOVG64rr: return X86::CMOVL64rr;<br>
> + case X86::CMOVGE16rm: return X86::CMOVLE16rm;<br>
> + case X86::CMOVGE16rr: return X86::CMOVLE16rr;<br>
> + case X86::CMOVGE32rm: return X86::CMOVLE32rm;<br>
> + case X86::CMOVGE32rr: return X86::CMOVLE32rr;<br>
> + case X86::CMOVGE64rm: return X86::CMOVLE64rm;<br>
> + case X86::CMOVGE64rr: return X86::CMOVLE64rr;<br>
> +<br>
> + case X86::CMOVB16rm: return X86::CMOVA16rm;<br>
> + case X86::CMOVB16rr: return X86::CMOVA16rr;<br>
> + case X86::CMOVB32rm: return X86::CMOVA32rm;<br>
> + case X86::CMOVB32rr: return X86::CMOVA32rr;<br>
> + case X86::CMOVB64rm: return X86::CMOVA64rm;<br>
> + case X86::CMOVB64rr: return X86::CMOVA64rr;<br>
> + case X86::CMOVBE16rm: return X86::CMOVAE16rm;<br>
> + case X86::CMOVBE16rr: return X86::CMOVAE16rr;<br>
> + case X86::CMOVBE32rm: return X86::CMOVAE32rm;<br>
> + case X86::CMOVBE32rr: return X86::CMOVAE32rr;<br>
> + case X86::CMOVBE64rm: return X86::CMOVAE64rm;<br>
> + case X86::CMOVBE64rr: return X86::CMOVAE64rr;<br>
> +<br>
> + case X86::CMOVA16rm: return X86::CMOVB16rm;<br>
> + case X86::CMOVA16rr: return X86::CMOVB16rr;<br>
> + case X86::CMOVA32rm: return X86::CMOVB32rm;<br>
> + case X86::CMOVA32rr: return X86::CMOVB32rr;<br>
> + case X86::CMOVA64rm: return X86::CMOVB64rm;<br>
> + case X86::CMOVA64rr: return X86::CMOVB64rr;<br>
> + case X86::CMOVAE16rm: return X86::CMOVBE16rm;<br>
> + case X86::CMOVAE16rr: return X86::CMOVBE16rr;<br>
> + case X86::CMOVAE32rm: return X86::CMOVBE32rm;<br>
> + case X86::CMOVAE32rr: return X86::CMOVBE32rr;<br>
> + case X86::CMOVAE64rm: return X86::CMOVBE64rm;<br>
> + case X86::CMOVAE64rr: return X86::CMOVBE64rr;<br>
> + }<br>
> +}<br>
> +<br>
> +bool X86InstrInfo::<br>
> +OptimizeCompareInstr(MachineInstr *CmpInstr, unsigned SrcReg, int CmpMask,<br></p><p>Optimize -> optimize<br>
<br>
><br>
> + int CmpValue, const MachineRegisterInfo *MRI) const {<br>
> + MachineRegisterInfo::def_iterator DI = MRI->def_begin(SrcReg);<br>
> + if (llvm::next(DI) != MRI->def_end())<br>
> + // Only support one definition.<br>
> + return false;<br>
> +<br>
> + MachineInstr *MI = &*DI;<br>
> + // Get ready to iterate backward from CmpInstr.<br>
> + MachineBasicBlock::iterator I = CmpInstr, E = MI,<br>
> + B = CmpInstr->getParent()->begin();<br></p><p>Reverse iterators please. Iterating backwards is too hard to follow otherwise. Add the iterators if they're missing.<br>
<br>
><br>
> +<br>
> + // Early exit if CmpInstr is at the beginning of the BB.<br>
> + if (I == B) return false;<br></p><p>Set up your loop variables below, where you're looping. You shouldn't need to do early exits etc, the loop simply won't iterate...<br>
<br>
><br>
> +<br>
> + // For CMPrr(r1,r2), we are looking for SUB(r1,r2) or SUB(r2,r1).<br>
> + // For CMPri(r1, CmpValue), we are looking for SUBri(r1, CmpValue).<br>
> + MachineInstr *Sub = NULL;<br>
> + unsigned SrcReg2 = 0;<br>
> + if (CmpInstr->getOpcode() == X86::CMP64rr ||<br>
> + CmpInstr->getOpcode() == X86::CMP32rr ||<br>
> + CmpInstr->getOpcode() == X86::CMP16rr ||<br>
> + CmpInstr->getOpcode() == X86::CMP8rr) {<br>
> + SrcReg2 = CmpInstr->getOperand(1).getReg();<br>
> + }<br>
> +<br>
> + // Search backwards for a SUB instruction.<br>
> + // If EFLAGS is updated in the middle, we can not remove the CMP instruction.<br>
> + --I;<br>
> + for (; I != E; --I) {<br>
> + const MachineInstr &Instr = *I;<br>
> +<br>
> + // Check whether the current instruction is SUB(r1, r2) or SUB(r2, r1).<br>
> + if (((CmpInstr->getOpcode() == X86::CMP64rr &&<br>
> + Instr.getOpcode() == X86::SUB64rr) ||<br>
> + (CmpInstr->getOpcode() == X86::CMP32rr &&<br>
> + Instr.getOpcode() == X86::SUB32rr)||<br>
> + (CmpInstr->getOpcode() == X86::CMP16rr &&<br>
> + Instr.getOpcode() == X86::SUB16rr)||<br>
> + (CmpInstr->getOpcode() == X86::CMP8rr &&<br>
> + Instr.getOpcode() == X86::SUB8rr)) &&<br>
> + ((Instr.getOperand(1).getReg() == SrcReg &&<br>
> + Instr.getOperand(2).getReg() == SrcReg2) ||<br>
> + (Instr.getOperand(1).getReg() == SrcReg2 &&<br>
> + Instr.getOperand(2).getReg() == SrcReg))) {<br>
> + Sub = &*I;<br>
> + break;<br>
> + }<br>
> +<br>
> + // Check whether the current instruction is SUBri(r1, CmpValue).<br>
> + if (((CmpInstr->getOpcode() == X86::CMP64ri32 &&<br>
> + Instr.getOpcode() == X86::SUB64ri32) ||<br>
> + (CmpInstr->getOpcode() == X86::CMP64ri8 &&<br>
> + Instr.getOpcode() == X86::SUB64ri8) ||<br>
> + (CmpInstr->getOpcode() == X86::CMP32ri &&<br>
> + Instr.getOpcode() == X86::SUB32ri) ||<br>
> + (CmpInstr->getOpcode() == X86::CMP32ri8 &&<br>
> + Instr.getOpcode() == X86::SUB32ri8) ||<br>
> + (CmpInstr->getOpcode() == X86::CMP16ri &&<br>
> + Instr.getOpcode() == X86::SUB16ri) ||<br>
> + (CmpInstr->getOpcode() == X86::CMP16ri8 &&<br>
> + Instr.getOpcode() == X86::SUB16ri8) ||<br>
> + (CmpInstr->getOpcode() == X86::CMP8ri &&<br>
> + Instr.getOpcode() == X86::SUB8ri)) &&<br>
> + CmpValue != 0 &&<br>
> + Instr.getOperand(1).getReg() == SrcReg &&<br>
> + Instr.getOperand(2).getImm() == CmpValue) {<br>
> + Sub = &*I;<br>
> + break;<br>
> + }<br>
> +<br>
> + for (unsigned IO = 0, EO = Instr.getNumOperands(); IO != EO; ++IO) {<br>
> + const MachineOperand &MO = Instr.getOperand(IO);<br>
> + if (MO.isRegMask() && MO.clobbersPhysReg(X86::EFLAGS))<br>
> + return false;<br>
> + if (!MO.isReg()) continue;<br>
> +<br>
> + // This instruction modifies or uses EFLAGS before we find a SUB.<br>
> + // We can't do this transformation.<br>
> + if (MO.getReg() == X86::EFLAGS)<br>
> + return false;<br>
> + }<br></p><p>Please hoist this loop into a helper function. Ideally on MachineInstr so others can use it. It seems generically useful.<br>
<br>
><br>
> +<br>
> + if (I == B)<br>
> + // Reaching beginning of the block.<br>
> + return false;<br>
> + }<br>
> +<br>
> + // Return false if no candidates exist.<br>
> + if (!Sub)<br>
> + return false;<br>
> + MI = Sub;<br>
> +<br>
> + switch (MI->getOpcode()) {<br>
> + default: break;<br>
> + case X86::SUB64rr:<br>
> + case X86::SUB32rr:<br>
> + case X86::SUB16rr:<br>
> + case X86::SUB8rr:<br>
> + case X86::SUB64ri32:<br>
> + case X86::SUB64ri8:<br>
> + case X86::SUB32ri:<br>
> + case X86::SUB32ri8:<br>
> + case X86::SUB16ri:<br>
> + case X86::SUB16ri8:<br>
> + case X86::SUB8ri: {<br>
> + // Scan forward from CmpInstr for the use of EFLAGS.<br>
> + // Handle the condition codes GE, L, G, LE, B, L, BE, LE.<br>
> + SmallVector<std::pair<MachineInstr*, unsigned /*NewOpc*/>, 4> OpsToUpdate;<br>
> + bool isSafe = false;</p><p>isSafe -> IsSafe.</p><p>
> + I = CmpInstr;<br>
> + E = CmpInstr->getParent()->end();<br>
> + while (!isSafe && ++I != E) {</p><p>Don't re-use loop variables this way. It's much better to declare them in the for (...) loop they are used for, and declare new ones in each loop.</p><p>
> + const MachineInstr &Instr = *I;<br>
> + for (unsigned IO = 0, EO = Instr.getNumOperands();<br>
> + !isSafe && IO != EO; ++IO) {<br>
> + const MachineOperand &MO = Instr.getOperand(IO);<br>
> + if (MO.isRegMask() && MO.clobbersPhysReg(X86::EFLAGS)) {<br>
> + isSafe = true;<br>
> + break;</p><p>This whole thing would be a lot more clear if you hoisted this into a helper function and used early exit.</p><p>
> + }<br>
> + if (!MO.isReg() || MO.getReg() != X86::EFLAGS)<br>
> + continue;<br>
> + // EFLAGS is redefined by this instruction.<br>
> + if (MO.isDef()) {<br>
> + isSafe = true;<br>
> + break;<br>
> + }<br>
> + // EFLAGS is used by this instruction.<br>
> +<br>
> + // If we have SUB(r1, r2) and CMP(r2, r1), the condition code needs<br>
> + // to be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 > r2, etc.<br>
> + unsigned NewOpc = UpdateBranchCondToOptimizeCmp(Instr.getOpcode());<br>
> + if (!NewOpc) NewOpc = UpdateSETCondToOptimizeCmp(Instr.getOpcode());<br>
> + if (!NewOpc) NewOpc = UpdateCMovCondToOptimizeCmp(Instr.getOpcode());<br>
> + if (!NewOpc) return false;<br>
> +<br>
> + // Push the MachineInstr to OpsToUpdate.<br>
> + // If it is safe to remove CmpInstr, the condition code of these<br>
> + // instructions will be modified.<br>
> + if (SrcReg2 != 0 && Sub->getOperand(1).getReg() == SrcReg2 &&<br>
> + Sub->getOperand(2).getReg() == SrcReg)<br>
> + OpsToUpdate.push_back(std::make_pair(&*I, NewOpc));<br>
> + }<br>
> + }<br>
> +<br>
> + // We may exit the loop at end of the basic block.<br>
> + // In that case, it is still safe to remove CmpInstr.<br>
> +<br>
> + // Make sure Sub instruction defines EFLAGS.<br>
> + assert(MI->getOperand(3).getReg() == X86::EFLAGS &&<br>
> + "Expected EFLAGS is the 4th operand of SUBrr or SUBri.");<br>
> + MI->getOperand(3).setIsDef(true);<br>
> + CmpInstr->eraseFromParent();<br>
> +<br>
> + // Modify the condition code of instructions in OpsToUpdate.<br>
> + for (unsigned i = 0; i < OpsToUpdate.size(); i++)</p><p>Always cache the end value in a loop variable.</p><p><br>
> + OpsToUpdate[i].first->setDesc(get(OpsToUpdate[i].second));<br>
> + NumCmpsRemoved++;<br>
> + return true;<br>
> + }<br>
> + }<br>
> +<br>
> + return false;<br>
> +}<br>
> +<br>
> +bool X86InstrInfo::<br>
> OptimizeSubInstr(MachineInstr *SubInstr, const MachineRegisterInfo *MRI) const {<br>
> // If destination is a memory operand, do not perform this optimization.<br>
> if ((SubInstr->getOpcode() != X86::SUB64rr) &&<br>
><br>
> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.h<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.h?rev=157831&r1=157830&r2=157831&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.h?rev=157831&r1=157830&r2=157831&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/lib/Target/X86/X86InstrInfo.h (original)<br>
> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.h Fri Jun 1 14:49:33 2012<br>
> @@ -367,6 +367,12 @@<br>
> virtual bool OptimizeSubInstr(MachineInstr *SubInstr,<br>
> const MachineRegisterInfo *MRI) const;<br>
><br>
> + virtual bool AnalyzeCompare(const MachineInstr *MI, unsigned &SrcReg,<br>
> + int &CmpMask, int &CmpValue) const;<br>
> + virtual bool OptimizeCompareInstr(MachineInstr *CmpInstr, unsigned SrcReg,<br>
> + int CmpMask, int CmpValue,<br>
> + const MachineRegisterInfo *MRI) const;<br>
> +<br>
> private:<br>
> MachineInstr * convertToThreeAddressWithLEA(unsigned MIOpc,<br>
> MachineFunction::iterator &MFI,<br>
><br>
> Modified: llvm/trunk/test/CodeGen/X86/jump_sign.ll<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/jump_sign.ll?rev=157831&r1=157830&r2=157831&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/jump_sign.ll?rev=157831&r1=157830&r2=157831&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/test/CodeGen/X86/jump_sign.ll (original)<br>
> +++ llvm/trunk/test/CodeGen/X86/jump_sign.ll Fri Jun 1 14:49:33 2012<br>
> @@ -83,6 +83,25 @@<br>
> %cond = select i1 %cmp, i32 %sub, i32 0<br>
> ret i32 %cond<br>
> }<br>
> +; redundant cmp instruction<br>
> +define i32 @l(i32 %a, i32 %b) nounwind {<br>
> +entry:<br>
> +; CHECK: l:<br>
> +; CHECK-NOT: cmp<br>
> + %cmp = icmp slt i32 %b, %a<br>
> + %sub = sub nsw i32 %a, %b<br>
> + %cond = select i1 %cmp, i32 %sub, i32 %a<br>
> + ret i32 %cond<br>
> +}<br>
> +define i32 @m(i32 %a, i32 %b) nounwind {<br>
> +entry:<br>
> +; CHECK: m:<br>
> +; CHECK-NOT: cmp<br>
> + %cmp = icmp sgt i32 %a, %b<br>
> + %sub = sub nsw i32 %a, %b<br>
> + %cond = select i1 %cmp, i32 %b, i32 %sub<br>
> + ret i32 %cond<br>
> +}<br>
> ; <a href="rdar://11540023">rdar://11540023</a><br>
> define i64 @n(i64 %x, i64 %y) nounwind {<br>
> entry:<br>
><br>
><br>
> _______________________________________________<br>
> llvm-commits mailing list<br>
> <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br></p>
</blockquote></div><br></div></div>_______________________________________________<br>llvm-commits mailing list<br><a href="mailto:llvm-commits@cs.uiuc.edu">llvm-commits@cs.uiuc.edu</a><br>http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits<br></blockquote></div><br></div></body></html>