[llvm-commits] [llvm] r160454 - in /llvm/trunk: lib/Target/X86/X86InstrArithmetic.td lib/Target/X86/X86InstrInfo.cpp test/CodeGen/X86/jump_sign.ll
Jan Voung
jvoung at google.com
Mon Sep 17 11:43:28 PDT 2012
Without the setIsDead(false), I get failures on:
* the CodeGen/X86/crash.ll test (unmodified)
* the CodeGen/X86/jump_sign.ll test (modified to have -verify-machineinstrs)
Here is the log from jump_sign.ll:
=======================
********************
FAIL: LLVM :: CodeGen/X86/jump_sign.ll (2769 of 6471)
******************** TEST 'LLVM :: CodeGen/X86/jump_sign.ll' FAILED
********************
Script:
--
/build/Release+Asserts/bin/llc < /src/test/CodeGen/X86/jump_sign.ll
-march=x86 -mcpu=pentiumpro -verify-machineinstrs |
/build/Release+Asserts/bin/FileCheck /src/llvm/test/CodeGen/X86/jump_sign.ll
--
Exit Code: 1
Command Output (stderr):
--
# After codegen peephole optimization pass
# Machine code for function h: SSA
Frame Objects:
fi#-2: size=4, align=4, fixed, at location [SP+8]
fi#-1: size=4, align=16, fixed, at location [SP+4]
Function Live Outs: %EAX
BB#0: derived from LLVM BB %entry
%vreg0<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack-2]
GR32:%vreg0
%vreg1<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-1](align=16) GR32:%vreg1
%vreg3<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg3
%vreg2<def,tied1> = SUB32rr %vreg1<tied0>, %vreg0, %EFLAGS<imp-def,dead>;
GR32:%vreg2,%vreg1,%vreg0
%vreg5<def,tied1> = CMOVG32rr %vreg3<tied0>, %vreg2<kill>,
%EFLAGS<imp-use>; GR32:%vreg5,%vreg3,%vreg2
%EAX<def> = COPY %vreg5; GR32:%vreg5
RET
# End machine code for function h.
*** Bad machine code: Using an undefined physical register ***
- function: h
- basic block: BB#0 entry (0x3194ab0)
- instruction: %vreg5<def,tied1> = CMOVG32rr %vreg3<tied0>, %vreg2<kill>,
%EFLAGS<imp-use>; GR32:%vreg5,%vreg3,%vreg2
- operand 3: %EFLAGS<imp-use>
LLVM ERROR: Found 1 machine code errors.
FileCheck error: '-' is empty.
=======================
"-debug" shows this bit of text before the error:
# Machine code for function h: SSA
Frame Objects:
fi#-2: size=4, align=4, fixed, at location [SP+8]
fi#-1: size=4, align=16, fixed, at location [SP+4]
Function Live Outs: %EAX
0B BB#0: derived from LLVM BB %entry
16B %vreg0<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-2] GR32:%vreg0
32B %vreg1<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-1](align=16) GR32:%vreg1
48B %vreg2<def,tied1> = SUB32rr %vreg1<tied0>, %vreg0,
%EFLAGS<imp-def,dead>; GR32:%vreg2,%vreg1,%vreg0
64B %vreg3<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg3
80B %vreg4<def,tied1> = SUB32rr %vreg0<tied0>, %vreg1,
%EFLAGS<imp-def>; GR32:%vreg4,%vreg0,%vreg1
96B %vreg5<def,tied1> = CMOVL32rr %vreg3<tied0>, %vreg2<kill>,
%EFLAGS<imp-use>; GR32:%vreg5,%vreg3,%vreg2
112B %EAX<def> = COPY %vreg5; GR32:%vreg5
128B RET
# End machine code for function h.
=======================
So before the pass runs, there was another SUB32rr which redefined EFLAGS
and that is the def that is live for the CMOV. I think once you remove
that redundant definition in favor of the earlier one, you have to make the
earlier one live?
- Jan
On Mon, Sep 17, 2012 at 11:21 AM, Manman Ren <mren at apple.com> wrote:
>
> Hi Jan,
>
> The patch looks good except it is weird that you have to set IsDead to
> false right after setting IsDef to true.
> Could you look deeper why that is needed? Can you dump the related info?
>
> Thanks,
> Manman
>
> On Sep 16, 2012, at 4:06 PM, Jan Voung <jvoung at google.com> wrote:
>
> Thanks, here's a patch that adds that.
>
> One of the "make check" tests that used "-verify-machineinstrs" failed,
> showing that the def of EFLAGS from a DEC was dead before the new use. I
> ended up adding a setIsDead(false) right after the setIsDef(true). Does
> that seem alright?
>
> -Jan
>
>
> On Fri, Sep 14, 2012 at 12:55 PM, Manman Ren <mren at apple.com> wrote:
>
>>
>> On Sep 14, 2012, at 11:27 AM, Jan Voung <jvoung at google.com> wrote:
>>
>>
>> On Wed, Jul 18, 2012 at 2:40 PM, Manman Ren <mren at apple.com> wrote:
>>
>>> Author: mren
>>> Date: Wed Jul 18 16:40:01 2012
>>> New Revision: 160454
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=160454&view=rev
>>> Log:
>>> X86: remove redundant cmp against zero.
>>>
>>> Updated OptimizeCompare in peephole to remove redundant cmp against zero.
>>> We only remove Compare if CF and OF are not used.
>>>
>>> rdar://11855129
>>>
>>> Modified:
>>> llvm/trunk/lib/Target/X86/X86InstrArithmetic.td
>>> llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>>> llvm/trunk/test/CodeGen/X86/jump_sign.ll
>>>
>>> Modified: llvm/trunk/lib/Target/X86/X86InstrArithmetic.td
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrArithmetic.td?rev=160454&r1=160453&r2=160454&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Target/X86/X86InstrArithmetic.td (original)
>>> +++ llvm/trunk/lib/Target/X86/X86InstrArithmetic.td Wed Jul 18 16:40:01
>>> 2012
>>> @@ -1156,7 +1156,7 @@
>>> def X86testpat : PatFrag<(ops node:$lhs, node:$rhs),
>>> (X86cmp (and_su node:$lhs, node:$rhs), 0)>;
>>>
>>> -let Defs = [EFLAGS] in {
>>> +let isCompare = 1, Defs = [EFLAGS] in {
>>> let isCommutable = 1 in {
>>> def TEST8rr : BinOpRR_F<0x84, "test", Xi8 , X86testpat, MRMSrcReg>;
>>> def TEST16rr : BinOpRR_F<0x84, "test", Xi16, X86testpat, MRMSrcReg>;
>>>
>>> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=160454&r1=160453&r2=160454&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)
>>> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Wed Jul 18 16:40:01 2012
>>> @@ -3046,6 +3046,17 @@
>>> CmpMask = ~0;
>>> CmpValue = 0;
>>> return true;
>>> + case X86::TEST8rr:
>>> + case X86::TEST16rr:
>>> + case X86::TEST32rr:
>>> + case X86::TEST64rr:
>>> + SrcReg = MI->getOperand(0).getReg();
>>> + if (MI->getOperand(1).getReg() != SrcReg) return false;
>>> + // Compare against zero.
>>> + SrcReg2 = 0;
>>> + CmpMask = ~0;
>>> + CmpValue = 0;
>>> + return true;
>>> }
>>> return false;
>>> }
>>> @@ -3093,6 +3104,40 @@
>>> return false;
>>> }
>>>
>>> +/// isDefConvertible - check whether the definition can be converted
>>> +/// to remove a comparison against zero.
>>> +inline static bool isDefConvertible(MachineInstr *MI) {
>>> + switch (MI->getOpcode()) {
>>> + default: return false;
>>> + case X86::SUB64ri32: case X86::SUB64ri8: case X86::SUB32ri:
>>> + case X86::SUB32ri8: case X86::SUB16ri: case X86::SUB16ri8:
>>> + case X86::SUB8ri: case X86::SUB64rr: case X86::SUB32rr:
>>> + case X86::SUB16rr: case X86::SUB8rr: case X86::SUB64rm:
>>> + case X86::SUB32rm: case X86::SUB16rm: case X86::SUB8rm:
>>> + case X86::ADD64ri32: case X86::ADD64ri8: case X86::ADD32ri:
>>> + case X86::ADD32ri8: case X86::ADD16ri: case X86::ADD16ri8:
>>> + case X86::ADD8ri: case X86::ADD64rr: case X86::ADD32rr:
>>> + case X86::ADD16rr: case X86::ADD8rr: case X86::ADD64rm:
>>> + case X86::ADD32rm: case X86::ADD16rm: case X86::ADD8rm:
>>> + case X86::AND64ri32: case X86::AND64ri8: case X86::AND32ri:
>>> + case X86::AND32ri8: case X86::AND16ri: case X86::AND16ri8:
>>> + case X86::AND8ri: case X86::AND64rr: case X86::AND32rr:
>>> + case X86::AND16rr: case X86::AND8rr: case X86::AND64rm:
>>> + case X86::AND32rm: case X86::AND16rm: case X86::AND8rm:
>>> + case X86::XOR64ri32: case X86::XOR64ri8: case X86::XOR32ri:
>>> + case X86::XOR32ri8: case X86::XOR16ri: case X86::XOR16ri8:
>>> + case X86::XOR8ri: case X86::XOR64rr: case X86::XOR32rr:
>>> + case X86::XOR16rr: case X86::XOR8rr: case X86::XOR64rm:
>>> + case X86::XOR32rm: case X86::XOR16rm: case X86::XOR8rm:
>>> + case X86::OR64ri32: case X86::OR64ri8: case X86::OR32ri:
>>> + case X86::OR32ri8: case X86::OR16ri: case X86::OR16ri8:
>>> + case X86::OR8ri: case X86::OR64rr: case X86::OR32rr:
>>> + case X86::OR16rr: case X86::OR8rr: case X86::OR64rm:
>>> + case X86::OR32rm: case X86::OR16rm: case X86::OR8rm:
>>>
>>
>> Hi Manman,
>>
>> Does it make sense to add the DEC/INC variants to isDefConvertible too?
>> Or, is that not safe? I just noticed that if tests weren't optimized
>> earlier
>> (e.g., in X86ISelLowering::EmitTest()), then you could often end up with
>> DEC/INCs at this point.
>>
>> I think it makes sense and is safe to add "INC/DEC" here.
>>
>> Thanks,
>> Manman
>>
>>
>> - Jan
>>
>>
>>> + return true;
>>> + }
>>> +}
>>> +
>>> /// optimizeCompareInstr - Check if there exists an earlier instruction
>>> that
>>> /// operates on the same source operands and sets flags in the same way
>>> as
>>> /// Compare; remove Compare if possible.
>>> @@ -3107,6 +3152,13 @@
>>> // CmpInstr is the first instruction of the BB.
>>> MachineBasicBlock::iterator I = CmpInstr, Def = MI;
>>>
>>> + // If we are comparing against zero, check whether we can use MI to
>>> update
>>> + // EFLAGS. If MI is not in the same BB as CmpInstr, do not optimize.
>>> + bool IsCmpZero = (SrcReg2 == 0 && CmpValue == 0);
>>> + if (IsCmpZero && (MI->getParent() != CmpInstr->getParent() ||
>>> + !isDefConvertible(MI)))
>>> + return false;
>>> +
>>> // We are searching for an earlier instruction that can make CmpInstr
>>> // redundant and that instruction will be saved in Sub.
>>> MachineInstr *Sub = NULL;
>>> @@ -3126,7 +3178,8 @@
>>> for (; RI != RE; ++RI) {
>>> MachineInstr *Instr = &*RI;
>>> // Check whether CmpInstr can be made redundant by the current
>>> instruction.
>>> - if (isRedundantFlagInstr(CmpInstr, SrcReg, SrcReg2, CmpValue,
>>> Instr)) {
>>> + if (!IsCmpZero &&
>>> + isRedundantFlagInstr(CmpInstr, SrcReg, SrcReg2, CmpValue,
>>> Instr)) {
>>> Sub = Instr;
>>> break;
>>> }
>>> @@ -3153,7 +3206,7 @@
>>> }
>>>
>>> // Return false if no candidates exist.
>>> - if (!Sub)
>>> + if (!IsCmpZero && !Sub)
>>> return false;
>>>
>>> bool IsSwapped = (SrcReg2 != 0 && Sub->getOperand(1).getReg() ==
>>> SrcReg2 &&
>>> @@ -3177,13 +3230,10 @@
>>> continue;
>>>
>>> // EFLAGS is used by this instruction.
>>> - if (IsSwapped) {
>>> - // If we have SUB(r1, r2) and CMP(r2, r1), the condition code
>>> needs
>>> - // to be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 >
>>> r2, etc.
>>> - // We decode the condition code from opcode, swap the condition
>>> code,
>>> - // and synthesize the new opcode.
>>> - bool OpcIsSET = false;
>>> - X86::CondCode OldCC;
>>> + X86::CondCode OldCC;
>>> + bool OpcIsSET = false;
>>> + if (IsCmpZero || IsSwapped) {
>>> + // We decode the condition code from opcode.
>>> if (Instr.isBranch())
>>> OldCC = getCondFromBranchOpc(Instr.getOpcode());
>>> else {
>>> @@ -3194,6 +3244,22 @@
>>> OldCC = getCondFromCMovOpc(Instr.getOpcode());
>>> }
>>> if (OldCC == X86::COND_INVALID) return false;
>>> + }
>>> + if (IsCmpZero) {
>>> + switch (OldCC) {
>>> + default: break;
>>> + case X86::COND_A: case X86::COND_AE:
>>> + case X86::COND_B: case X86::COND_BE:
>>> + case X86::COND_G: case X86::COND_GE:
>>> + case X86::COND_L: case X86::COND_LE:
>>> + case X86::COND_O: case X86::COND_NO:
>>> + // CF and OF are used, we can't perform this optimization.
>>> + return false;
>>> + }
>>> + } else if (IsSwapped) {
>>> + // If we have SUB(r1, r2) and CMP(r2, r1), the condition code
>>> needs
>>> + // to be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 >
>>> r2, etc.
>>> + // We swap the condition code and synthesize the new opcode.
>>> X86::CondCode NewCC = getSwappedCondition(OldCC);
>>> if (NewCC == X86::COND_INVALID) return false;
>>>
>>> @@ -3223,7 +3289,7 @@
>>>
>>> // If EFLAGS is not killed nor re-defined, we should check whether it
>>> is
>>> // live-out. If it is live-out, do not optimize.
>>> - if (IsSwapped && !IsSafe) {
>>> + if ((IsCmpZero || IsSwapped) && !IsSafe) {
>>> MachineBasicBlock *MBB = CmpInstr->getParent();
>>> for (MachineBasicBlock::succ_iterator SI = MBB->succ_begin(),
>>> SE = MBB->succ_end(); SI != SE; ++SI)
>>> @@ -3231,6 +3297,8 @@
>>> return false;
>>> }
>>>
>>> + // The instruction to be updated is either Sub or MI.
>>> + Sub = IsCmpZero ? MI : Sub;
>>> // Move Movr0Inst to the place right before Sub.
>>> if (Movr0Inst) {
>>> Sub->getParent()->remove(Movr0Inst);
>>> @@ -3238,10 +3306,11 @@
>>> }
>>>
>>> // Make sure Sub instruction defines EFLAGS.
>>> - assert(Sub->getNumOperands() >= 4 && Sub->getOperand(3).isReg() &&
>>> - Sub->getOperand(3).getReg() == X86::EFLAGS &&
>>> - "EFLAGS should be the 4th operand of SUBrr or SUBri.");
>>> - Sub->getOperand(3).setIsDef(true);
>>> + assert(Sub->getNumOperands() >= 2 &&
>>> + Sub->getOperand(Sub->getNumOperands()-1).isReg() &&
>>> + Sub->getOperand(Sub->getNumOperands()-1).getReg() ==
>>> X86::EFLAGS &&
>>> + "EFLAGS should be the last operand of SUB, ADD, OR, XOR, AND");
>>> + Sub->getOperand(Sub->getNumOperands()-1).setIsDef(true);
>>> CmpInstr->eraseFromParent();
>>>
>>> // Modify the condition code of instructions in OpsToUpdate.
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/jump_sign.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/jump_sign.ll?rev=160454&r1=160453&r2=160454&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/jump_sign.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/jump_sign.ll Wed Jul 18 16:40:01 2012
>>> @@ -202,3 +202,14 @@
>>> if.else.i104: ; preds = %if.then44
>>> ret void
>>> }
>>> +; rdar://11855129
>>> +define i32 @p(i32 %a, i32 %b) nounwind {
>>> +entry:
>>> +; CHECK: p:
>>> +; CHECK-NOT: test
>>> +; CHECK: cmovs
>>> + %add = add nsw i32 %b, %a
>>> + %cmp = icmp sgt i32 %add, 0
>>> + %add. = select i1 %cmp, i32 %add, i32 0
>>> + ret i32 %add.
>>> +}
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>>
>>
> <cmp_dec_inc.patch>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120917/6f4021fe/attachment.html>
More information about the llvm-commits
mailing list