[llvm-commits] [llvm] r160454 - in /llvm/trunk: lib/Target/X86/X86InstrArithmetic.td lib/Target/X86/X86InstrInfo.cpp test/CodeGen/X86/jump_sign.ll

Manman Ren mren at apple.com
Mon Sep 17 12:18:13 PDT 2012


On Sep 17, 2012, at 11:43 AM, Jan Voung <jvoung at google.com> wrote:

> Without the setIsDead(false), I get failures on:
> 
> * the CodeGen/X86/crash.ll test (unmodified)
> * the CodeGen/X86/jump_sign.ll test (modified to have -verify-machineinstrs)
> 
> 
> 
> 
> Here is the log from jump_sign.ll:
> 
> =======================
> 
> ********************
> FAIL: LLVM :: CodeGen/X86/jump_sign.ll (2769 of 6471)
> ******************** TEST 'LLVM :: CodeGen/X86/jump_sign.ll' FAILED ********************
> Script:
> --
> /build/Release+Asserts/bin/llc < /src/test/CodeGen/X86/jump_sign.ll -march=x86 -mcpu=pentiumpro -verify-machineinstrs | /build/Release+Asserts/bin/FileCheck /src/llvm/test/CodeGen/X86/jump_sign.ll
> --
> Exit Code: 1
> Command Output (stderr):
> --
> 
> # After codegen peephole optimization pass
> # Machine code for function h: SSA
> Frame Objects:
>   fi#-2: size=4, align=4, fixed, at location [SP+8]
>   fi#-1: size=4, align=16, fixed, at location [SP+4]
> Function Live Outs: %EAX
> 
> BB#0: derived from LLVM BB %entry
> 	%vreg0<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack-2] GR32:%vreg0
> 	%vreg1<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack-1](align=16) GR32:%vreg1
> 	%vreg3<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg3
> 	%vreg2<def,tied1> = SUB32rr %vreg1<tied0>, %vreg0, %EFLAGS<imp-def,dead>; GR32:%vreg2,%vreg1,%vreg0
> 	%vreg5<def,tied1> = CMOVG32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-use>; GR32:%vreg5,%vreg3,%vreg2
> 	%EAX<def> = COPY %vreg5; GR32:%vreg5
> 	RET
> 
> # End machine code for function h.
> 
> *** Bad machine code: Using an undefined physical register ***
> - function:    h
> - basic block: BB#0 entry (0x3194ab0)
> - instruction: %vreg5<def,tied1> = CMOVG32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-use>; GR32:%vreg5,%vreg3,%vreg2
> - operand 3:   %EFLAGS<imp-use>
> LLVM ERROR: Found 1 machine code errors.
> FileCheck error: '-' is empty.
> 
> 
> =======================
> 
> "-debug" shows this bit of text before the error:
> 
> # Machine code for function h: SSA
> Frame Objects:
>   fi#-2: size=4, align=4, fixed, at location [SP+8]
>   fi#-1: size=4, align=16, fixed, at location [SP+4]
> Function Live Outs: %EAX
> 
> 0B      BB#0: derived from LLVM BB %entry
> 16B             %vreg0<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack-2] GR32:%vreg0
> 32B             %vreg1<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack-1](align=16) GR32:%vreg1
> 48B             %vreg2<def,tied1> = SUB32rr %vreg1<tied0>, %vreg0, %EFLAGS<imp-def,dead>; GR32:%vreg2,%vreg1,%vreg0
> 64B             %vreg3<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg3
> 80B             %vreg4<def,tied1> = SUB32rr %vreg0<tied0>, %vreg1, %EFLAGS<imp-def>; GR32:%vreg4,%vreg0,%vreg1
> 96B             %vreg5<def,tied1> = CMOVL32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-use>; GR32:%vreg5,%vreg3,%vreg2
> 112B            %EAX<def> = COPY %vreg5; GR32:%vreg5
> 128B            RET
> 
> # End machine code for function h.
> 
> =======================
> 
> 
> So before the pass runs, there was another SUB32rr which redefined EFLAGS and that is the def that is live for the CMOV.  I think once you remove that redundant definition in favor of the earlier one, you have to make the earlier one live?
You are right.  Looks like setting IsDead to false is necessary.
Your patch looks good to me.

Thanks
Manman
> 
> - Jan
> 
> 
> 
> On Mon, Sep 17, 2012 at 11:21 AM, Manman Ren <mren at apple.com> wrote:
> 
> Hi Jan,
> 
> The patch looks good except it is weird that you have to set IsDead to false right after setting IsDef to true.
> Could you look deeper why that is needed? Can you dump the related info?
> 
> Thanks,
> Manman
> 
> On Sep 16, 2012, at 4:06 PM, Jan Voung <jvoung at google.com> wrote:
> 
>> Thanks, here's a patch that adds that.
>> 
>> One of the "make check" tests that used "-verify-machineinstrs" failed, showing that the def of EFLAGS from a DEC was dead before the new use.  I ended up adding a setIsDead(false) right after the setIsDef(true).  Does that seem alright?
>> 
>> -Jan
>> 
>> 
>> On Fri, Sep 14, 2012 at 12:55 PM, Manman Ren <mren at apple.com> wrote:
>> 
>> On Sep 14, 2012, at 11:27 AM, Jan Voung <jvoung at google.com> wrote:
>> 
>>> 
>>> On Wed, Jul 18, 2012 at 2:40 PM, Manman Ren <mren at apple.com> wrote:
>>> Author: mren
>>> Date: Wed Jul 18 16:40:01 2012
>>> New Revision: 160454
>>> 
>>> URL: http://llvm.org/viewvc/llvm-project?rev=160454&view=rev
>>> Log:
>>> X86: remove redundant cmp against zero.
>>> 
>>> Updated OptimizeCompare in peephole to remove redundant cmp against zero.
>>> We only remove Compare if CF and OF are not used.
>>> 
>>> rdar://11855129
>>> 
>>> Modified:
>>>     llvm/trunk/lib/Target/X86/X86InstrArithmetic.td
>>>     llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>>>     llvm/trunk/test/CodeGen/X86/jump_sign.ll
>>> 
>>> Modified: llvm/trunk/lib/Target/X86/X86InstrArithmetic.td
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrArithmetic.td?rev=160454&r1=160453&r2=160454&view=diff
>>> ==============================================================================
>>> --- llvm/trunk/lib/Target/X86/X86InstrArithmetic.td (original)
>>> +++ llvm/trunk/lib/Target/X86/X86InstrArithmetic.td Wed Jul 18 16:40:01 2012
>>> @@ -1156,7 +1156,7 @@
>>>  def X86testpat : PatFrag<(ops node:$lhs, node:$rhs),
>>>                           (X86cmp (and_su node:$lhs, node:$rhs), 0)>;
>>> 
>>> -let Defs = [EFLAGS] in {
>>> +let isCompare = 1, Defs = [EFLAGS] in {
>>>    let isCommutable = 1 in {
>>>      def TEST8rr  : BinOpRR_F<0x84, "test", Xi8 , X86testpat, MRMSrcReg>;
>>>      def TEST16rr : BinOpRR_F<0x84, "test", Xi16, X86testpat, MRMSrcReg>;
>>> 
>>> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=160454&r1=160453&r2=160454&view=diff
>>> ==============================================================================
>>> --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)
>>> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Wed Jul 18 16:40:01 2012
>>> @@ -3046,6 +3046,17 @@
>>>      CmpMask = ~0;
>>>      CmpValue = 0;
>>>      return true;
>>> +  case X86::TEST8rr:
>>> +  case X86::TEST16rr:
>>> +  case X86::TEST32rr:
>>> +  case X86::TEST64rr:
>>> +    SrcReg = MI->getOperand(0).getReg();
>>> +    if (MI->getOperand(1).getReg() != SrcReg) return false;
>>> +    // Compare against zero.
>>> +    SrcReg2 = 0;
>>> +    CmpMask = ~0;
>>> +    CmpValue = 0;
>>> +    return true;
>>>    }
>>>    return false;
>>>  }
>>> @@ -3093,6 +3104,40 @@
>>>    return false;
>>>  }
>>> 
>>> +/// isDefConvertible - check whether the definition can be converted
>>> +/// to remove a comparison against zero.
>>> +inline static bool isDefConvertible(MachineInstr *MI) {
>>> +  switch (MI->getOpcode()) {
>>> +  default: return false;
>>> +  case X86::SUB64ri32: case X86::SUB64ri8: case X86::SUB32ri:
>>> +  case X86::SUB32ri8:  case X86::SUB16ri:  case X86::SUB16ri8:
>>> +  case X86::SUB8ri:    case X86::SUB64rr:  case X86::SUB32rr:
>>> +  case X86::SUB16rr:   case X86::SUB8rr:   case X86::SUB64rm:
>>> +  case X86::SUB32rm:   case X86::SUB16rm:  case X86::SUB8rm:
>>> +  case X86::ADD64ri32: case X86::ADD64ri8: case X86::ADD32ri:
>>> +  case X86::ADD32ri8:  case X86::ADD16ri:  case X86::ADD16ri8:
>>> +  case X86::ADD8ri:    case X86::ADD64rr:  case X86::ADD32rr:
>>> +  case X86::ADD16rr:   case X86::ADD8rr:   case X86::ADD64rm:
>>> +  case X86::ADD32rm:   case X86::ADD16rm:  case X86::ADD8rm:
>>> +  case X86::AND64ri32: case X86::AND64ri8: case X86::AND32ri:
>>> +  case X86::AND32ri8:  case X86::AND16ri:  case X86::AND16ri8:
>>> +  case X86::AND8ri:    case X86::AND64rr:  case X86::AND32rr:
>>> +  case X86::AND16rr:   case X86::AND8rr:   case X86::AND64rm:
>>> +  case X86::AND32rm:   case X86::AND16rm:  case X86::AND8rm:
>>> +  case X86::XOR64ri32: case X86::XOR64ri8: case X86::XOR32ri:
>>> +  case X86::XOR32ri8:  case X86::XOR16ri:  case X86::XOR16ri8:
>>> +  case X86::XOR8ri:    case X86::XOR64rr:  case X86::XOR32rr:
>>> +  case X86::XOR16rr:   case X86::XOR8rr:   case X86::XOR64rm:
>>> +  case X86::XOR32rm:   case X86::XOR16rm:  case X86::XOR8rm:
>>> +  case X86::OR64ri32:  case X86::OR64ri8:  case X86::OR32ri:
>>> +  case X86::OR32ri8:   case X86::OR16ri:   case X86::OR16ri8:
>>> +  case X86::OR8ri:     case X86::OR64rr:   case X86::OR32rr:
>>> +  case X86::OR16rr:    case X86::OR8rr:    case X86::OR64rm:
>>> +  case X86::OR32rm:    case X86::OR16rm:   case X86::OR8rm:
>>> 
>>> Hi Manman,
>>> 
>>> Does it make sense to add the DEC/INC variants to isDefConvertible too?
>>> Or, is that not safe?  I just noticed that if tests weren't optimized earlier
>>> (e.g., in X86ISelLowering::EmitTest()), then you could often end up with
>>> DEC/INCs at this point.
>> I think it makes sense and is safe to add "INC/DEC" here.
>> 
>> Thanks,
>> Manman
>> 
>>> 
>>> - Jan
>>>  
>>> +    return true;
>>> +  }
>>> +}
>>> +
>>>  /// optimizeCompareInstr - Check if there exists an earlier instruction that
>>>  /// operates on the same source operands and sets flags in the same way as
>>>  /// Compare; remove Compare if possible.
>>> @@ -3107,6 +3152,13 @@
>>>    // CmpInstr is the first instruction of the BB.
>>>    MachineBasicBlock::iterator I = CmpInstr, Def = MI;
>>> 
>>> +  // If we are comparing against zero, check whether we can use MI to update
>>> +  // EFLAGS. If MI is not in the same BB as CmpInstr, do not optimize.
>>> +  bool IsCmpZero = (SrcReg2 == 0 && CmpValue == 0);
>>> +  if (IsCmpZero && (MI->getParent() != CmpInstr->getParent() ||
>>> +      !isDefConvertible(MI)))
>>> +    return false;
>>> +
>>>    // We are searching for an earlier instruction that can make CmpInstr
>>>    // redundant and that instruction will be saved in Sub.
>>>    MachineInstr *Sub = NULL;
>>> @@ -3126,7 +3178,8 @@
>>>    for (; RI != RE; ++RI) {
>>>      MachineInstr *Instr = &*RI;
>>>      // Check whether CmpInstr can be made redundant by the current instruction.
>>> -    if (isRedundantFlagInstr(CmpInstr, SrcReg, SrcReg2, CmpValue, Instr)) {
>>> +    if (!IsCmpZero &&
>>> +        isRedundantFlagInstr(CmpInstr, SrcReg, SrcReg2, CmpValue, Instr)) {
>>>        Sub = Instr;
>>>        break;
>>>      }
>>> @@ -3153,7 +3206,7 @@
>>>    }
>>> 
>>>    // Return false if no candidates exist.
>>> -  if (!Sub)
>>> +  if (!IsCmpZero && !Sub)
>>>      return false;
>>> 
>>>    bool IsSwapped = (SrcReg2 != 0 && Sub->getOperand(1).getReg() == SrcReg2 &&
>>> @@ -3177,13 +3230,10 @@
>>>        continue;
>>> 
>>>      // EFLAGS is used by this instruction.
>>> -    if (IsSwapped) {
>>> -      // If we have SUB(r1, r2) and CMP(r2, r1), the condition code needs
>>> -      // to be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 > r2, etc.
>>> -      // We decode the condition code from opcode, swap the condition code,
>>> -      // and synthesize the new opcode.
>>> -      bool OpcIsSET = false;
>>> -      X86::CondCode OldCC;
>>> +    X86::CondCode OldCC;
>>> +    bool OpcIsSET = false;
>>> +    if (IsCmpZero || IsSwapped) {
>>> +      // We decode the condition code from opcode.
>>>        if (Instr.isBranch())
>>>          OldCC = getCondFromBranchOpc(Instr.getOpcode());
>>>        else {
>>> @@ -3194,6 +3244,22 @@
>>>            OldCC = getCondFromCMovOpc(Instr.getOpcode());
>>>        }
>>>        if (OldCC == X86::COND_INVALID) return false;
>>> +    }
>>> +    if (IsCmpZero) {
>>> +      switch (OldCC) {
>>> +      default: break;
>>> +      case X86::COND_A: case X86::COND_AE:
>>> +      case X86::COND_B: case X86::COND_BE:
>>> +      case X86::COND_G: case X86::COND_GE:
>>> +      case X86::COND_L: case X86::COND_LE:
>>> +      case X86::COND_O: case X86::COND_NO:
>>> +        // CF and OF are used, we can't perform this optimization.
>>> +        return false;
>>> +      }
>>> +    } else if (IsSwapped) {
>>> +      // If we have SUB(r1, r2) and CMP(r2, r1), the condition code needs
>>> +      // to be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 > r2, etc.
>>> +      // We swap the condition code and synthesize the new opcode.
>>>        X86::CondCode NewCC = getSwappedCondition(OldCC);
>>>        if (NewCC == X86::COND_INVALID) return false;
>>> 
>>> @@ -3223,7 +3289,7 @@
>>> 
>>>    // If EFLAGS is not killed nor re-defined, we should check whether it is
>>>    // live-out. If it is live-out, do not optimize.
>>> -  if (IsSwapped && !IsSafe) {
>>> +  if ((IsCmpZero || IsSwapped) && !IsSafe) {
>>>      MachineBasicBlock *MBB = CmpInstr->getParent();
>>>      for (MachineBasicBlock::succ_iterator SI = MBB->succ_begin(),
>>>               SE = MBB->succ_end(); SI != SE; ++SI)
>>> @@ -3231,6 +3297,8 @@
>>>          return false;
>>>    }
>>> 
>>> +  // The instruction to be updated is either Sub or MI.
>>> +  Sub = IsCmpZero ? MI : Sub;
>>>    // Move Movr0Inst to the place right before Sub.
>>>    if (Movr0Inst) {
>>>      Sub->getParent()->remove(Movr0Inst);
>>> @@ -3238,10 +3306,11 @@
>>>    }
>>> 
>>>    // Make sure Sub instruction defines EFLAGS.
>>> -  assert(Sub->getNumOperands() >= 4 && Sub->getOperand(3).isReg() &&
>>> -         Sub->getOperand(3).getReg() == X86::EFLAGS &&
>>> -         "EFLAGS should be the 4th operand of SUBrr or SUBri.");
>>> -  Sub->getOperand(3).setIsDef(true);
>>> +  assert(Sub->getNumOperands() >= 2 &&
>>> +         Sub->getOperand(Sub->getNumOperands()-1).isReg() &&
>>> +         Sub->getOperand(Sub->getNumOperands()-1).getReg() == X86::EFLAGS &&
>>> +         "EFLAGS should be the last operand of SUB, ADD, OR, XOR, AND");
>>> +  Sub->getOperand(Sub->getNumOperands()-1).setIsDef(true);
>>>    CmpInstr->eraseFromParent();
>>> 
>>>    // Modify the condition code of instructions in OpsToUpdate.
>>> 
>>> Modified: llvm/trunk/test/CodeGen/X86/jump_sign.ll
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/jump_sign.ll?rev=160454&r1=160453&r2=160454&view=diff
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/jump_sign.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/jump_sign.ll Wed Jul 18 16:40:01 2012
>>> @@ -202,3 +202,14 @@
>>>  if.else.i104:                                     ; preds = %if.then44
>>>    ret void
>>>  }
>>> +; rdar://11855129
>>> +define i32 @p(i32 %a, i32 %b) nounwind {
>>> +entry:
>>> +; CHECK: p:
>>> +; CHECK-NOT: test
>>> +; CHECK: cmovs
>>> +  %add = add nsw i32 %b, %a
>>> +  %cmp = icmp sgt i32 %add, 0
>>> +  %add. = select i1 %cmp, i32 %add, i32 0
>>> +  ret i32 %add.
>>> +}
>>> 
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>> 
>> 
>> <cmp_dec_inc.patch>
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120917/d68b0181/attachment.html>


More information about the llvm-commits mailing list