[llvm-commits] [llvm] r160454 - in /llvm/trunk: lib/Target/X86/X86InstrArithmetic.td lib/Target/X86/X86InstrInfo.cpp test/CodeGen/X86/jump_sign.ll

Jan Voung jvoung at google.com
Mon Sep 17 11:43:28 PDT 2012


Without the setIsDead(false), I get failures on:

* the CodeGen/X86/crash.ll test (unmodified)
* the CodeGen/X86/jump_sign.ll test (modified to have -verify-machineinstrs)




Here is the log from jump_sign.ll:

=======================

********************
FAIL: LLVM :: CodeGen/X86/jump_sign.ll (2769 of 6471)
******************** TEST 'LLVM :: CodeGen/X86/jump_sign.ll' FAILED
********************
Script:
--
/build/Release+Asserts/bin/llc < /src/test/CodeGen/X86/jump_sign.ll
-march=x86 -mcpu=pentiumpro -verify-machineinstrs |
/build/Release+Asserts/bin/FileCheck /src/llvm/test/CodeGen/X86/jump_sign.ll
--
Exit Code: 1
Command Output (stderr):
--

# After codegen peephole optimization pass
# Machine code for function h: SSA
Frame Objects:
  fi#-2: size=4, align=4, fixed, at location [SP+8]
  fi#-1: size=4, align=16, fixed, at location [SP+4]
Function Live Outs: %EAX

BB#0: derived from LLVM BB %entry
%vreg0<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack-2]
GR32:%vreg0
%vreg1<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-1](align=16) GR32:%vreg1
%vreg3<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg3
%vreg2<def,tied1> = SUB32rr %vreg1<tied0>, %vreg0, %EFLAGS<imp-def,dead>;
GR32:%vreg2,%vreg1,%vreg0
%vreg5<def,tied1> = CMOVG32rr %vreg3<tied0>, %vreg2<kill>,
%EFLAGS<imp-use>; GR32:%vreg5,%vreg3,%vreg2
%EAX<def> = COPY %vreg5; GR32:%vreg5
RET

# End machine code for function h.

*** Bad machine code: Using an undefined physical register ***
- function:    h
- basic block: BB#0 entry (0x3194ab0)
- instruction: %vreg5<def,tied1> = CMOVG32rr %vreg3<tied0>, %vreg2<kill>,
%EFLAGS<imp-use>; GR32:%vreg5,%vreg3,%vreg2
- operand 3:   %EFLAGS<imp-use>
LLVM ERROR: Found 1 machine code errors.
FileCheck error: '-' is empty.


=======================

"-debug" shows this bit of text before the error:

# Machine code for function h: SSA
Frame Objects:
  fi#-2: size=4, align=4, fixed, at location [SP+8]
  fi#-1: size=4, align=16, fixed, at location [SP+4]
Function Live Outs: %EAX

0B      BB#0: derived from LLVM BB %entry
16B             %vreg0<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-2] GR32:%vreg0
32B             %vreg1<def> = MOV32rm <fi#-1>, 1, %noreg, 0, %noreg;
mem:LD4[FixedStack-1](align=16) GR32:%vreg1
48B             %vreg2<def,tied1> = SUB32rr %vreg1<tied0>, %vreg0,
%EFLAGS<imp-def,dead>; GR32:%vreg2,%vreg1,%vreg0
64B             %vreg3<def> = MOV32r0 %EFLAGS<imp-def,dead>; GR32:%vreg3
80B             %vreg4<def,tied1> = SUB32rr %vreg0<tied0>, %vreg1,
%EFLAGS<imp-def>; GR32:%vreg4,%vreg0,%vreg1
96B             %vreg5<def,tied1> = CMOVL32rr %vreg3<tied0>, %vreg2<kill>,
%EFLAGS<imp-use>; GR32:%vreg5,%vreg3,%vreg2
112B            %EAX<def> = COPY %vreg5; GR32:%vreg5
128B            RET

# End machine code for function h.

=======================


So before the pass runs, there was another SUB32rr which redefined EFLAGS
and that is the def that is live for the CMOV.  I think once you remove
that redundant definition in favor of the earlier one, you have to make the
earlier one live?

- Jan



On Mon, Sep 17, 2012 at 11:21 AM, Manman Ren <mren at apple.com> wrote:

>
> Hi Jan,
>
> The patch looks good except it is weird that you have to set IsDead to
> false right after setting IsDef to true.
> Could you look deeper why that is needed? Can you dump the related info?
>
> Thanks,
> Manman
>
> On Sep 16, 2012, at 4:06 PM, Jan Voung <jvoung at google.com> wrote:
>
> Thanks, here's a patch that adds that.
>
> One of the "make check" tests that used "-verify-machineinstrs" failed,
> showing that the def of EFLAGS from a DEC was dead before the new use.  I
> ended up adding a setIsDead(false) right after the setIsDef(true).  Does
> that seem alright?
>
> -Jan
>
>
> On Fri, Sep 14, 2012 at 12:55 PM, Manman Ren <mren at apple.com> wrote:
>
>>
>> On Sep 14, 2012, at 11:27 AM, Jan Voung <jvoung at google.com> wrote:
>>
>>
>> On Wed, Jul 18, 2012 at 2:40 PM, Manman Ren <mren at apple.com> wrote:
>>
>>> Author: mren
>>> Date: Wed Jul 18 16:40:01 2012
>>> New Revision: 160454
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=160454&view=rev
>>> Log:
>>> X86: remove redundant cmp against zero.
>>>
>>> Updated OptimizeCompare in peephole to remove redundant cmp against zero.
>>> We only remove Compare if CF and OF are not used.
>>>
>>> rdar://11855129
>>>
>>> Modified:
>>>     llvm/trunk/lib/Target/X86/X86InstrArithmetic.td
>>>     llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>>>     llvm/trunk/test/CodeGen/X86/jump_sign.ll
>>>
>>> Modified: llvm/trunk/lib/Target/X86/X86InstrArithmetic.td
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrArithmetic.td?rev=160454&r1=160453&r2=160454&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Target/X86/X86InstrArithmetic.td (original)
>>> +++ llvm/trunk/lib/Target/X86/X86InstrArithmetic.td Wed Jul 18 16:40:01
>>> 2012
>>> @@ -1156,7 +1156,7 @@
>>>  def X86testpat : PatFrag<(ops node:$lhs, node:$rhs),
>>>                           (X86cmp (and_su node:$lhs, node:$rhs), 0)>;
>>>
>>> -let Defs = [EFLAGS] in {
>>> +let isCompare = 1, Defs = [EFLAGS] in {
>>>    let isCommutable = 1 in {
>>>      def TEST8rr  : BinOpRR_F<0x84, "test", Xi8 , X86testpat, MRMSrcReg>;
>>>      def TEST16rr : BinOpRR_F<0x84, "test", Xi16, X86testpat, MRMSrcReg>;
>>>
>>> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=160454&r1=160453&r2=160454&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)
>>> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Wed Jul 18 16:40:01 2012
>>> @@ -3046,6 +3046,17 @@
>>>      CmpMask = ~0;
>>>      CmpValue = 0;
>>>      return true;
>>> +  case X86::TEST8rr:
>>> +  case X86::TEST16rr:
>>> +  case X86::TEST32rr:
>>> +  case X86::TEST64rr:
>>> +    SrcReg = MI->getOperand(0).getReg();
>>> +    if (MI->getOperand(1).getReg() != SrcReg) return false;
>>> +    // Compare against zero.
>>> +    SrcReg2 = 0;
>>> +    CmpMask = ~0;
>>> +    CmpValue = 0;
>>> +    return true;
>>>    }
>>>    return false;
>>>  }
>>> @@ -3093,6 +3104,40 @@
>>>    return false;
>>>  }
>>>
>>> +/// isDefConvertible - check whether the definition can be converted
>>> +/// to remove a comparison against zero.
>>> +inline static bool isDefConvertible(MachineInstr *MI) {
>>> +  switch (MI->getOpcode()) {
>>> +  default: return false;
>>> +  case X86::SUB64ri32: case X86::SUB64ri8: case X86::SUB32ri:
>>> +  case X86::SUB32ri8:  case X86::SUB16ri:  case X86::SUB16ri8:
>>> +  case X86::SUB8ri:    case X86::SUB64rr:  case X86::SUB32rr:
>>> +  case X86::SUB16rr:   case X86::SUB8rr:   case X86::SUB64rm:
>>> +  case X86::SUB32rm:   case X86::SUB16rm:  case X86::SUB8rm:
>>> +  case X86::ADD64ri32: case X86::ADD64ri8: case X86::ADD32ri:
>>> +  case X86::ADD32ri8:  case X86::ADD16ri:  case X86::ADD16ri8:
>>> +  case X86::ADD8ri:    case X86::ADD64rr:  case X86::ADD32rr:
>>> +  case X86::ADD16rr:   case X86::ADD8rr:   case X86::ADD64rm:
>>> +  case X86::ADD32rm:   case X86::ADD16rm:  case X86::ADD8rm:
>>> +  case X86::AND64ri32: case X86::AND64ri8: case X86::AND32ri:
>>> +  case X86::AND32ri8:  case X86::AND16ri:  case X86::AND16ri8:
>>> +  case X86::AND8ri:    case X86::AND64rr:  case X86::AND32rr:
>>> +  case X86::AND16rr:   case X86::AND8rr:   case X86::AND64rm:
>>> +  case X86::AND32rm:   case X86::AND16rm:  case X86::AND8rm:
>>> +  case X86::XOR64ri32: case X86::XOR64ri8: case X86::XOR32ri:
>>> +  case X86::XOR32ri8:  case X86::XOR16ri:  case X86::XOR16ri8:
>>> +  case X86::XOR8ri:    case X86::XOR64rr:  case X86::XOR32rr:
>>> +  case X86::XOR16rr:   case X86::XOR8rr:   case X86::XOR64rm:
>>> +  case X86::XOR32rm:   case X86::XOR16rm:  case X86::XOR8rm:
>>> +  case X86::OR64ri32:  case X86::OR64ri8:  case X86::OR32ri:
>>> +  case X86::OR32ri8:   case X86::OR16ri:   case X86::OR16ri8:
>>> +  case X86::OR8ri:     case X86::OR64rr:   case X86::OR32rr:
>>> +  case X86::OR16rr:    case X86::OR8rr:    case X86::OR64rm:
>>> +  case X86::OR32rm:    case X86::OR16rm:   case X86::OR8rm:
>>>
>>
>> Hi Manman,
>>
>> Does it make sense to add the DEC/INC variants to isDefConvertible too?
>> Or, is that not safe?  I just noticed that if tests weren't optimized
>> earlier
>> (e.g., in X86ISelLowering::EmitTest()), then you could often end up with
>> DEC/INCs at this point.
>>
>> I think it makes sense and is safe to add "INC/DEC" here.
>>
>> Thanks,
>> Manman
>>
>>
>> - Jan
>>
>>
>>> +    return true;
>>> +  }
>>> +}
>>> +
>>>  /// optimizeCompareInstr - Check if there exists an earlier instruction
>>> that
>>>  /// operates on the same source operands and sets flags in the same way
>>> as
>>>  /// Compare; remove Compare if possible.
>>> @@ -3107,6 +3152,13 @@
>>>    // CmpInstr is the first instruction of the BB.
>>>    MachineBasicBlock::iterator I = CmpInstr, Def = MI;
>>>
>>> +  // If we are comparing against zero, check whether we can use MI to
>>> update
>>> +  // EFLAGS. If MI is not in the same BB as CmpInstr, do not optimize.
>>> +  bool IsCmpZero = (SrcReg2 == 0 && CmpValue == 0);
>>> +  if (IsCmpZero && (MI->getParent() != CmpInstr->getParent() ||
>>> +      !isDefConvertible(MI)))
>>> +    return false;
>>> +
>>>    // We are searching for an earlier instruction that can make CmpInstr
>>>    // redundant and that instruction will be saved in Sub.
>>>    MachineInstr *Sub = NULL;
>>> @@ -3126,7 +3178,8 @@
>>>    for (; RI != RE; ++RI) {
>>>      MachineInstr *Instr = &*RI;
>>>      // Check whether CmpInstr can be made redundant by the current
>>> instruction.
>>> -    if (isRedundantFlagInstr(CmpInstr, SrcReg, SrcReg2, CmpValue,
>>> Instr)) {
>>> +    if (!IsCmpZero &&
>>> +        isRedundantFlagInstr(CmpInstr, SrcReg, SrcReg2, CmpValue,
>>> Instr)) {
>>>        Sub = Instr;
>>>        break;
>>>      }
>>> @@ -3153,7 +3206,7 @@
>>>    }
>>>
>>>    // Return false if no candidates exist.
>>> -  if (!Sub)
>>> +  if (!IsCmpZero && !Sub)
>>>      return false;
>>>
>>>    bool IsSwapped = (SrcReg2 != 0 && Sub->getOperand(1).getReg() ==
>>> SrcReg2 &&
>>> @@ -3177,13 +3230,10 @@
>>>        continue;
>>>
>>>      // EFLAGS is used by this instruction.
>>> -    if (IsSwapped) {
>>> -      // If we have SUB(r1, r2) and CMP(r2, r1), the condition code
>>> needs
>>> -      // to be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 >
>>> r2, etc.
>>> -      // We decode the condition code from opcode, swap the condition
>>> code,
>>> -      // and synthesize the new opcode.
>>> -      bool OpcIsSET = false;
>>> -      X86::CondCode OldCC;
>>> +    X86::CondCode OldCC;
>>> +    bool OpcIsSET = false;
>>> +    if (IsCmpZero || IsSwapped) {
>>> +      // We decode the condition code from opcode.
>>>        if (Instr.isBranch())
>>>          OldCC = getCondFromBranchOpc(Instr.getOpcode());
>>>        else {
>>> @@ -3194,6 +3244,22 @@
>>>            OldCC = getCondFromCMovOpc(Instr.getOpcode());
>>>        }
>>>        if (OldCC == X86::COND_INVALID) return false;
>>> +    }
>>> +    if (IsCmpZero) {
>>> +      switch (OldCC) {
>>> +      default: break;
>>> +      case X86::COND_A: case X86::COND_AE:
>>> +      case X86::COND_B: case X86::COND_BE:
>>> +      case X86::COND_G: case X86::COND_GE:
>>> +      case X86::COND_L: case X86::COND_LE:
>>> +      case X86::COND_O: case X86::COND_NO:
>>> +        // CF and OF are used, we can't perform this optimization.
>>> +        return false;
>>> +      }
>>> +    } else if (IsSwapped) {
>>> +      // If we have SUB(r1, r2) and CMP(r2, r1), the condition code
>>> needs
>>> +      // to be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 >
>>> r2, etc.
>>> +      // We swap the condition code and synthesize the new opcode.
>>>        X86::CondCode NewCC = getSwappedCondition(OldCC);
>>>        if (NewCC == X86::COND_INVALID) return false;
>>>
>>> @@ -3223,7 +3289,7 @@
>>>
>>>    // If EFLAGS is not killed nor re-defined, we should check whether it
>>> is
>>>    // live-out. If it is live-out, do not optimize.
>>> -  if (IsSwapped && !IsSafe) {
>>> +  if ((IsCmpZero || IsSwapped) && !IsSafe) {
>>>      MachineBasicBlock *MBB = CmpInstr->getParent();
>>>      for (MachineBasicBlock::succ_iterator SI = MBB->succ_begin(),
>>>               SE = MBB->succ_end(); SI != SE; ++SI)
>>> @@ -3231,6 +3297,8 @@
>>>          return false;
>>>    }
>>>
>>> +  // The instruction to be updated is either Sub or MI.
>>> +  Sub = IsCmpZero ? MI : Sub;
>>>    // Move Movr0Inst to the place right before Sub.
>>>    if (Movr0Inst) {
>>>      Sub->getParent()->remove(Movr0Inst);
>>> @@ -3238,10 +3306,11 @@
>>>    }
>>>
>>>    // Make sure Sub instruction defines EFLAGS.
>>> -  assert(Sub->getNumOperands() >= 4 && Sub->getOperand(3).isReg() &&
>>> -         Sub->getOperand(3).getReg() == X86::EFLAGS &&
>>> -         "EFLAGS should be the 4th operand of SUBrr or SUBri.");
>>> -  Sub->getOperand(3).setIsDef(true);
>>> +  assert(Sub->getNumOperands() >= 2 &&
>>> +         Sub->getOperand(Sub->getNumOperands()-1).isReg() &&
>>> +         Sub->getOperand(Sub->getNumOperands()-1).getReg() ==
>>> X86::EFLAGS &&
>>> +         "EFLAGS should be the last operand of SUB, ADD, OR, XOR, AND");
>>> +  Sub->getOperand(Sub->getNumOperands()-1).setIsDef(true);
>>>    CmpInstr->eraseFromParent();
>>>
>>>    // Modify the condition code of instructions in OpsToUpdate.
>>>
>>> Modified: llvm/trunk/test/CodeGen/X86/jump_sign.ll
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/jump_sign.ll?rev=160454&r1=160453&r2=160454&view=diff
>>>
>>> ==============================================================================
>>> --- llvm/trunk/test/CodeGen/X86/jump_sign.ll (original)
>>> +++ llvm/trunk/test/CodeGen/X86/jump_sign.ll Wed Jul 18 16:40:01 2012
>>> @@ -202,3 +202,14 @@
>>>  if.else.i104:                                     ; preds = %if.then44
>>>    ret void
>>>  }
>>> +; rdar://11855129
>>> +define i32 @p(i32 %a, i32 %b) nounwind {
>>> +entry:
>>> +; CHECK: p:
>>> +; CHECK-NOT: test
>>> +; CHECK: cmovs
>>> +  %add = add nsw i32 %b, %a
>>> +  %cmp = icmp sgt i32 %add, 0
>>> +  %add. = select i1 %cmp, i32 %add, i32 0
>>> +  ret i32 %add.
>>> +}
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>>
>>
> <cmp_dec_inc.patch>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120917/6f4021fe/attachment.html>


More information about the llvm-commits mailing list