[llvm] r181937 - X86: Remove redundant test instructions

Manman Ren mren at apple.com
Fri May 17 16:52:21 PDT 2013


LGTM,

Thanks,
Manman

On May 17, 2013, at 4:02 PM, David Majnemer <david.majnemer at gmail.com> wrote:

> Sorry, the comment refers to an approach that I abandoned. Merely looking for an instruction with a def that didn't also have a use was good enough. I'll update the comment.
> 
> The attached patch deals with the comment and changes the test so that it checks the generated code.
> 
> 
> On Fri, May 17, 2013 at 10:57 AM, Manman Ren <mren at apple.com> wrote:
> 
> Comments about the patch:
> +    // 1. We find a def that's dead. Insert after it.
> +    // 2. We find a def that doesn't use the current EFLAGS. Insert before it.
> +    Def = Sub;
> +    MachineBasicBlock::reverse_iterator
> +      InsertI = MachineBasicBlock::reverse_iterator(++Def),
> +                InsertE = Sub->getParent()->rend();
> +    for (; InsertI != InsertE; ++InsertI) {
> +      MachineInstr *Instr = &*InsertI;
> +      if (!Instr->readsRegister(X86::EFLAGS, TRI) &&
> +          Instr->modifiesRegister(X86::EFLAGS, TRI)) {
> I don't see code corresponding to the first case (find a def that's dead. insert after it)
> 
> For the testing case, can we still check the generated code to make sure peephole is doing the right thing?
> 
> Thanks,
> Manman
> 
> On May 16, 2013, at 7:03 PM, David Majnemer wrote:
> 
>> We can but this would prevent us from removing a test instruction if a cmov was choosing between an adc instruction and 0.
>> 
>> 
>> On Thu, May 16, 2013 at 6:03 PM, Manman Ren <mren at apple.com> wrote:
>> 
>> Hi David,
>> 
>> Can we return false earlier if the definition MI uses EFLAGS, inside the following loop?
>> 
>> Thanks,
>> Manman
>> 
>>   for (; RI != RE; ++RI) {
>>     MachineInstr *Instr = &*RI;
>>     // Check whether CmpInstr can be made redundant by the current instruction.
>>     if (!IsCmpZero &&
>>         isRedundantFlagInstr(CmpInstr, SrcReg, SrcReg2, CmpValue, Instr)) {
>>       Sub = Instr;
>>       break;
>>     }
>> 
>>     if (Instr->modifiesRegister(X86::EFLAGS, TRI) ||
>>         Instr->readsRegister(X86::EFLAGS, TRI)) {
>>       // This instruction modifies or uses EFLAGS.
>> 
>>       // MOV32r0 etc. are implemented with xor which clobbers condition code.
>>       // They are safe to move up, if the definition to EFLAGS is dead and
>>       // earlier instructions do not read or write EFLAGS.
>>       if (!Movr0Inst && (Instr->getOpcode() == X86::MOV8r0 ||
>>            Instr->getOpcode() == X86::MOV16r0 ||
>>            Instr->getOpcode() == X86::MOV32r0 ||
>>            Instr->getOpcode() == X86::MOV64r0) &&
>>           Instr->registerDefIsDead(X86::EFLAGS, TRI)) {
>> --> check whether MI uses EFLAGS
>>         Movr0Inst = Instr;
>>         continue;
>>       }
>> 
>>       // We can't remove CmpInstr.
>>       return false;
>>     }
>>   }
>> 
>> On May 16, 2013, at 5:17 PM, David Majnemer wrote:
>> 
>>> Attached is a patch to address PR16031.
>>> 
>>> The peephole tries to reorder MOV32r0 instructions so that they are before the instruction that modifies EFLAGS.
>>> The problem is that the peephole does not consider the case where the instruction that modifies EFLAGS also depends on them.
>>> 
>>> This patch walks backwards until it finds an instruction that has a def but no use and inserts right before it.
>>> If it cannot find such an instruction, it skips the optimization.
>>> 
>>> 
>>> On Thu, May 16, 2013 at 12:03 PM, Manman Ren <mren at apple.com> wrote:
>>> 
>>> This is related to PR16031.
>>> 
>>> Please take a look,
>>> Thanks,
>>> Manman
>>> 
>>> On May 15, 2013, at 3:45 PM, Pete Cooper wrote:
>>> 
>>>> Hi David
>>>> 
>>>> This test is breaking a buildbot.  Could you please take a look?
>>>> 
>>>> http://lab.llvm.org:8013/builders/clang-x86_64-darwin11-nobootstrap-RAincremental/builds/1779
>>>> 
>>>> Thanks,
>>>> Pete
>>>> On May 15, 2013, at 3:03 PM, David Majnemer <david.majnemer at gmail.com> wrote:
>>>> 
>>>>> Author: majnemer
>>>>> Date: Wed May 15 17:03:08 2013
>>>>> New Revision: 181937
>>>>> 
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=181937&view=rev
>>>>> Log:
>>>>> X86: Remove redundant test instructions
>>>>> 
>>>>> Increase the number of instructions LLVM recognizes as setting the ZF
>>>>> flag. This allows us to remove test instructions that redundantly
>>>>> recalculate the flag.
>>>>> 
>>>>> Added:
>>>>>    llvm/trunk/test/CodeGen/X86/peep-test-4.ll
>>>>> Modified:
>>>>>    llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>>>>> 
>>>>> Modified: llvm/trunk/lib/Target/X86/X86InstrInfo.cpp
>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrInfo.cpp?rev=181937&r1=181936&r2=181937&view=diff
>>>>> ==============================================================================
>>>>> --- llvm/trunk/lib/Target/X86/X86InstrInfo.cpp (original)
>>>>> +++ llvm/trunk/lib/Target/X86/X86InstrInfo.cpp Wed May 15 17:03:08 2013
>>>>> @@ -3200,8 +3200,37 @@ inline static bool isDefConvertible(Mach
>>>>>   case X86::OR8ri:     case X86::OR64rr:   case X86::OR32rr:
>>>>>   case X86::OR16rr:    case X86::OR8rr:    case X86::OR64rm:
>>>>>   case X86::OR32rm:    case X86::OR16rm:   case X86::OR8rm:
>>>>> +  case X86::NEG8r:     case X86::NEG16r:   case X86::NEG32r: case X86::NEG64r:
>>>>> +  case X86::SAR8r1:    case X86::SAR16r1:  case X86::SAR32r1:case X86::SAR64r1:
>>>>> +  case X86::SHR8r1:    case X86::SHR16r1:  case X86::SHR32r1:case X86::SHR64r1:
>>>>> +  case X86::SHL8r1:    case X86::SHL16r1:  case X86::SHL32r1:case X86::SHL64r1:
>>>>> +  case X86::ADC32ri:   case X86::ADC32ri8:
>>>>> +  case X86::ADC32rr:   case X86::ADC64ri32:
>>>>> +  case X86::ADC64ri8:  case X86::ADC64rr:
>>>>> +  case X86::SBB32ri:   case X86::SBB32ri8:
>>>>> +  case X86::SBB32rr:   case X86::SBB64ri32:
>>>>> +  case X86::SBB64ri8:  case X86::SBB64rr:
>>>>>   case X86::ANDN32rr:  case X86::ANDN32rm:
>>>>>   case X86::ANDN64rr:  case X86::ANDN64rm:
>>>>> +  case X86::BEXTR32rr: case X86::BEXTR64rr:
>>>>> +  case X86::BEXTR32rm: case X86::BEXTR64rm:
>>>>> +  case X86::BLSI32rr:  case X86::BLSI32rm:
>>>>> +  case X86::BLSI64rr:  case X86::BLSI64rm:
>>>>> +  case X86::BLSMSK32rr:case X86::BLSMSK32rm:
>>>>> +  case X86::BLSMSK64rr:case X86::BLSMSK64rm:
>>>>> +  case X86::BLSR32rr:  case X86::BLSR32rm:
>>>>> +  case X86::BLSR64rr:  case X86::BLSR64rm:
>>>>> +  case X86::BZHI32rr:  case X86::BZHI32rm:
>>>>> +  case X86::BZHI64rr:  case X86::BZHI64rm:
>>>>> +  case X86::LZCNT16rr: case X86::LZCNT16rm:
>>>>> +  case X86::LZCNT32rr: case X86::LZCNT32rm:
>>>>> +  case X86::LZCNT64rr: case X86::LZCNT64rm:
>>>>> +  case X86::POPCNT16rr:case X86::POPCNT16rm:
>>>>> +  case X86::POPCNT32rr:case X86::POPCNT32rm:
>>>>> +  case X86::POPCNT64rr:case X86::POPCNT64rm:
>>>>> +  case X86::TZCNT16rr: case X86::TZCNT16rm:
>>>>> +  case X86::TZCNT32rr: case X86::TZCNT32rm:
>>>>> +  case X86::TZCNT64rr: case X86::TZCNT64rm:
>>>>>     return true;
>>>>>   }
>>>>> }
>>>>> @@ -3427,13 +3456,16 @@ optimizeCompareInstr(MachineInstr *CmpIn
>>>>>   }
>>>>> 
>>>>>   // Make sure Sub instruction defines EFLAGS and mark the def live.
>>>>> -  unsigned LastOperand = Sub->getNumOperands() - 1;
>>>>> -  assert(Sub->getNumOperands() >= 2 &&
>>>>> -         Sub->getOperand(LastOperand).isReg() &&
>>>>> -         Sub->getOperand(LastOperand).getReg() == X86::EFLAGS &&
>>>>> -         "EFLAGS should be the last operand of SUB, ADD, OR, XOR, AND");
>>>>> -  Sub->getOperand(LastOperand).setIsDef(true);
>>>>> -  Sub->getOperand(LastOperand).setIsDead(false);
>>>>> +  unsigned i = 0, e = Sub->getNumOperands();
>>>>> +  for (; i != e; ++i) {
>>>>> +    MachineOperand &MO = Sub->getOperand(i);
>>>>> +    if (MO.isReg() && MO.isDef() && MO.getReg() == X86::EFLAGS) {
>>>>> +      MO.setIsDead(false);
>>>>> +      break;
>>>>> +    }
>>>>> +  }
>>>>> +  assert(i != e && "Unable to locate a def EFLAGS operand");
>>>>> +
>>>>>   CmpInstr->eraseFromParent();
>>>>> 
>>>>>   // Modify the condition code of instructions in OpsToUpdate.
>>>>> 
>>>>> Added: llvm/trunk/test/CodeGen/X86/peep-test-4.ll
>>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/peep-test-4.ll?rev=181937&view=auto
>>>>> ==============================================================================
>>>>> --- llvm/trunk/test/CodeGen/X86/peep-test-4.ll (added)
>>>>> +++ llvm/trunk/test/CodeGen/X86/peep-test-4.ll Wed May 15 17:03:08 2013
>>>>> @@ -0,0 +1,154 @@
>>>>> +; RUN: llc < %s -march=x86-64 -mattr=+bmi,+bmi2,+popcnt | FileCheck %s
>>>>> +declare void @foo(i32)
>>>>> +
>>>>> +; CHECK: neg:
>>>>> +; CHECK: negl %edi
>>>>> +; CHECK-NEXT: je
>>>>> +; CHECK: jmp foo
>>>>> +; CHECK: ret
>>>>> +define void @neg(i32 %x) nounwind {
>>>>> +  %sub = sub i32 0, %x
>>>>> +  %cmp = icmp eq i32 %sub, 0
>>>>> +  br i1 %cmp, label %return, label %bb
>>>>> +
>>>>> +bb:
>>>>> +  tail call void @foo(i32 %sub)
>>>>> +  br label %return
>>>>> +
>>>>> +return:
>>>>> +  ret void
>>>>> +}
>>>>> +
>>>>> +; CHECK: sar:
>>>>> +; CHECK: sarl %edi
>>>>> +; CHECK-NEXT: je
>>>>> +; CHECK: jmp foo
>>>>> +; CHECK: ret
>>>>> +define void @sar(i32 %x) nounwind {
>>>>> +  %ashr = ashr i32 %x, 1
>>>>> +  %cmp = icmp eq i32 %ashr, 0
>>>>> +  br i1 %cmp, label %return, label %bb
>>>>> +
>>>>> +bb:
>>>>> +  tail call void @foo(i32 %ashr)
>>>>> +  br label %return
>>>>> +
>>>>> +return:
>>>>> +  ret void
>>>>> +}
>>>>> +
>>>>> +; CHECK: shr:
>>>>> +; CHECK: shrl %edi
>>>>> +; CHECK-NEXT: je
>>>>> +; CHECK: jmp foo
>>>>> +; CHECK: ret
>>>>> +define void @shr(i32 %x) nounwind {
>>>>> +  %ashr = lshr i32 %x, 1
>>>>> +  %cmp = icmp eq i32 %ashr, 0
>>>>> +  br i1 %cmp, label %return, label %bb
>>>>> +
>>>>> +bb:
>>>>> +  tail call void @foo(i32 %ashr)
>>>>> +  br label %return
>>>>> +
>>>>> +return:
>>>>> +  ret void
>>>>> +}
>>>>> +
>>>>> +; CHECK: shl:
>>>>> +; CHECK: addl %edi, %edi
>>>>> +; CHECK-NEXT: je
>>>>> +; CHECK: jmp foo
>>>>> +; CHECK: ret
>>>>> +define void @shl(i32 %x) nounwind {
>>>>> +  %shl = shl i32 %x, 1
>>>>> +  %cmp = icmp eq i32 %shl, 0
>>>>> +  br i1 %cmp, label %return, label %bb
>>>>> +
>>>>> +bb:
>>>>> +  tail call void @foo(i32 %shl)
>>>>> +  br label %return
>>>>> +
>>>>> +return:
>>>>> +  ret void
>>>>> +}
>>>>> +
>>>>> +; CHECK: adc:
>>>>> +; CHECK: movabsq $-9223372036854775808, %rax
>>>>> +; CHECK-NEXT: addq  %rdi, %rax
>>>>> +; CHECK-NEXT: adcq  $0, %rsi
>>>>> +; CHECK-NEXT: sete  %al
>>>>> +; CHECK-NEXT: ret
>>>>> +define zeroext i1 @adc(i128 %x) nounwind {
>>>>> +  %add = add i128 %x, 9223372036854775808
>>>>> +  %cmp = icmp ult i128 %add, 18446744073709551616
>>>>> +  ret i1 %cmp
>>>>> +}
>>>>> +
>>>>> +; CHECK: sbb:
>>>>> +; CHECK: cmpq  %rdx, %rdi
>>>>> +; CHECK-NEXT: sbbq  %rcx, %rsi
>>>>> +; CHECK-NEXT: setns %al
>>>>> +; CHECK-NEXT: ret
>>>>> +define zeroext i1 @sbb(i128 %x, i128 %y) nounwind {
>>>>> +  %sub = sub i128 %x, %y
>>>>> +  %cmp = icmp sge i128 %sub, 0
>>>>> +  ret i1 %cmp
>>>>> +}
>>>>> +
>>>>> +; CHECK: andn:
>>>>> +; CHECK: andnl   %esi, %edi, %edi
>>>>> +; CHECK-NEXT: je
>>>>> +; CHECK: jmp foo
>>>>> +; CHECK: ret
>>>>> +define void @andn(i32 %x, i32 %y) nounwind {
>>>>> +  %not = xor i32 %x, -1
>>>>> +  %andn = and i32 %y, %not
>>>>> +  %cmp = icmp eq i32 %andn, 0
>>>>> +  br i1 %cmp, label %return, label %bb
>>>>> +
>>>>> +bb:
>>>>> +  tail call void @foo(i32 %andn)
>>>>> +  br label %return
>>>>> +
>>>>> +return:
>>>>> +  ret void
>>>>> +}
>>>>> +
>>>>> +; CHECK: bextr:
>>>>> +; CHECK: bextrl   %esi, %edi, %edi
>>>>> +; CHECK-NEXT: je
>>>>> +; CHECK: jmp foo
>>>>> +; CHECK: ret
>>>>> +declare i32 @llvm.x86.bmi.bextr.32(i32, i32) nounwind readnone
>>>>> +define void @bextr(i32 %x, i32 %y) nounwind {
>>>>> +  %bextr = tail call i32 @llvm.x86.bmi.bextr.32(i32 %x, i32 %y)
>>>>> +  %cmp = icmp eq i32 %bextr, 0
>>>>> +  br i1 %cmp, label %return, label %bb
>>>>> +
>>>>> +bb:
>>>>> +  tail call void @foo(i32 %bextr)
>>>>> +  br label %return
>>>>> +
>>>>> +return:
>>>>> +  ret void
>>>>> +}
>>>>> +
>>>>> +; CHECK: popcnt:
>>>>> +; CHECK: popcntl
>>>>> +; CHECK-NEXT: je
>>>>> +; CHECK: jmp foo
>>>>> +; CHECK: ret
>>>>> +declare i32 @llvm.ctpop.i32(i32) nounwind readnone
>>>>> +define void @popcnt(i32 %x) nounwind {
>>>>> +  %popcnt = tail call i32 @llvm.ctpop.i32(i32 %x)
>>>>> +  %cmp = icmp eq i32 %popcnt, 0
>>>>> +  br i1 %cmp, label %return, label %bb
>>>>> +;
>>>>> +bb:
>>>>> +  tail call void @foo(i32 %popcnt)
>>>>> +  br label %return
>>>>> +;
>>>>> +return:
>>>>> +  ret void
>>>>> +}
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>>> 
>>> <pr16031.diff>
>> 
>> 
> 
> 
> <pr16031-2.diff>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130517/2525bc01/attachment.html>


More information about the llvm-commits mailing list