[llvm] r192669 - Improve on r192635, ExeDepsFix for avx, and add a test case.

Andrew Trick atrick at apple.com
Wed Oct 16 11:34:33 PDT 2013


On Oct 16, 2013, at 6:06 AM, Benjamin Kramer <benny.kra at gmail.com> wrote:

> 
> On 15.10.2013, at 05:39, Andrew Trick <atrick at apple.com> wrote:
> 
>> Author: atrick
>> Date: Mon Oct 14 22:39:43 2013
>> New Revision: 192669
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=192669&view=rev
>> Log:
>> Improve on r192635, ExeDepsFix for avx, and add a test case.
>> 
>> rdar:15221834 False AVX register dependencies cause 5x slowdown on
>> flops-5/6 and significant slowdown on several others.
>> 
>> This was blocking the switch to MI-Sched.
>> 
>> Added:
>>   llvm/trunk/test/CodeGen/X86/break-avx-dep.ll
>> Modified:
>>   llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
>> 
>> Modified: llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp?rev=192669&r1=192668&r2=192669&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp Mon Oct 14 22:39:43 2013
>> @@ -557,6 +557,9 @@ void ExeDepsFix::processUndefReads(Machi
>> 
>>  for (MachineBasicBlock::reverse_iterator I = MBB->rbegin(), E = MBB->rend();
>>       I != E; ++I) {
>> +    // Update liveness, including the current instruction's defs.
>> +    LiveUnits.stepBackward(*I, *TRI);
>> +
>>    if (UndefMI == &*I) {
>>      if (!LiveUnits.contains(UndefMI->getOperand(OpIdx).getReg(), *TRI))
>>        TII->breakPartialRegDependency(UndefMI, OpIdx, TRI);
>> @@ -568,7 +571,6 @@ void ExeDepsFix::processUndefReads(Machi
>>      UndefMI = UndefReads.back().first;
>>      OpIdx = UndefReads.back().second;
>>    }
>> -    LiveUnits.stepBackward(*I, *TRI);
>>  }
>> }
>> 
>> 
>> Added: llvm/trunk/test/CodeGen/X86/break-avx-dep.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/break-avx-dep.ll?rev=192669&view=auto
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/X86/break-avx-dep.ll (added)
>> +++ llvm/trunk/test/CodeGen/X86/break-avx-dep.ll Mon Oct 14 22:39:43 2013
>> @@ -0,0 +1,29 @@
>> +; RUN: llc < %s -march=x86-64 -mattr=+avx | FileCheck %s
>> +;
>> +; rdar:15221834 False AVX register dependencies cause 5x slowdown on
>> +; flops-6. Make sure the unused register read by vcvtsi2sdq is zeroed
>> +; to avoid cyclic dependence on a write to the same register in a
>> +; previous iteration.
>> +
>> +; CHECK-LABEL: t1:
>> +; CHECK-LABEL: %loop
>> +; CHECK: vxorps %[[REG:xmm.]], %{{xmm.}}, %{{xmm.}}
>> +; CHECK: vcvtsi2sdq %{{r..}}, %[[REG]], %{{xmm.}}
>> +define i64 @t1(i64* nocapture %x, double* nocapture %y) nounwind {
>> +entry:
>> +  %vx = load i64* %x
>> +  br label %loop
>> +loop:
>> +  %i = phi i64 [ 1, %entry ], [ %inc, %loop ]
>> +  %s1 = phi i64 [ %vx, %entry ], [ %s2, %loop ]
>> +  %fi = sitofp i64 %i to double
>> +  %vy = load double* %y
>> +  %fipy = fadd double %fi, %vy
>> +  %iipy = fptosi double %fipy to i64
>> +  %s2 = add i64 %s1, %iipy
>> +  %inc = add nsw i64 %i, 1
>> +  %exitcond = icmp eq i64 %inc, 156250000
>> +  br i1 %exitcond, label %ret, label %loop
>> +ret:
>> +  ret i64 %s2
>> +}
> 
> I'm not exactly sure why, but this test fails on the atom buildbot. You can reproduce it with -mcpu=atom; it looks like the vxorps is missing there.

Thank you. It turned out to be a minor PostRA scheduling bug, fixed in r192824.

-Andy
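
For readers following the quoted ExecutionDepsFix.cpp hunk, here is a rough standalone sketch of why the position of the stepBackward call matters. It is a simplified model and not LLVM code: Inst, stepBackward, findBreakable and the string register names are all hypothetical stand-ins, and registers are tracked as plain names rather than register units. The stepFirst flag only selects between the two orderings seen in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a machine instruction: the registers
// it writes, the registers it really reads, and an optional register that is
// read only as an undefined pass-through operand.
struct Inst {
  std::set<std::string> defs;
  std::set<std::string> uses;
  std::string undefRead; // empty if the instruction has no undef read
};

// One backward liveness step in this model: registers written by the
// instruction are dead above it, registers it really reads are live above it.
// The undef read is deliberately not added to the live set.
void stepBackward(std::set<std::string> &live, const Inst &inst) {
  assert(inst.uses.count(inst.undefRead) == 0 &&
         "an undef read must not also be listed as a real use");
  for (const auto &reg : inst.defs)
    live.erase(reg);
  for (const auto &reg : inst.uses)
    live.insert(reg);
}

// Walk the block bottom-up and collect the indices of instructions whose
// undef-read register is not live at the point of the check, i.e. where a
// dependency-breaking zeroing instruction could be placed in front.
// stepFirst == true updates liveness before the check (as in the patched
// loop); stepFirst == false updates it after the check (as before the patch).
std::vector<std::size_t> findBreakable(const std::vector<Inst> &block,
                                       std::set<std::string> live,
                                       bool stepFirst) {
  std::vector<std::size_t> result;
  for (std::size_t i = block.size(); i-- > 0;) {
    const Inst &inst = block[i];
    if (stepFirst)
      stepBackward(live, inst);
    if (!inst.undefRead.empty() && live.count(inst.undefRead) == 0)
      result.push_back(i);
    if (!stepFirst)
      stepBackward(live, inst);
  }
  return result;
}
```

As a usage example, take a two-instruction block: the first writes xmm0, reads rax, and has xmm0 as its undef read; the second reads xmm0 and writes xmm1, with xmm1 live out. With stepFirst set to false, xmm0 still looks live at the check because the first instruction's own def has not been removed yet, so nothing is reported. With stepFirst set to true, the def is removed first and index 0 is reported as a place where the dependency can be broken.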