[llvm] r192669 - Improve on r192635, ExeDepsFix for avx, and add a test case.
Andrew Trick
atrick at apple.com
Wed Oct 16 11:34:33 PDT 2013
On Oct 16, 2013, at 6:06 AM, Benjamin Kramer <benny.kra at gmail.com> wrote:
>
> On 15.10.2013, at 05:39, Andrew Trick <atrick at apple.com> wrote:
>
>> Author: atrick
>> Date: Mon Oct 14 22:39:43 2013
>> New Revision: 192669
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=192669&view=rev
>> Log:
>> Improve on r192635, ExeDepsFix for avx, and add a test case.
>>
>> rdar:15221834 False AVX register dependencies cause 5x slowdown on
>> flops-5/6 and significant slowdown on several others.
>>
>> This was blocking the switch to MI-Sched.
>>
>> Added:
>> llvm/trunk/test/CodeGen/X86/break-avx-dep.ll
>> Modified:
>> llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
>>
>> Modified: llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp?rev=192669&r1=192668&r2=192669&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp (original)
>> +++ llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp Mon Oct 14 22:39:43 2013
>> @@ -557,6 +557,9 @@ void ExeDepsFix::processUndefReads(Machi
>>
>> for (MachineBasicBlock::reverse_iterator I = MBB->rbegin(), E = MBB->rend();
>> I != E; ++I) {
>> + // Update liveness, including the current instrucion's defs.
>> + LiveUnits.stepBackward(*I, *TRI);
>> +
>> if (UndefMI == &*I) {
>> if (!LiveUnits.contains(UndefMI->getOperand(OpIdx).getReg(), *TRI))
>> TII->breakPartialRegDependency(UndefMI, OpIdx, TRI);
>> @@ -568,7 +571,6 @@ void ExeDepsFix::processUndefReads(Machi
>> UndefMI = UndefReads.back().first;
>> OpIdx = UndefReads.back().second;
>> }
>> - LiveUnits.stepBackward(*I, *TRI);
>> }
>> }
>>
>>
>> Added: llvm/trunk/test/CodeGen/X86/break-avx-dep.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/break-avx-dep.ll?rev=192669&view=auto
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/X86/break-avx-dep.ll (added)
>> +++ llvm/trunk/test/CodeGen/X86/break-avx-dep.ll Mon Oct 14 22:39:43 2013
>> @@ -0,0 +1,29 @@
>> +; RUN: llc < %s -march=x86-64 -mattr=+avx | FileCheck %s
>> +;
>> +; rdar:15221834 False AVX register dependencies cause 5x slowdown on
>> +; flops-6. Make sure the unused register read by vcvtsi2sdq is zeroed
>> +; to avoid cyclic dependence on a write to the same register in a
>> +; previous iteration.
>> +
>> +; CHECK-LABEL: t1:
>> +; CHECK-LABEL: %loop
>> +; CHECK: vxorps %[[REG:xmm.]], %{{xmm.}}, %{{xmm.}}
>> +; CHECK: vcvtsi2sdq %{{r..}}, %[[REG]], %{{xmm.}}
>> +define i64 @t1(i64* nocapture %x, double* nocapture %y) nounwind {
>> +entry:
>> + %vx = load i64* %x
>> + br label %loop
>> +loop:
>> + %i = phi i64 [ 1, %entry ], [ %inc, %loop ]
>> + %s1 = phi i64 [ %vx, %entry ], [ %s2, %loop ]
>> + %fi = sitofp i64 %i to double
>> + %vy = load double* %y
>> + %fipy = fadd double %fi, %vy
>> + %iipy = fptosi double %fipy to i64
>> + %s2 = add i64 %s1, %iipy
>> + %inc = add nsw i64 %i, 1
>> + %exitcond = icmp eq i64 %inc, 156250000
>> + br i1 %exitcond, label %ret, label %loop
>> +ret:
>> + ret i64 %s2
>> +}
>
> I'm not exactly sure why, but this test fails on the atom buildbot. You can reproduce it with -mcpu=atom; it looks like the vxorps is missing there.

Thank you. It turned out to be a minor PostRA scheduling bug, fixed in r192824.
-Andy
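
For readers following the diff above, here is a minimal, self-contained sketch of why the position of the stepBackward call matters. It is a toy model, not LLVM code: ToyInstr, LiveSet, canBreakDependency and exampleBlock are hypothetical names, and the assumption (not stated in the thread) is that stepping backward removes an instruction's defs from the live set and adds only its genuine, non-undef reads.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical; not LLVM's MachineInstr).
struct ToyInstr {
  std::vector<std::string> defs;   // registers written
  std::vector<std::string> reads;  // registers genuinely read (undef reads excluded)
};

using LiveSet = std::set<std::string>;

// Move the live set from "after mi" to "before mi": drop defs, then add reads.
void stepBackward(LiveSet &live, const ToyInstr &mi) {
  for (const auto &r : mi.defs) live.erase(r);
  for (const auto &r : mi.reads) live.insert(r);
}

// Walk the block bottom-up and report whether `reg`, read as undef by
// block[undefIdx], is absent from the live set when that instruction is
// visited, i.e. whether a dependency-breaking zeroing would be allowed.
// stepFirst selects the ordering: step then check, or check then step.
bool canBreakDependency(const std::vector<ToyInstr> &block, std::size_t undefIdx,
                        const std::string &reg, LiveSet liveOut, bool stepFirst) {
  assert(undefIdx < block.size());
  LiveSet live = std::move(liveOut);
  for (std::size_t i = block.size(); i-- > 0;) {
    if (stepFirst) stepBackward(live, block[i]);
    if (i == undefIdx) return live.count(reg) == 0;
    if (!stepFirst) stepBackward(live, block[i]);
  }
  return false;
}

// Two-instruction block loosely modeled on the test case: a convert that
// writes xmm0 (and undef-reads it), followed by a consumer of xmm0.
std::vector<ToyInstr> exampleBlock() {
  ToyInstr convert;
  convert.defs = {"xmm0"};
  convert.reads = {"rax"};  // the undef read of xmm0 is deliberately not listed
  ToyInstr consumer;
  consumer.defs = {"xmm1"};
  consumer.reads = {"xmm0", "xmm1"};
  return {convert, consumer};
}
```

In this model, checking before stepping sees xmm0 as live (the consumer reads it), so no break is inserted. Stepping first removes the convert's own def of xmm0 from the set, so the register counts as dead before the convert and the break is allowed. Calling canBreakDependency(exampleBlock(), 0, "xmm0", LiveSet{}, stepFirst) with stepFirst set to false and then true shows the two outcomes.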