[llvm] r192669 - Improve on r192635, ExeDepsFix for avx, and add a test case.

Benjamin Kramer benny.kra at gmail.com
Wed Oct 16 06:06:44 PDT 2013


On 15.10.2013, at 05:39, Andrew Trick <atrick at apple.com> wrote:

> Author: atrick
> Date: Mon Oct 14 22:39:43 2013
> New Revision: 192669
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=192669&view=rev
> Log:
> Improve on r192635, ExeDepsFix for avx, and add a test case.
> 
> rdar:15221834 False AVX register dependencies cause 5x slowdown on
> flops-5/6 and significant slowdown on several others.
> 
> This was blocking the switch to MI-Sched.
> 
> Added:
>    llvm/trunk/test/CodeGen/X86/break-avx-dep.ll
> Modified:
>    llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
> 
> Modified: llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp?rev=192669&r1=192668&r2=192669&view=diff
> ==============================================================================
> --- llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp (original)
> +++ llvm/trunk/lib/CodeGen/ExecutionDepsFix.cpp Mon Oct 14 22:39:43 2013
> @@ -557,6 +557,9 @@ void ExeDepsFix::processUndefReads(Machi
> 
>   for (MachineBasicBlock::reverse_iterator I = MBB->rbegin(), E = MBB->rend();
>        I != E; ++I) {
> +    // Update liveness, including the current instrucion's defs.
> +    LiveUnits.stepBackward(*I, *TRI);
> +
>     if (UndefMI == &*I) {
>       if (!LiveUnits.contains(UndefMI->getOperand(OpIdx).getReg(), *TRI))
>         TII->breakPartialRegDependency(UndefMI, OpIdx, TRI);
> @@ -568,7 +571,6 @@ void ExeDepsFix::processUndefReads(Machi
>       UndefMI = UndefReads.back().first;
>       OpIdx = UndefReads.back().second;
>     }
> -    LiveUnits.stepBackward(*I, *TRI);
>   }
> }
> 
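
The reordering in the hunk above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the pass itself: `ToyInstr`, the free function `stepBackward`, and `findBreakable` are hypothetical names, registers are plain integers, and the live set is a `std::set` rather than LLVM's register-unit tracking. It assumes that an undef read is not added to the live set when stepping backward. The idea it tries to show is that a dependency-breaking instruction is inserted before the instruction with the undef read, so the relevant question is whether the register is live before that instruction, after the instruction's own defs have been removed from the set.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical toy instruction; register numbers are plain ints.
struct ToyInstr {
  std::vector<int> defs;  // registers written
  std::vector<int> uses;  // registers genuinely read
  int undefRead = -1;     // register read only as a don't-care input, or -1
};

// Step liveness backward across one instruction: remove its defs, then add
// its real uses. Undef reads are deliberately not added (an assumption of
// this sketch).
void stepBackward(std::set<int> &live, const ToyInstr &mi) {
  for (int r : mi.defs) live.erase(r);
  for (int r : mi.uses) live.insert(r);
}

// Walk the block bottom-up, starting from the live-out set, and return the
// indices of instructions whose undef-read register is judged safe to
// clobber. With stepFirst == true the check sees liveness before the
// instruction (the order after the change); with false it sees liveness
// after the instruction (the order before the change).
std::vector<std::size_t> findBreakable(const std::vector<ToyInstr> &block,
                                       std::set<int> live, bool stepFirst) {
  std::vector<std::size_t> out;
  for (std::size_t i = block.size(); i-- > 0;) {
    const ToyInstr &mi = block[i];
    if (stepFirst) stepBackward(live, mi);
    if (mi.undefRead >= 0 && live.count(mi.undefRead) == 0) out.push_back(i);
    if (!stepFirst) stepBackward(live, mi);
  }
  return out;
}
```

In this model, an instruction that writes register 0 and also undef-reads register 0, with register 0 live out of the block, is reported as breakable only when `stepFirst` is true. With the old order the instruction's own result keeps the register in the live set at the time of the check.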
> 
> Added: llvm/trunk/test/CodeGen/X86/break-avx-dep.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/break-avx-dep.ll?rev=192669&view=auto
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/break-avx-dep.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/break-avx-dep.ll Mon Oct 14 22:39:43 2013
> @@ -0,0 +1,29 @@
> +; RUN: llc < %s -march=x86-64 -mattr=+avx | FileCheck %s
> +;
> +; rdar:15221834 False AVX register dependencies cause 5x slowdown on
> +; flops-6. Make sure the unused register read by vcvtsi2sdq is zeroed
> +; to avoid cyclic dependence on a write to the same register in a
> +; previous iteration.
> +
> +; CHECK-LABEL: t1:
> +; CHECK-LABEL: %loop
> +; CHECK: vxorps %[[REG:xmm.]], %{{xmm.}}, %{{xmm.}}
> +; CHECK: vcvtsi2sdq %{{r..}}, %[[REG]], %{{xmm.}}
> +define i64 @t1(i64* nocapture %x, double* nocapture %y) nounwind {
> +entry:
> +  %vx = load i64* %x
> +  br label %loop
> +loop:
> +  %i = phi i64 [ 1, %entry ], [ %inc, %loop ]
> +  %s1 = phi i64 [ %vx, %entry ], [ %s2, %loop ]
> +  %fi = sitofp i64 %i to double
> +  %vy = load double* %y
> +  %fipy = fadd double %fi, %vy
> +  %iipy = fptosi double %fipy to i64
> +  %s2 = add i64 %s1, %iipy
> +  %inc = add nsw i64 %i, 1
> +  %exitcond = icmp eq i64 %inc, 156250000
> +  br i1 %exitcond, label %ret, label %loop
> +ret:
> +  ret i64 %s2
> +}

I'm not exactly sure why, but this test fails on the atom buildbot. You can reproduce it with -mcpu=atom; it looks like the vxorps is missing there.

- Ben




