[llvm] r259576 - Disable the vzeroupper insertion pass on PS4.

Sean Silva via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 2 19:04:10 PST 2016


We also discussed that, when upstreaming this, we should probably do it
for any btver2 target rather than restricting it to PS4.
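
A rough sketch of what that broader check could look like, written as a
standalone model rather than as a change to X86TargetMachine.cpp. The
TargetFacts struct, its fields, and shouldInsertVZeroUpper are hypothetical
names made up for illustration; in the real code the PS4 test is
TM->getTargetTriple().isPS4CPU(), and how the CPU name would be queried there
is left open.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the target properties the decision depends on.
// These names are illustrative only and are not part of LLVM.
struct TargetFacts {
  bool isPS4;       // models the result of Triple::isPS4CPU() in the patch
  std::string cpu;  // models the CPU name, e.g. "btver2" for Jaguar
};

// Returns true if the vzeroupper insertion pass should be scheduled.
// flagEnabled models the UseVZeroUpper option tested in addPreEmitPass().
bool shouldInsertVZeroUpper(bool flagEnabled, const TargetFacts &target) {
  // Assumption: every Jaguar target (PS4 or generic btver2) skips the pass,
  // which is the broader behaviour suggested above.
  if (target.isPS4 || target.cpu == "btver2")
    return false;
  return flagEnabled;
}

// Small self-check covering the three interesting cases.
void checkExamples() {
  assert(!shouldInsertVZeroUpper(true, TargetFacts{true, "btver2"}));
  assert(!shouldInsertVZeroUpper(true, TargetFacts{false, "btver2"}));
  assert(shouldInsertVZeroUpper(true, TargetFacts{false, "corei7-avx"}));
}
```

With this shape, a non-PS4 btver2 target gets the same treatment as PS4,
while other AVX targets keep following the flag.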

-- Sean Silva

On Tue, Feb 2, 2016 at 1:39 PM, Yunzhong Gao via llvm-commits <llvm-commits at lists.llvm.org> wrote:

> Author: ygao
> Date: Tue Feb  2 15:39:23 2016
> New Revision: 259576
>
> URL: http://llvm.org/viewvc/llvm-project?rev=259576&view=rev
> Log:
> Disable the vzeroupper insertion pass on PS4.
> See comments in test/CodeGen/X86/avx-vzeroupper.ll for more explanation.
>
> Original patch by: Sean Silva
>
>
> Modified:
>     llvm/trunk/lib/Target/X86/X86TargetMachine.cpp
>     llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll
>
> Modified: llvm/trunk/lib/Target/X86/X86TargetMachine.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetMachine.cpp?rev=259576&r1=259575&r2=259576&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86TargetMachine.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86TargetMachine.cpp Tue Feb  2 15:39:23 2016
> @@ -270,6 +270,9 @@ void X86PassConfig::addPreEmitPass() {
>    if (getOptLevel() != CodeGenOpt::None)
>      addPass(createExecutionDependencyFixPass(&X86::VR128RegClass));
>
> +  if (TM->getTargetTriple().isPS4CPU())
> +    UseVZeroUpper = false;
> +
>    if (UseVZeroUpper)
>      addPass(createX86IssueVZeroUpperPass());
>
>
> Modified: llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll?rev=259576&r1=259575&r2=259576&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll Tue Feb  2 15:39:23 2016
> @@ -1,4 +1,13 @@
>  ; RUN: llc < %s -x86-use-vzeroupper -mtriple=x86_64-apple-darwin -mattr=+avx | FileCheck %s
> +; RUN: llc < %s -mtriple=x86_64-scei-ps4 -mattr=+avx | FileCheck --check-prefix=PS4 %s
> +
> +; The Jaguar (AMD Family 16h) cores in the PS4 don't benefit from vzeroupper.
> +; At most, the benefit is "garbage collecting" def'd upper parts of the ymm
> +; registers, but the core has so many FP phys regs that this benefit of freeing
> +; up the upper parts is for now not worth it. Unlike Intel, there is no
> +; performance hazard to def'ing the lower parts of a ymm without clearing the
> +; upper part.
> +; PS4-NOT: vzeroupper
>
>  declare i32 @foo()
>  declare <4 x float> @do_sse(<4 x float>)
>
>