[llvm] r259615 - Revert r259576: Disable the vzeroupper insertion pass on PS4.

Yunzhong Gao via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 2 17:25:13 PST 2016


Author: ygao
Date: Tue Feb  2 19:25:12 2016
New Revision: 259615

URL: http://llvm.org/viewvc/llvm-project?rev=259615&view=rev
Log:
Revert r259576: Disable the vzeroupper insertion pass on PS4.
Will re-implement based on review feedback.


Modified:
    llvm/trunk/lib/Target/X86/X86TargetMachine.cpp
    llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll

Modified: llvm/trunk/lib/Target/X86/X86TargetMachine.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86TargetMachine.cpp?rev=259615&r1=259614&r2=259615&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86TargetMachine.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86TargetMachine.cpp Tue Feb  2 19:25:12 2016
@@ -270,9 +270,6 @@ void X86PassConfig::addPreEmitPass() {
   if (getOptLevel() != CodeGenOpt::None)
     addPass(createExecutionDependencyFixPass(&X86::VR128RegClass));
 
-  if (TM->getTargetTriple().isPS4CPU())
-    UseVZeroUpper = false;
-
   if (UseVZeroUpper)
     addPass(createX86IssueVZeroUpperPass());
 

Modified: llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll?rev=259615&r1=259614&r2=259615&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx-vzeroupper.ll Tue Feb  2 19:25:12 2016
@@ -1,13 +1,4 @@
 ; RUN: llc < %s -x86-use-vzeroupper -mtriple=x86_64-apple-darwin -mattr=+avx | FileCheck %s
-; RUN: llc < %s -mtriple=x86_64-scei-ps4 -mattr=+avx | FileCheck --check-prefix=PS4 %s
-
-; The Jaguar (AMD Family 16h) cores in the PS4 don't benefit from vzeroupper.
-; At most, the benefit is "garbage collecting" def'd upper parts of the ymm
-; registers, but the core has so many FP phys regs that this benefit of freeing
-; up the upper parts is for now not worth it. Unlike Intel, there is no
-; performance hazard to def'ing the lower parts of a ymm without clearing the
-; upper part.
-; PS4-NOT: vzeroupper
 
 declare i32 @foo()
 declare <4 x float> @do_sse(<4 x float>)



