r371694 - [X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs.
Craig Topper via cfe-commits
cfe-commits at lists.llvm.org
Wed Sep 11 16:54:36 PDT 2019
Author: ctopper
Date: Wed Sep 11 16:54:36 2019
New Revision: 371694
URL: http://llvm.org/viewvc/llvm-project?rev=371694&view=rev
Log:
[X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs.
AVX512 instructions can cause a frequency drop on these CPUs. This
can negate the performance gains from using wider vectors. Enabling
prefer-vector-width=256 will prevent generation of zmm registers
unless explicit 512-bit operations are used in the original source
code.
I believe gcc and icc both do something similar to this by default.
Differential Revision: https://reviews.llvm.org/D67259
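To illustrate the distinction the log message draws, here is a minimal sketch (not part of the commit itself). The function names add_arrays and add16_explicit are hypothetical. The first function is a plain loop that the auto-vectorizer may widen; under the new default it would be expected to use at most 256-bit ymm registers when built with -march=skylake-avx512. The second uses explicit 512-bit intrinsics and is only compiled when AVX-512F is enabled; per the description above, such code should still use zmm registers.

```c
#include <stddef.h>
#ifdef __AVX512F__
#include <immintrin.h>
#endif

/* Plain scalar loop, eligible for auto-vectorization. With
   -march=skylake-avx512 and the default described above, the vectorizer
   is expected to prefer 256-bit (ymm) vectors here. Passing
   -mprefer-vector-width=512 asks it to consider 512-bit (zmm) vectors. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}

#ifdef __AVX512F__
/* Explicit 512-bit intrinsics: adds 16 floats at once. The preference
   setting is not expected to stop this from using zmm registers, since
   the 512-bit width is requested directly in the source. */
void add16_explicit(float *dst, const float *a, const float *b) {
    __m512 va = _mm512_loadu_ps(a);
    __m512 vb = _mm512_loadu_ps(b);
    _mm512_storeu_ps(dst, _mm512_add_ps(va, vb));
}
#endif
```

One way to compare the two settings, assuming the file is saved as example.c, is to inspect the generated assembly for ymm versus zmm register names:

    clang -O2 -march=skylake-avx512 -S example.c
    clang -O2 -march=skylake-avx512 -mprefer-vector-width=512 -S example.c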
Modified:
cfe/trunk/docs/ReleaseNotes.rst
Modified: cfe/trunk/docs/ReleaseNotes.rst
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=371694&r1=371693&r2=371694&view=diff
==============================================================================
--- cfe/trunk/docs/ReleaseNotes.rst (original)
+++ cfe/trunk/docs/ReleaseNotes.rst Wed Sep 11 16:54:36 2019
@@ -56,8 +56,12 @@ Improvements to Clang's diagnostics
Non-comprehensive list of changes in this release
-------------------------------------------------
-- ...
-
+- For X86 target, -march=skylake-avx512, -march=icelake-client,
+ -march=icelake-server, -march=cascadelake, -march=cooperlake will default to
+ not using 512-bit zmm registers in vectorized code unless 512-bit intrinsics
+ are used in the source code. 512-bit operations are known to cause the CPUs
+ to run at a lower frequency which can impact performance. This behavior can be
+ changed by passing -mprefer-vector-width=512 on the command line.
New Compiler Flags
------------------