r371694 - [X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs.

Wed Sep 11 16:54:36 PDT 2019

Author: ctopper
Date: Wed Sep 11 16:54:36 2019
New Revision: 371694

URL: http://llvm.org/viewvc/llvm-project?rev=371694&view=rev
Log:
[X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs.

512-bit AVX512 instructions can cause a frequency drop on these CPUs,
which can negate the performance gains from using wider vectors.
Enabling prefer-vector-width=256 prevents the compiler from using zmm
registers unless explicit 512-bit operations are used in the original
source code.
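
As a rough illustration of the effect (a sketch, not part of this commit),
consider a plain loop with no intrinsics. The function name `saxpy` and the
file name `saxpy.c` are hypothetical. Comparing the assembly produced with
and without -mprefer-vector-width=512 should show whether the vectorizer
chose ymm or zmm registers; the exact output depends on the compiler version
and its cost model.

```c
#include <stddef.h>

/*
 * Plain scalar loop with no intrinsics, so it is a candidate for
 * auto-vectorization. Hypothetical example; compile and inspect with:
 *
 *   clang -O2 -march=skylake-avx512 -S saxpy.c
 *       (with this change, expected to prefer 256-bit ymm registers)
 *
 *   clang -O2 -march=skylake-avx512 -mprefer-vector-width=512 -S saxpy.c
 *       (asks the vectorizer to prefer 512-bit zmm registers again)
 */
void saxpy(float a, const float *restrict x, float *restrict y, size_t n) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

The preferred width only affects code the compiler vectorizes on its own.
Source that uses 512-bit intrinsics explicitly still gets zmm registers, as
the log message notes.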

I believe gcc and icc both do something similar to this by default.

Differential Revision: https://reviews.llvm.org/D67259

Modified:
    cfe/trunk/docs/ReleaseNotes.rst

Modified: cfe/trunk/docs/ReleaseNotes.rst
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=371694&r1=371693&r2=371694&view=diff
==============================================================================

--- cfe/trunk/docs/ReleaseNotes.rst (original)
+++ cfe/trunk/docs/ReleaseNotes.rst Wed Sep 11 16:54:36 2019
@@ -56,8 +56,12 @@ Improvements to Clang's diagnostics
 Non-comprehensive list of changes in this release
 -------------------------------------------------
 
-- ...
-
+- For X86 target, -march=skylake-avx512, -march=icelake-client,
+  -march=icelake-server, -march=cascadelake, -march=cooperlake will default to
+  not using 512-bit zmm registers in vectorized code unless 512-bit intrinsics
+  are used in the source code. 512-bit operations are known to cause the CPUs
+  to run at a lower frequency which can impact performance. This behavior can be
+  changed by passing -mprefer-vector-width=512 on the command line.
 
 New Compiler Flags
 ------------------