[PATCH] D67259: [X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later Intel CPUs.

Wed Sep 11 16:53:01 PDT 2019

This revision was automatically updated to reflect the committed changes.
Closed by commit rL371694: [X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and later… (authored by ctopper, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D67259?vs=219735&id=219832#toc

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67259/new/

https://reviews.llvm.org/D67259

Files:
  cfe/trunk/docs/ReleaseNotes.rst
  llvm/trunk/docs/ReleaseNotes.rst
  llvm/trunk/lib/Target/X86/X86.td
  llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll


Index: llvm/trunk/docs/ReleaseNotes.rst
===================================================================

--- llvm/trunk/docs/ReleaseNotes.rst
+++ llvm/trunk/docs/ReleaseNotes.rst
@@ -96,6 +96,10 @@
   be passed in ZMM registers for calls and returns. Previously they were passed
   in two YMM registers. Old behavior can be enabled by passing
   -x86-enable-old-knl-abi
+* -mprefer-vector-width=256 is now the default behavior skylake-avx512 and later
+  Intel CPUs. This tries to limit the use of 512-bit registers which can cause a
+  decrease in CPU frequency on these CPUs. This can be re-enabled by passing
+  -mprefer-vector-width=512 to clang or passing -mattr=-prefer-256-bit to llc.
 
 Changes to the AMDGPU Target
 -----------------------------
Index: llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll
===================================================================
--- llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll
+++ llvm/trunk/test/CodeGen/X86/min-legal-vector-width.ll
@@ -1,6 +1,14 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=fast-variable-shuffle,avx512vl,avx512bw,avx512dq,prefer-256-bit | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=fast-variable-shuffle,avx512vl,avx512bw,avx512dq,prefer-256-bit,avx512vbmi | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; Make sure CPUs default to prefer-256-bit. avx512vnni isn't interesting as it just adds an isel peephole for vpmaddwd+vpaddd
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=skylake-avx512 | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=cascadelake | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=cooperlake | FileCheck %s --check-prefixes=CHECK,CHECK-AVX512
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mcpu=cannonlake | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=icelake-client | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=icelake-server | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-avx512vnni -mcpu=tigerlake | FileCheck %s --check-prefixes=CHECK,CHECK-VBMI
 
 ; This file primarily contains tests for specific places in X86ISelLowering.cpp that needed be made aware of the legalizer not allowing 512-bit vectors due to prefer-256-bit even though AVX512 is enabled.
 
Index: llvm/trunk/lib/Target/X86/X86.td
===================================================================
--- llvm/trunk/lib/Target/X86/X86.td
+++ llvm/trunk/lib/Target/X86/X86.td
@@ -601,6 +601,7 @@
 
   // Skylake-AVX512
   list<SubtargetFeature> SKXAdditionalFeatures = [FeatureAVX512,
+                                                  FeaturePrefer256Bit,
                                                   FeatureCDI,
                                                   FeatureDQI,
                                                   FeatureBWI,
@@ -634,6 +635,7 @@
 
   // Cannonlake
   list<SubtargetFeature> CNLAdditionalFeatures = [FeatureAVX512,
+                                                  FeaturePrefer256Bit,
                                                   FeatureCDI,
                                                   FeatureDQI,
                                                   FeatureBWI,
Index: cfe/trunk/docs/ReleaseNotes.rst
===================================================================
--- cfe/trunk/docs/ReleaseNotes.rst
+++ cfe/trunk/docs/ReleaseNotes.rst
@@ -56,8 +56,12 @@
 Non-comprehensive list of changes in this release
 -------------------------------------------------
 
-- ...
-
+- For X86 target, -march=skylake-avx512, -march=icelake-client,
+  -march=icelake-server, -march=cascadelake, -march=cooperlake will default to
+  not using 512-bit zmm registers in vectorized code unless 512-bit intrinsics
+  are used in the source code. 512-bit operations are known to cause the CPUs
+  to run at a lower frequency which can impact performance. This behavior can be
+  changed by passing -mprefer-vector-width=512 on the command line.
 
 New Compiler Flags
 ------------------


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67259.219832.patch
Type: text/x-patch
Size: 4431 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190911/6972473a/attachment.bin>