[llvm-dev] RFC: [X86] Introducing command line options to prefer narrower vector instructions even when wider instructions are available
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 1 17:32:22 PDT 2017
On 11/01/2017 06:35 PM, Craig Topper via llvm-dev wrote:
> Hello all,
> I would like to propose adding the -mprefer-avx256 and -mprefer-avx128
> command line flags supported by latest GCC to clang. These flags will
> be used to limit the vector register size presented by TTI to the
> vectorizers. The backend will still be able to use wider registers for
> code written using the intrinsics in x86intrin.h. And the backend
> will still be able to use AVX512VL instructions and the additional
> XMM16-31 and YMM16-31 registers.
> -Using 512-bit operations on some Intel CPUs may cause a decrease in
> CPU frequency that may offset the gains from using the wider register
> size. See section 15.26 of Intel® 64 and IA-32 Architectures
> Optimization Reference Manual published October 2017.
I'd certainly like to see these options (especially for this reason).
> -The vector ALUs on ports 0 and 1 of the Skylake Server
> microarchitecture are only 256-bits wide. 512-bit instructions using
> these ALUs must use both ports. See section 2.1 of Intel® 64 and IA-32
> Architectures Optimization Reference Manual published October 2017.
> Implementation Plan:
> -Add prefer-avx256 and prefer-avx128 as SubtargetFeatures in X86.td
> not mapped to any CPU.
> -Add mprefer-avx256 and mprefer-avx128 and the corresponding
> -mno-prefer-avx128/256 options to clang's driver Options.td file. I
> believe this will allow clang to pass these straight through to the
> -target-feature attribute in IR.
> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if AVX512
> is enabled and neither prefer-avx256 nor prefer-avx128 is set. Similarly
> return 256 if AVX is enabled and prefer-avx128 is not set.
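For illustration, the width selection described in the last bullet might look roughly like the sketch below. This is not the actual X86TTIImpl code: SubtargetSketch and vectorRegisterBitWidth are hypothetical names invented for the example, and the 128-bit fallback assumes an SSE baseline, which the proposal does not spell out.

```cpp
// Hypothetical stand-in for the subtarget. The field names mirror the
// proposed feature names but are NOT the real X86Subtarget interface.
struct SubtargetSketch {
  bool HasAVX512 = false;
  bool HasAVX = false;       // set explicitly here, even when HasAVX512 is set
  bool PreferAVX256 = false; // proposed prefer-avx256 feature
  bool PreferAVX128 = false; // proposed prefer-avx128 feature
};

// Vector register width in bits that would be reported to the vectorizers.
inline unsigned vectorRegisterBitWidth(const SubtargetSketch &ST) {
  // 512 only if AVX512 is enabled and neither preference is set.
  if (ST.HasAVX512 && !ST.PreferAVX256 && !ST.PreferAVX128)
    return 512;
  // 256 if AVX is enabled and prefer-avx128 is not set.
  if (ST.HasAVX && !ST.PreferAVX128)
    return 256;
  // Assumption: fall back to 128-bit (SSE) vectors otherwise.
  return 128;
}
```

With this shape, setting prefer-avx128 caps the reported width at 128 regardless of prefer-avx256, while the backend remains free to select wider instructions for intrinsics code.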
> There may be some other backend changes needed, but I plan to address
> those as we find them.
> At a later point, consider making -mprefer-avx256 the default for
> Skylake Server due to the above mentioned performance considerations.
> Does this sound reasonable?
> *Latest Intel Optimization manual available here:
> -Craig Topper
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory