<div dir="ltr"><div><div>I want to focus on just the optimizer/backend part.<br><br>Why make this an x86-specific "feature" of the target? We already have options like this in LoopVectorize.cpp:<br><br>static cl::opt<unsigned> ForceTargetNumVectorRegs(<br>    "force-target-num-vector-regs", cl::init(0), cl::Hidden,<br>    cl::desc("A flag that overrides the target's number of vector registers."));<br><br></div>Can we add an equivalent target-independent override for vector width? Any target with >1 potential register width will benefit from having this option for experimentation.</div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Nov 2, 2017 at 4:44 PM, Craig Topper via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Reviews of the initial plumbing have been posted<div><br></div><div><a href="https://reviews.llvm.org/D39575" target="_blank">https://reviews.llvm.org/<wbr>D39575</a><br></div><div><a href="https://reviews.llvm.org/D39576" target="_blank">https://reviews.llvm.org/<wbr>D39576</a><br></div></div><div class="gmail_extra"><br clear="all"><div><div class="m_-1801520405651048902gmail_signature" data-smartmail="gmail_signature">~Craig</div></div>

<br><div class="gmail_quote">On Thu, Nov 2, 2017 at 4:57 AM, Tobias Grosser <span dir="ltr"><<a href="mailto:tobias.grosser@inf.ethz.ch" target="_blank">tobias.grosser@inf.ethz.ch</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Craig,<br>

<br>

this sounds like a good idea.<br>

<br>

Best,<br>

Tobias<br>

<div class="m_-1801520405651048902HOEnZb"><div class="m_-1801520405651048902h5"><br>

On Thu, Nov 2, 2017, at 00:35, Craig Topper via llvm-dev wrote:<br>

> Hello all,<br>

><br>

><br>

><br>

> I would like to propose adding the -mprefer-avx256 and -mprefer-avx128<br>

> command line flags supported by latest GCC to clang. These flags will be<br>

> used to limit the vector register size presented by TTI to the<br>

> vectorizers.<br>

> The backend will still be able to use wider registers for code written<br>

> using the intrinsics in x86intrin.h. And the backend will still be able<br>

> to<br>

> use AVX512VL instructions and the additional XMM16-31 and YMM16-31<br>

> registers.<br>

><br>

><br>

><br>

> Motivation:<br>

><br>

> -Using 512-bit operations on some Intel CPUs may cause a decrease in CPU<br>

> frequency that may offset the gains from using the wider register size.<br>

> See<br>

> section 15.26 of Intel® 64 and IA-32 Architectures Optimization Reference<br>

> Manual published October 2017.<br>

><br>

> -The vector ALUs on ports 0 and 1 of the Skylake Server microarchitecture<br>

> are only 256 bits wide. 512-bit instructions using these ALUs must use<br>

> both<br>

> ports. See section 2.1 of Intel® 64 and IA-32 Architectures Optimization<br>

> Reference Manual published October 2017.<br>

><br>

><br>

><br>

> Implementation Plan:<br>

><br>

> -Add prefer-avx256 and prefer-avx128 as SubtargetFeatures in X86.td not<br>

> mapped to any CPU.<br>

><br>

> -Add -mprefer-avx256 and -mprefer-avx128 and the corresponding<br>

> -mno-prefer-avx128/256 options to clang's driver Options.td file. I<br>

> believe<br>

> this will allow clang to pass these straight through to the<br>

> -target-feature<br>

> attribute in IR.<br>

><br>

> -Modify X86TTIImpl::getRegisterBitWidth to only return 512 if AVX512 is<br>

> enabled and neither prefer-avx256 nor prefer-avx128 is set. Similarly return<br>

> 256 if AVX is enabled and prefer-avx128 is not set.<br>
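The width selection described in the last item can be sketched as a standalone function. This is only an illustration of the stated rule, not the actual X86TTIImpl code: `SubtargetSketch`, its fields, and `registerBitWidthSketch` are hypothetical names, and the 128-bit fallback is an assumption that the plan above does not spell out.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant subtarget state. These field names
// are illustrative only and are not LLVM's actual X86Subtarget API.
struct SubtargetSketch {
  bool HasAVX = false;
  bool HasAVX512 = false;
  bool PreferAVX256 = false; // models the proposed prefer-avx256 feature
  bool PreferAVX128 = false; // models the proposed prefer-avx128 feature
};

// Vector register width (in bits) that TTI would report to the vectorizers.
inline unsigned registerBitWidthSketch(const SubtargetSketch &ST) {
  // 512 only if AVX512 is enabled and neither preference flag is set.
  if (ST.HasAVX512 && !ST.PreferAVX256 && !ST.PreferAVX128)
    return 512;
  // Assumption: AVX512 implies AVX, so a capped AVX512 target lands here.
  if ((ST.HasAVX || ST.HasAVX512) && !ST.PreferAVX128)
    return 256;
  // Assumption: 128-bit SSE baseline for everything else.
  return 128;
}

// Usage: an AVX512 target with prefer-avx256 set is capped at 256 bits.
inline void usageExample() {
  SubtargetSketch ST;
  ST.HasAVX512 = true;
  ST.PreferAVX256 = true;
  assert(registerBitWidthSketch(ST) == 256);
}
```

In this sketch prefer-avx128 takes precedence when both preferences are set, which matches the wording of the rule above.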

><br>

><br>

><br>

> There may be some other backend changes needed, but I plan to address<br>

> those<br>

> as we find them.<br>

><br>

><br>

> At a later point, consider making -mprefer-avx256 the default for Skylake<br>

> Server due to the above-mentioned performance considerations.<br>

><br>

><br>

><br>

> Does this sound reasonable?<br>

><br>

><br>

><br>

> *Latest Intel Optimization manual available here:<br>

> <a href="https://software.intel.com/en-us/articles/intel-sdm#optimization" rel="noreferrer" target="_blank">https://software.intel.com/en-<wbr>us/articles/intel-sdm#optimiza<wbr>tion</a><br>

><br>

><br>

> -Craig Topper<br>

</div></div><div class="m_-1801520405651048902HOEnZb"><div class="m_-1801520405651048902h5">> ______________________________<wbr>_________________<br>

> LLVM Developers mailing list<br>

> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

</div></div></blockquote></div><br></div>


<br></blockquote></div><br></div>