<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Tue, Nov 14, 2017 at 10:59 AM Craig Topper via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Erich Keane just brought up an extra complication with clang trying to detect -march=skylake-avx512 and set this new attribute. It's not just the command line that we need to worry about. We also need to support it when arch=skylake-avx512 appears in a target function attribute. I need to see if gcc supports prefer-avx128 in the target attribute too, because you might want to override this on a per-function basis. I'm not even sure I know how command line options and the target attribute interact today.</div><div class="gmail_extra"></div><div class="gmail_extra"><br clear="all"></div></blockquote><div><br></div><div>Relatedly, you'll need to handle this for LTO anyhow. Any particular function could have any set of subtarget features applied. That said, as far as I know, not every x86 command line option is supported there.</div><div><br></div><div>FWIW command line options just set target features and it works as an intersection with the function being the "last one wins".</div><div><br></div><div>-eric</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_extra"><div><div class="m_-5855807460524593739gmail_signature" data-smartmail="gmail_signature">~Craig</div></div></div><div class="gmail_extra">
<br><div class="gmail_quote">On Tue, Nov 14, 2017 at 10:10 AM, Sanjay Patel <span dir="ltr"><<a href="mailto:spatel@rotateright.com" target="_blank">spatel@rotateright.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>I haven't looked into actually implementing revectorization, so we may just want to ignore that possibility for now. <br><br>But I imagined that revectorization could hit the same problem that we're trying to avoid here: if the cost models say that wider vectors are legal and cheaper, but the reality is that perf will suffer when using those wider vectors, then we want to avoid using the wider ops. The user pref/override will be taken into account when deciding if we should go wider.<br><br></div>In either scenario, we're not actually removing or limiting vector widths, right? They're still legal as far as the ISA is concerned. We're just avoiding those ops because the programmer and/or the CPU model says we'll do better with narrower ops.<br><br></div><div class="m_-5855807460524593739HOEnZb"><div class="m_-5855807460524593739h5"><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Nov 14, 2017 at 10:26 AM, Craig Topper via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">For the re-vectorization case mentioned by Sanjay. That seems like a different type of limit than what's being proposed here. For revectorization you want to remove smaller vector widths. This is removing larger vector widths. I don't think we want the -mprefer-vector-width=256 being proposed here to say we can't do 128-bit vectors with the 256-bit. 
Maybe this should be called -mlimit-vector-width?<div><br></div><div>It's not clear to me why revectorization would need a preference at all. Shouldn't we be able to decide from the cost models? We go from scalar to vector today based on cost models. Why couldn't we go from vector to wider vector?</div></div><div class="gmail_extra"><br clear="all"><div><div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750gmail_signature" data-smartmail="gmail_signature">~Craig</div></div><div><div class="m_-5855807460524593739m_1852127268239199982h5">
<br><div class="gmail_quote">On Mon, Nov 13, 2017 at 3:54 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span>
<p><br>
</p>
<br>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067moz-cite-prefix">On 11/13/2017 05:49 PM, Eric
Christopher wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr"><br>
<br>
<div class="gmail_quote">
<div dir="ltr">On Mon, Nov 13, 2017 at 2:15 PM Craig Topper
via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Sat, Nov 11, 2017 at 8:52
PM, Hal Finkel via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"><span class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-">
<p><br>
</p>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527moz-cite-prefix">On
11/11/2017 09:52 PM, UE US via llvm-dev wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>If skylake is that bad at AVX2</div>
</div>
</blockquote>
<br>
</span> I don't think this says anything negative
about AVX2, but rather about AVX-512.</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>Right. I think we're at AVX/AVX2 is "bad" on
Haswell/Broadwell and AVX512 is "bad" on Skylake. At least
in the "random autovectorization spread out" aspect.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"><span class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-"><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div> it belongs in -mcpu / -march IMO. </div>
</div>
</blockquote>
<br>
</span> No. We'd still want to enable the
architectural features for vector intrinsics and
the like.</div>
</blockquote>
<div><br>
</div>
</div>
</div>
</div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>I took this to mean that the feature should be
enabled by default for -march=skylake-avx512.</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div><br>
</div>
<div>Agreed.</div>
</div>
</div>
</blockquote>
<br></span>
Yes. Also, GNOMETOYS clarified to me (off list) that this is what he
meant.<span class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750HOEnZb"><font color="#888888"><br>
<br>
-Hal</font></span><div><div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750h5"><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div><br>
</div>
<div>-eric</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"><span class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-"><br>
<br>
<blockquote type="cite">Based on the current
performance data we're seeing, we think we
need to ultimately default skylake-avx512 to
-mprefer-vector-width=256.</blockquote>
<br>
</span> Craig, is this for both integer and
floating-point code?</div>
</blockquote>
<div><br>
</div>
</div>
</div>
</div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>I believe so, but I'll try to get confirmation
from the people with more data.</div>
</div>
</div>
</div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"><span class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-HOEnZb"><font color="#888888"><br>
<br>
-Hal <br>
</font></span>
<div>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-h5"> <br>
<blockquote type="cite">
<div dir="ltr">
<div> Most people will build for the
standard x86_64-pc-linux or whatever
anyway, and completely ignore the
change. This will mainly affect those
who build their own software and
optimize for their system, and lots
there have probably caught on to this
already. I always thought that's what
-march was made for, really. <br>
</div>
</div>
<div class="gmail_extra"><br clear="all">
<div>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527gmail_signature">GNOMETOYS<br>
</div>
</div>
<br>
<div class="gmail_quote">On Sat, Nov 11,
2017 at 10:25 AM, Sanjay Patel via
llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div>
<div>Yes - I was thinking of
FeatureFastScalarFSQRT /
FeatureFastVectorFSQRT which are
used by isFsqrtCheap(). These
were added to override the
default x86 sqrt estimate
codegen with:<br>
<a href="https://reviews.llvm.org/D21379" target="_blank">https://reviews.llvm.org/D21379</a><br>
<br>
</div>
But I'm not sure we really need
that kind of hack. Can we adjust
the attribute in clang based on
the target CPU? I.e., if you have
something like:<br>
</div>
$ clang -O2 -march=skylake-avx512
foo.c<br>
<br>
Then you can detect that in the
clang driver and pass
-mprefer-vector-width=256 to clang
codegen as an option? Clang codegen
then adds that function attribute to
everything it outputs. Then, the
vectorizers and/or backend detect
that attribute and adjust their
behavior based on it. <br>
</div>
</blockquote>
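<div>As a rough illustration of the flow described above (a CPU-derived default that an explicit user option can still override), here is a minimal C++ sketch. It is not clang driver code: the helper names <code>defaultPreferredWidth</code> and <code>resolvePreferredWidth</code> are hypothetical, the value 0 is an assumed encoding for "no preference", and the only mapping taken from the thread is skylake-avx512 defaulting to 256.</div>

```cpp
#include <string>
#include <vector>

// Hypothetical helper: the vector width (in bits) a CPU prefers by default.
// Only the skylake-avx512 -> 256 mapping comes from the discussion;
// 0 is an assumed encoding for "no preference".
unsigned defaultPreferredWidth(const std::string &cpu) {
  return cpu == "skylake-avx512" ? 256u : 0u;
}

// Hypothetical resolution step: explicit widths are listed in command-line
// order, and the last explicit value wins. The CPU default is consulted
// only when the user gave no explicit value at all.
unsigned resolvePreferredWidth(const std::string &cpu,
                               const std::vector<unsigned> &explicitWidths) {
  if (!explicitWidths.empty())
    return explicitWidths.back();
  return defaultPreferredWidth(cpu);
}
```

<div>Resolving the default last, after all explicit options have been collected, is one way to get the ordering right so that the user can always override the CPU-derived value.</div>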
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
</div>
</div>
</div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Do we have a precedent for setting a target
independent flag from a target specific cpu string
in the clang driver? Want to make sure I understand
what the processing on such a thing would look like.
Particularly to get the order right so the user can
override it.<br>
</div>
</div>
</div>
</div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<div>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-h5">
<blockquote type="cite">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr"> <br>
So I don't think we should be
messing with any kind of type
legality checking because that stuff
should all be correct already. We're
just choosing a vector size based on
a pref. I think we should even allow
the pref to go bigger than a legal
type. This came up somewhere on
llvm-dev or in a bug recently in the
context of vector reductions.<br>
<br>
<br>
</div>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527HOEnZb">
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527h5">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri,
Nov 10, 2017 at 6:04 PM, Craig
Topper <span dir="ltr"><<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Are you
referring to
the X86TargetLowering::isFsqrtCheap
hook?</div>
<div class="gmail_extra"><br clear="all">
<div>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527m_6454106954572217318m_771050129279988374gmail_signature">~Craig</div>
</div>
<br>
<div class="gmail_quote">On
Fri, Nov 10, 2017 at
7:39 AM, Sanjay Patel <span dir="ltr"><<a href="mailto:spatel@rotateright.com" target="_blank">spatel@rotateright.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">We can
tie a user
preference /
override to a CPU
model. We do
something like that
for square root
estimates already
(although it does
use a
SubtargetFeature
currently for x86;
ideally, we'd key
that off of
something in the CPU
scheduler model).
<div>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527m_6454106954572217318m_771050129279988374h5"><br>
<div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On
Thu, Nov 9,
2017 at 4:21
PM, Craig
Topper <span dir="ltr"><<a href="mailto:craig.topper@gmail.com" target="_blank">craig.topper@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">I
agree that a
less x86
specific
command line
makes sense.
I've been
having internal
discussions
with gcc folks,
and they're
evaluating
switching to
something like
-mprefer-vector-width=128/256/512/none
<div><br>
</div>
<div>Based on
the current
performance
data we're
seeing, we
think we need
to ultimately
default
skylake-avx512
to
-mprefer-vector-width=256.
If we go with
a target
independent
option/implementation
is there
some way we
could still
affect the
default
behavior in a
target
specific way?</div>
</div>
<div class="gmail_extra"><br clear="all">
<div>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527m_6454106954572217318m_771050129279988374m_4887027107317541871m_-9050519988835790991gmail_signature">~Craig</div>
</div>
<br>
<div class="gmail_quote">On
Tue, Nov 7,
2017 at 9:06
AM, Sanjay
Patel <span dir="ltr"><<a href="mailto:spatel@rotateright.com" target="_blank">spatel@rotateright.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div>It's
clear from the
Intel docs how
this has
evolved, but
from a
compiler
perspective,
this isn't a
Skylake
"feature" :)
... nor an
Intel feature,
nor an x86
feature. <br>
<br>
It's a generic
programmer
hint for any
target with
multiple
potential
vector
lengths. <br>
</div>
<div><br>
</div>
<div>On x86,
there's
already a
potential use
case for this
hint with a
different
starting
motivation:
re-vectorization.
That's where
we take C code
that uses
128-bit vector
intrinsics and
selectively
widen it to
256- or
512-bit vector
ops based on a
newer CPU
target than
the code was
originally
written for.<br>
<div><br>
</div>
<div>I think
it's just a
matter of time
before a
customer
requests the
same ability
for another
target (maybe
they already
have and I
don't know
about it). So
we should have
a solution
that
recognizes
that
possibility. <br>
</div>
<div><br>
</div>
</div>
Note that
having a
target-independent
implementation
in the
optimizer
doesn't
preclude a
flag alias in
clang to
maintain
compatibility
with gcc.
<div>
<div class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527m_6454106954572217318m_771050129279988374m_4887027107317541871m_-9050519988835790991h5"><br>
<div><br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On
Tue, Nov 7,
2017 at 2:02
AM, Tobias
Grosser via
llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On
Fri, Nov 3,
2017, at
05:47, Craig
Topper via
llvm-dev
wrote:<br>
> That's a
very good
point about
the ordering
of the command
line options.<br>
> gcc's
current
implementation
treats
-mprefer-avx256
as "prefer
256 over<br>
> 512" and
-mprefer-avx128 as "prefer 128 over 256". Which feels weird for<br>
> other
reasons, but
has less of an
ordering
ambiguity.<br>
><br>
>
-mprefer-avx128
has been in
gcc for many
years and
predates the
creation<br>
> of<br>
> avx512.
-mprefer-avx256
was added a
couple months
ago.<br>
><br>
> We've had
an internal
conversation
with the
implementor of<br>
>
-mprefer-avx256<br>
> in gcc
about making
-mprefer-avx128
affect 512-bit
vectors as
well. I'll<br>
> bring up
the ambiguity
issue with
them.<br>
><br>
> Do we
want to be
compatible
with gcc here?<br>
<br>
I certainly
believe we
would want to
be compatible
with gcc (if
we use<br>
the same
names).<br>
<br>
Best,<br>
Tobias<br>
<br>
><br>
> ~Craig<br>
><br>
> On Thu,
Nov 2, 2017 at
7:18 PM, Eric
Christopher
<<a href="mailto:echristo@gmail.com" target="_blank">echristo@gmail.com</a>><br>
> wrote:<br>
><br>
> ><br>
> ><br>
> > On
Thu, Nov 2,
2017 at 7:05
PM James Y
Knight via
llvm-dev <<br>
> > <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
wrote:<br>
> ><br>
> >>
On Wed, Nov 1,
2017 at 7:35
PM, Craig
Topper via
llvm-dev <<br>
> >>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
wrote:<br>
> >><br>
>
>>>
Hello all,<br>
>
>>><br>
>
>>><br>
>
>>><br>
>
>>> I
would like to
propose adding
the
-mprefer-avx256
and
-mprefer-avx128<br>
>
>>>
command line
flags
supported by
latest GCC to
clang. These
flags will be<br>
>
>>>
used to limit
the vector
register size
presented by
TTI to the
vectorizers.<br>
>
>>>
The backend
will still be
able to use
wider
registers for
code written<br>
>
>>>
using the
intrinsics in
x86intrin.h.
And the
backend will
still be able
to<br>
>
>>>
use AVX512VL
instructions
and the
additional
XMM16-31 and
YMM16-31<br>
>
>>>
registers.<br>
>
>>><br>
>
>>><br>
>
>>><br>
>
>>>
Motivation:<br>
>
>>><br>
>
>>>
-Using 512-bit
operations on
some Intel
CPUs may cause
a decrease in
CPU<br>
>
>>>
frequency that
may offset the
gains from
using the
wider register
size. See<br>
>
>>>
section 15.26
of Intel® 64
and IA-32
Architectures
Optimization
Reference<br>
>
>>>
Manual
published
October 2017.<br>
>
>>><br>
> >><br>
> >>
I note the doc
mentions that
256-bit AVX
operations
also have the
same<br>
> >>
issue with
reducing the
CPU frequency,
which is nice
to see
documented!<br>
> >><br>
> >>
There's also
the issues
discussed here
<<a href="http://www.agner.org/" rel="noreferrer" target="_blank">http://www.agner.org/</a><br>
> >>
optimize/blog/read.php?i=165> (and elsewhere) related to warm-up time<br>
> >>
for the
256-bit
execution
pipeline,
which is
another issue
with using<br>
> >>
wide-vector
ops.<br>
> >><br>
> >><br>
> >>
-The vector
ALUs on ports
0 and 1 of the
Skylake Server
microarchitecture<br>
>
>>>
are only
256-bits wide.
512-bit
instructions
using these
ALUs must use
both<br>
>
>>>
ports. See
section 2.1 of
Intel® 64 and
IA-32
Architectures
Optimization<br>
>
>>>
Reference
Manual
published
October 2017.<br>
>
>>><br>
> >><br>
> >><br>
>
>>>
Implementation
Plan:<br>
>
>>><br>
>
>>>
-Add
prefer-avx256
and
prefer-avx128
as
SubtargetFeatures
in X86.td not<br>
>
>>>
mapped to any
CPU.<br>
>
>>><br>
>
>>>
-Add
mprefer-avx256
and
mprefer-avx128
and the
corresponding<br>
>
>>>
-mno-prefer-avx128/256
options to
clang's driver
Options.td
file. I
believe<br>
>
>>>
this will
allow clang to
pass these
straight
through to the
-target-feature<br>
>
>>>
attribute in
IR.<br>
>
>>><br>
>
>>>
-Modify
X86TTIImpl::getRegisterBitWidth
to only return
512 if AVX512
is<br>
>
>>>
enabled and
prefer-avx256
and
prefer-avx128
are not set.
Similarly
return<br>
>
>>>
256 if AVX is
enabled and
prefer-avx128
is not set.<br>
>
>>><br>
> >><br>
> >>
Instead of
multiple flags
that have
difficult to
understand
intersecting<br>
> >>
behavior, one
flag with a
value would be
better. E.g.,
what should<br>
> >>
"-mprefer-avx256 -mprefer-avx128 -mno-prefer-avx256" do? No matter the<br>
> >>
answer, it's
confusing.
(Similarly
with other
such
combinations).
Just a<br>
> >>
single arg
"-mprefer-avx={128/256/512}"
(with no "no"
version) seems
easier<br>
> >>
to understand
to me (keeping
the same
behavior as
you mention:
asking to<br>
> >>
prefer a
larger width
than is
supported by
your
architecture
should be fine<br>
> >>
but ignored).<br>
> >><br>
> >><br>
> > I
agree with
this. It's a
little more
plumbing as
far as
subtarget<br>
> >
features etc
(represent via
an optional
value or just
various "set
the avx<br>
> >
width"
features - the
latter being
easier, but
uglier),
however, it's<br>
> >
probably the
right thing to
do.<br>
> ><br>
> > I
was looking at
this myself
just a couple
weeks ago and
think this is
the<br>
> >
right
direction
(when and how
to turn things
off) - and
probably makes<br>
> >
sense to be a
default for
these
architectures?
We might end
up needing to<br>
> >
check a couple
of additional
TTI places,
but it sounds
like you're on
top<br>
> > of
it. :)<br>
> ><br>
> >
Thanks very
much for doing
this work.<br>
> ><br>
> >
-eric<br>
> ><br>
> ><br>
> >><br>
> >><br>
> >>
There may be
some other
backend
changes
needed, but I
plan to
address<br>
>
>>>
those as we
find them.<br>
>
>>><br>
>
>>><br>
>
>>>
At a later
point,
consider
making
-mprefer-avx256
the default
for<br>
>
>>>
Skylake Server
due to the
above
mentioned
performance
considerations.<br>
>
>>><br>
> >><br>
> >><br>
> >><br>
> >><br>
> >><br>
>
>>><br>
> >>
Does this
sound
reasonable?<br>
>
>>><br>
>
>>><br>
>
>>><br>
>
>>>
*Latest Intel
Optimization
manual
available
here:<br>
>
>>> <a href="https://software.intel.com/en-us/articles/intel-sdm#optimization" rel="noreferrer" target="_blank">https://software.intel.com/en-us/articles/intel-sdm#optimization</a><br>
>
>>><br>
>
>>><br>
>
>>>
-Craig Topper<br>
>
>>><br>
>
>>>
_______________________________________________<br>
>
>>>
LLVM
Developers
mailing list<br>
>
>>> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
>
>>> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
>
>>><br>
> >><br>
> ><br>
</blockquote>
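<div>The getRegisterBitWidth change in the plan quoted above, combined with the single-valued preference suggested in the reply, could look roughly like the following C++ sketch. This is an assumption-laden illustration, not the actual X86TTIImpl code: <code>SubtargetSketch</code>, <code>hardwareWidth</code> and <code>registerBitWidth</code> are hypothetical names, and a preferred width of 0 is an assumed encoding for "no preference".</div>

```cpp
#include <algorithm>

// Hypothetical stand-in for the subtarget state; not LLVM's real API.
struct SubtargetSketch {
  bool hasAVX512;
  bool hasAVX;
  unsigned preferredWidth; // in bits; 0 is assumed to mean "no preference"
};

// Widest vector register the ISA offers, following the plan in the thread:
// 512 with AVX512, 256 with AVX, otherwise 128.
unsigned hardwareWidth(const SubtargetSketch &st) {
  if (st.hasAVX512)
    return 512;
  if (st.hasAVX)
    return 256;
  return 128;
}

// Width reported to the vectorizers: the preference can only narrow the
// hardware width. Asking for more than the hardware supports is accepted
// but has no effect, as suggested in the discussion.
unsigned registerBitWidth(const SubtargetSketch &st) {
  unsigned hw = hardwareWidth(st);
  if (st.preferredWidth == 0)
    return hw;
  return std::min(hw, st.preferredWidth);
}
```

<div>Because the preference is a single value, there is no question of how several prefer/no-prefer flags intersect; code written with intrinsics is unaffected since only the width presented to the vectorizers changes.</div>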
</div>
<br>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
</div>
<br>
<fieldset class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527mimeAttachmentHeader"></fieldset>
<br>
</blockquote>
<br>
</div>
</div>
<span class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-">
<pre class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067m_-2096253803562932609gmail-m_264012946301939527moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</span></div>
<br>
<br>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
<br>
<pre class="m_-5855807460524593739m_1852127268239199982m_-5672491955778672750m_-938083871188661067moz-signature" cols="72">--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory</pre>
</div></div></div>
</blockquote></div><br></div></div></div>
<br></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</blockquote></div></div>