<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, May 16, 2016 at 2:30 PM, Eric Christopher <span dir="ltr"><<a href="mailto:echristo@gmail.com">echristo@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><br><div class="gmail_quote"><span><div dir="ltr">On Mon, May 16, 2016 at 10:02 AM Nico Weber <<a href="mailto:thakis@chromium.org">thakis@chromium.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, May 16, 2016 at 12:48 PM, Eric Christopher <span dir="ltr"><<a href="mailto:echristo@gmail.com">echristo@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><br><div class="gmail_quote"><span><div dir="ltr">On Mon, May 16, 2016 at 9:43 AM Nico Weber via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr">> <span style="font-size:12.8px">Sorry if this is a stupid question, but do the windows intrinsic headers actually contain the same contents as clang's?</span><div><span style="font-size:12.8px"><br></span></div></div><div dir="ltr"><div><span style="font-size:12.8px">As far as I can tell (from looking at <a href="https://msdn.microsoft.com/en-us/library/hh977023.aspx">https://msdn.microsoft.com/en-us/library/hh977023.aspx</a> and comparing to clang's headers), yes. 
MSVC doesn't have the avx512 intrinsics yet, but that's probably only because they're new.</span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">I had hoped that I could not include all of x86intrin.h in intrin.h, but that page says "The intrin.h header includes both immintrin.h and ammintrin.h for simplicity."</span></div></div><div dir="ltr"><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">> This old discussion may cover some of this as well?</span></div><div><span style="font-size:12.8px"><br></span></div></div><div dir="ltr"><div><span style="font-size:12.8px">Ah thanks, yes, sounds like there are reasons for not putting the includes back behind ifdefs. The thread doesn't really mention the reasons, and since clang doesn't implement full multiversioning yet I'm unable to guess at the reasons -- but it sounds like people don't want to re-add the arch ifdefs. Ok, I'll send a patch to add them back ifdef _MSC_VER only -- there should be no drawback to that, and it stops the bleeding in the case where it's worst (with Microsoft headers).</span></div><div><span style="font-size:12.8px"><br></span></div></div></blockquote><div><br></div></span><div>It implements enough multiversioning to make it worthwhile to have them, I don't know what you're confused about here.</div></div></div></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>I didn't mean to question this point, I just don't understand it. Can you give an example where it's useful? 
I'm sure there is one, I just can't think of one.</div></div></div></div></blockquote><div><br></div></span><div>Sure, the programmer has to write their own dispatch, but it'll allow you to include differently target-optimized versions of the same function in the same file.</div><div><br></div><div>The equivalent Linux side of things is:</div><div><br></div><div>void my_avx_function() __attribute__((__target__("avx")));</div><div>void my_nonavx_function();</div><div><br></div><div>...</div><div><br></div><div>if (__builtin_cpu_supports("avx"))</div><div> my_avx_function();</div><div>else</div><div> my_nonavx_function();</div><div><br></div><div>and you can keep both implementations in the same file and don't have to worry about things like command line options being different and causing all sorts of havoc.</div></div></div></blockquote><div><br></div><div>Ah, thanks, I didn't know that worked :-) (I played with it a bit and found PR27779)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><span><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div> Why do we want to make this platform specific? 
If you wanted to match MSVC I guess you could just turn them off for Windows as an alternate solution?</div></div></div></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Yes, that's what I meant with the _MSC_VER check.</div></div></div></div></blockquote><div><br></div></span><div>I meant just turn off the avx512 headers, not sure if that's what you meant. We should probably look at the lexing and parsing code to see what can be sped up here.</div></div></div></blockquote><div><br></div><div>I turned it off for all headers for now in r269675. I agree that we should look at lexing and parsing speed, and also consider things like tablegen'ing intrinsics. Until then, making most compiles faster seems like a better tradeoff on Windows.</div><div><br></div><div>As said above, I'm curious to hear from the people working on avx512 :-)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><span><font color="#888888"><div><br></div><div>-eric</div></font></span><div><div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><span><font color="#888888"><div><br></div><div>-eric</div></font></span><div><div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><span style="font-size:12.8px"></span></div><div><span 
style="font-size:12.8px">Going forward, we'll have to teach clang more about at least some intrinsics for `#pragma intrin` (PR19898), which might end up helping for this too.</span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">I also reached out to STL at Microsoft, he said he'll try to look into including an "intrin0.h" header in the next major version of MSVC which would only declare a small set of intrinsics instead of all of them (no promises, of course).</span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">People working on avx512, I'd be curious to hear your perspective on this, as well as your reply to Chandler's points.</span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">Thanks,</span></div><div><span style="font-size:12.8px">Nico</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, May 14, 2016 at 2:04 AM, Chandler Carruth via cfe-dev <span dir="ltr"><<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr">A couple of points:<div><br></div><div>1) Definitely agree with Hal that these intrinsics really shouldn't be mapping to builtins. This is something I'm pretty frustrated about the direction of AVX-512 support in Clang and LLVM. We really need generic vector IR to lower cleanly into these instructions.</div><div><br></div><div>2) Reid, you specifically advocated for not having the set of intrinsics available based on particular feature sets. ;] But I agree there seems to be a scalability problem here.</div><div><br></div><div>3) I think a lot of the scalability problem is that very basic, non-vector code patterns, require Intrin.h on Windows and pull in *ALL* the vector intrinsics. 
=/ It'd be really good to try to fix *that*.</div><div><br></div><div>4) AVX-512 has made this *incredibly* worse than any previous ISA extension. It used to be we had the product of (operation * operand-type) intrinsics. This is already pretty bad. Now we have (operation * operand-type * 4) because we have 4 masking variants. So it seems Intel has just made a really unfortunate API choice by forcing every permutation of these things to get a different name and thus a different intrinsic in a header file. =/ And sadly that too is probably too late to walk back.</div><div><br></div><div><br></div><div>I wonder if we could at least initially address this by providing very limited "builtin" modules for truly builtin headers that don't touch any system headers, and actually *always* use the modules approach for these headers, right out of the box.</div></div><div><div><br><div class="gmail_quote"><div dir="ltr">On Fri, May 13, 2016 at 7:32 PM C Bergström <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">This old discussion may cover some of this as well? I also thought I<br>
remembered something more recent around this.<br>
<a href="http://clang-developers.42468.n3.nabble.com/PROPOSAL-Reintroduce-guards-for-Intel-intrinsic-headers-td4046979.html" rel="noreferrer">http://clang-developers.42468.n3.nabble.com/PROPOSAL-Reintroduce-guards-for-Intel-intrinsic-headers-td4046979.html</a><br>
<br>
On Sat, May 14, 2016 at 8:59 AM, Sean Silva via cfe-dev<br>
<<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br>
> Sorry if this is a stupid question, but do the windows intrinsic headers<br>
> actually contain the same contents as clang's? (e.g. maybe the windows ones<br>
> don't cover all the ISA's that clang's do).<br>
><br>
> -- Sean Silva<br>
><br>
> On Thu, May 12, 2016 at 9:16 AM, Nico Weber via cfe-dev<br>
> <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br>
>><br>
>> Hi,<br>
>><br>
>> On Windows, C++ system headers such as <string> end up pulling in<br>
>> intrin.h. clang's intrinsic headers are very large.<br>
>><br>
>> If you take a cc file containing just `#include <string>` and run that<br>
>> through the preprocessor with `cl /P test.cc` and `clang-cl /P test.cc`, the<br>
>> test.I file generated by clang-cl is 1.7MB while the one created by cl.exe<br>
>> is 0.7MB. This is solely due to clang's intrin.h expanding to way more<br>
>> stuff.<br>
>><br>
>> The biggest offenders are avx512vlintrin.h, avx512fintrin.h,<br>
>> avx512vlbwintrin.h which add up to 657kB already. Before r239883, we only<br>
>> included the avx512 headers if __AVX512F__ etc. was defined. This is currently never<br>
>> the case in practice. Later (r243394, r243402, r243406 and more), the avx512<br>
>> headers got much bigger.<br>
>><br>
>> Parsing all this code takes time -- removing the avx512 includes from<br>
>> immintrin.h locally makes compiling a file containing just the <string><br>
>> header 0.25s faster (!), and building all of v8 gets 6% faster, just from<br>
>> not including the avx512 headers.<br>
>><br>
>> What can we do about this? Since avx512 is new, maybe they could be not<br>
>> part of immintrin.h? Or we could re-introduce<br>
>><br>
>> #if !__has_feature(modules) && defined(__AVX512BW__)<br>
>><br>
>> include guards in immintrin.h. This would give us a speed win immediately<br>
>> without drawbacks as far as I can see, but in a few years when people start<br>
>> compiling with /arch:avx512 that'd go away again. (Then again, by then,<br>
>> modules are hopefully commonly available. cl.exe doesn't have an<br>
>> /arch:avx512 switch yet, so this is probably several years away from<br>
>> happening.)<br>
>><br>
>> Comments? Is it feasible to require that people who want to use avx512<br>
>> include a new header instead of immintrin.h? Else, does anyone have a better<br>
>> idea other than reintroducing the #ifdefs, augmented with the module check?<br>
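A minimal sketch, in C, of how such a guard would behave. The condition is copied verbatim from the proposal above. The `__has_feature` fallback, the `PULLS_IN_AVX512BW` macro and the `pulls_in_avx512bw` function are hypothetical names used only for illustration; they stand in for the real `#include` of an AVX-512 header inside immintrin.h:<br>

```c
#include <stdio.h>

/* __has_feature is a Clang extension. Assumption: on compilers that lack it,
   treat every feature (including modules) as unavailable. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Hypothetical stand-in for the guarded `#include <avx512bwintrin.h>`:
   the macro records whether the guarded region is active. */
#if !__has_feature(modules) && defined(__AVX512BW__)
#define PULLS_IN_AVX512BW 1
#else
#define PULLS_IN_AVX512BW 0
#endif

/* Returns 1 if the guarded header would have been parsed in this build. */
int pulls_in_avx512bw(void) {
    return PULLS_IN_AVX512BW;
}

/* Prints the outcome for the current set of compiler flags. */
void report_guard(void) {
    printf("avx512bw header parsed: %s\n",
           pulls_in_avx512bw() ? "yes" : "no");
}
```

In a build where `__AVX512BW__` is not defined, the guarded region is skipped, so none of the AVX-512 declarations would be lexed or parsed.<br>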
>><br>
>> Thanks,<br>
>> Nico<br>
>><br>
>> _______________________________________________<br>
>> cfe-dev mailing list<br>
>> <a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a><br>
>> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
>><br>
><br>
><br>
><br>
</blockquote></div>
</div></div><br>
<br></blockquote></div><br></div>
</blockquote></div></div></div></div>
</blockquote></div></div></div></blockquote></div></div></div></div>
</blockquote></div><br></div></div>