[cfe-dev] The intrinsics headers (especially avx512) are too big. What to do about it?

Eric Christopher via cfe-dev cfe-dev at lists.llvm.org
Mon May 16 09:48:12 PDT 2016


On Mon, May 16, 2016 at 9:43 AM Nico Weber via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> > Sorry if this is a stupid question, but do the windows intrinsic
> headers actually contain the same contents as clang's?
>
> As far as I can tell (from looking at
> https://msdn.microsoft.com/en-us/library/hh977023.aspx and comparing to
> clang's headers), yes. MSVC doesn't have the avx512 intrinsics yet, but
> that's probably only because they're new.
>
> I had hoped that I could avoid including all of x86intrin.h in intrin.h,
> but that page says "The intrin.h header includes both immintrin.h and
> ammintrin.h for simplicity."
>
> > This old discussion may cover some of this as well?
>
> Ah thanks, yes, sounds like there are reasons for not putting the includes
> back behind ifdefs. The thread doesn't really mention the reasons, and
> since clang doesn't implement full multiversioning yet I'm unable to guess
> at the reasons -- but it sounds like people don't want to re-add the arch
> ifdefs. Ok, I'll send a patch to add them back under `#ifdef _MSC_VER` only --
> there should be no drawback to that, and it stops the bleeding in the case
> where it's worst (with Microsoft headers).
>
>
It implements enough multiversioning to make it worthwhile to have them; I
don't know what you're confused about here. Why do we want to make this
platform-specific? If you wanted to match MSVC, I guess you could just turn
them off for Windows as an alternate solution?

-eric
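
For context on the multiversioning point, here is a minimal sketch, assuming
GCC or Clang on x86, of per-function targeting with
`__attribute__((target(...)))`. A file built without `-mavx2` can still
contain one AVX2 function, which only works if immintrin.h declares the AVX2
intrinsics unconditionally. The names `sum8` and `sum8_avx2` are hypothetical,
and AVX2 stands in for AVX-512 so that the sketch runs on more machines.

```c
#include <stddef.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define SKETCH_HAVE_AVX2_PATH 1

/* Only this function is compiled with AVX2 enabled; the rest of the file
 * uses the baseline target. The intrinsics must already be declared here,
 * even when __AVX2__ is not defined for the translation unit. */
__attribute__((target("avx2")))
static int sum8_avx2(const int *p) {
    __m256i v  = _mm256_loadu_si256((const __m256i *)p);
    __m128i lo = _mm256_castsi256_si128(v);           /* lanes 0..3 */
    __m128i hi = _mm256_extracti128_si256(v, 1);      /* lanes 4..7 */
    __m128i s  = _mm_add_epi32(lo, hi);               /* 4 partial sums */
    s = _mm_add_epi32(s, _mm_shuffle_epi32(s, 0x4E)); /* swap 64-bit halves */
    s = _mm_add_epi32(s, _mm_shuffle_epi32(s, 0xB1)); /* swap adjacent lanes */
    return _mm_cvtsi128_si32(s);
}
#endif

/* Sums eight ints, using the AVX2 path only when the CPU reports support. */
int sum8(const int *p) {
#ifdef SKETCH_HAVE_AVX2_PATH
    if (__builtin_cpu_supports("avx2"))
        return sum8_avx2(p);
#endif
    int total = 0;
    for (size_t i = 0; i < 8; ++i)
        total += p[i];
    return total;
}
```

If the AVX2 declarations were hidden behind `#ifdef __AVX2__`, `sum8_avx2`
would not compile unless the whole file were built with `-mavx2`.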


> Going forward, we'll have to teach clang more about at least some
> intrinsics for `#pragma intrin` (PR19898), which might end up helping for
> this too.
>
> I also reached out to STL at Microsoft; he said he'll try to look into
> including an "intrin0.h" header in the next major version of MSVC which
> would only declare a small set of intrinsics instead of all of them (no
> promises, of course).
>
> People working on avx512, I'd be curious to hear your perspective on this,
> as well as your reply to Chandler's points.
>
> Thanks,
> Nico
>
> On Sat, May 14, 2016 at 2:04 AM, Chandler Carruth via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> A couple of points:
>>
>> 1) Definitely agree with Hal that these intrinsics really shouldn't be
>> mapping to builtins. This is something I'm pretty frustrated about in the
>> direction of AVX-512 support in Clang and LLVM. We really need generic
>> vector IR to lower cleanly into these instructions.
>>
>> 2) Reid, you specifically advocated for not having the set of intrinsics
>> available based on particular feature sets. ;] But I agree there seems to
>> be a scalability problem here.
>>
>> 3) I think a lot of the scalability problem is that very basic,
>> non-vector code patterns require Intrin.h on Windows and pull in *ALL* the
>> vector intrinsics. =/ It'd be really good to try to fix *that*.
>>
>> 4) AVX-512 has made this *incredibly* worse than any previous ISA
>> extension. It used to be we had the product of (operation * operand-type)
>> intrinsics. This is already pretty bad. Now we have (operation *
>> operand-type * 4) because we have 4 masking variants. So it seems Intel has
>> just made a really unfortunate API choice by forcing every permutation of
>> these things to get a different name and thus a different intrinsic in a
>> header file. =/ And sadly that too is probably too late to walk back.
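
The arithmetic in point 4 can be sketched as follows. `intrinsic_name_count`
is a hypothetical helper, and the operation and operand-type counts in the
usage note are made-up inputs for illustration, not measurements of any real
header.

```c
/* Number of distinct intrinsic names when every combination of operation,
 * operand type, and masking variant gets its own name. */
unsigned long intrinsic_name_count(unsigned long operations,
                                   unsigned long operand_types,
                                   unsigned long masking_variants) {
    return operations * operand_types * masking_variants;
}
```

With, say, 50 operations and 6 operand types (both hypothetical),
`intrinsic_name_count(50, 6, 1)` gives 300 names without masking and
`intrinsic_name_count(50, 6, 4)` gives 1200 with 4 masking variants each.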
>>
>>
>> I wonder if we could at least initially address this by providing very
>> limited "builtin" modules for truly builtin headers that don't touch any
>> system headers, and actually *always* use the modules approach for these
>> headers, right out of the box.
>>
>> On Fri, May 13, 2016 at 7:32 PM C Bergström <cfe-dev at lists.llvm.org>
>> wrote:
>>
>>> This old discussion may cover some of this as well? I also thought I
>>> remembered something more recent around this.
>>>
>>> http://clang-developers.42468.n3.nabble.com/PROPOSAL-Reintroduce-guards-for-Intel-intrinsic-headers-td4046979.html
>>>
>>> On Sat, May 14, 2016 at 8:59 AM, Sean Silva via cfe-dev
>>> <cfe-dev at lists.llvm.org> wrote:
>>> > Sorry if this is a stupid question, but do the windows intrinsic headers
>>> > actually contain the same contents as clang's? (e.g. maybe the windows ones
>>> > don't cover all the ISA's that clang's do).
>>> >
>>> > -- Sean Silva
>>> >
>>> > On Thu, May 12, 2016 at 9:16 AM, Nico Weber via cfe-dev
>>> > <cfe-dev at lists.llvm.org> wrote:
>>> >>
>>> >> Hi,
>>> >>
>>> >> on Windows, C++ system headers such as <string> end up pulling in
>>> >> intrin.h. clang's intrinsic headers are very large.
>>> >>
>>> >> If you take a cc file containing just `#include <string>` and run that
>>> >> through the preprocessor with `cl /P test.cc` and `clang-cl /P test.cc`,
>>> >> the test.I file generated by clang-cl is 1.7MB while the one created by
>>> >> cl.exe is 0.7MB. This is solely due to clang's intrin.h expanding to way
>>> >> more stuff.
>>> >>
>>> >> The biggest offenders are avx512vlintrin.h, avx512fintrin.h,
>>> >> avx512vlbwintrin.h which add up to 657kB already. Before r239883, we only
>>> >> included avx headers if __AVX512F__ etc was defined. This is currently
>>> >> never the case in practice. Later (r243394 r243402 r243406 and more), the
>>> >> avx headers got much bigger.
>>> >>
>>> >> Parsing all this code takes time -- removing the avx512 includes from
>>> >> immintrin.h locally makes compiling a file containing just the <string>
>>> >> header 0.25s faster (!), and building all of v8 gets 6% faster, just from
>>> >> not including the avx512 headers.
>>> >>
>>> >> What can we do about this? Since avx512 is new, maybe its headers could be
>>> >> left out of immintrin.h? Or we could re-introduce
>>> >>
>>> >>   #if !__has_feature(modules) && defined(__AVX512BW__)
>>> >>
>>> >> include guards in immintrin.h. This would give us a speed win immediately
>>> >> without drawbacks as far as I can see, but in a few years when people start
>>> >> compiling with /arch:avx512 that'd go away again. (Then again, by then,
>>> >> modules are hopefully commonly available. cl.exe doesn't have an
>>> >> /arch:avx512 switch yet, so this is probably several years away from
>>> >> happening.)
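
A minimal sketch of what such a guard could look like, written so that it
compiles as an ordinary C file rather than as a patch to immintrin.h. It
assumes the intent of the module check is to keep the header included whenever
modules are enabled, so the condition used here is
`__has_feature(modules) || defined(__AVX512BW__)`. That reading, and the
`SKETCH_` macro and function names, are assumptions and not the actual clang
change.

```c
/* __has_feature is a Clang extension; treat it as "no" elsewhere. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* In a real header the body of this #if would be the #include of the
 * AVX-512 BW intrinsics header. Here it only records which branch the
 * preprocessor took. */
#if __has_feature(modules) || defined(__AVX512BW__)
#define SKETCH_AVX512BW_INCLUDED 1
#else
#define SKETCH_AVX512BW_INCLUDED 0
#endif

/* Returns 1 if the guarded header would have been included, else 0. */
int sketch_avx512bw_included(void) {
    return SKETCH_AVX512BW_INCLUDED;
}
```

Compiling with `-mavx512bw` (GCC or Clang) defines `__AVX512BW__` and flips
the result to 1; without it, and without modules, the guarded header is
skipped and never parsed.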
>>> >>
>>> >> Comments? Is it feasible to require that people who want to use avx512
>>> >> include a new header instead of immintrin.h? Else, does anyone have a
>>> >> better idea other than reintroducing the #ifdefs, augmented with the module
>>> >> check?
>>> >>
>>> >> Thanks,
>>> >> Nico
>>> >>
>>> >> _______________________________________________
>>> >> cfe-dev mailing list
>>> >> cfe-dev at lists.llvm.org
>>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>> >>