[cfe-dev] The intrinsics headers (especially avx512) are too big. What to do about it?

Nico Weber via cfe-dev cfe-dev at lists.llvm.org
Mon May 16 10:02:08 PDT 2016


On Mon, May 16, 2016 at 12:48 PM, Eric Christopher <echristo at gmail.com>
wrote:

>
>
> On Mon, May 16, 2016 at 9:43 AM Nico Weber via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> > Sorry if this is a stupid question, but do the windows intrinsic
>> headers actually contain the same contents as clang's?
>>
>> As far as I can tell (from looking at
>> https://msdn.microsoft.com/en-us/library/hh977023.aspx and comparing to
>> clang's headers), yes. MSVC doesn't have the avx512 intrinsics yet, but
>> that's probably only because they're new.
>>
>> I had hoped that I could avoid including all of x86intrin.h in intrin.h,
>> but that page says "The intrin.h header includes both immintrin.h and
>> ammintrin.h for simplicity."
>>
>> > This old discussion may cover some of this as well?
>>
>> Ah thanks, yes, sounds like there are reasons for not putting the
>> includes back behind ifdefs. The thread doesn't really mention them,
>> and since clang doesn't implement full multiversioning yet I'm unable to
>> guess at them -- but it sounds like people don't want to re-add the
>> arch ifdefs. Ok, I'll send a patch to add them back under `#ifdef _MSC_VER`
>> only -- there should be no drawback to that, and it stops the bleeding in
>> the case where it's worst (with Microsoft headers).
>>
>>
> It implements enough multiversioning to make it worthwhile to have them, I
> don't know what you're confused about here.
>

I didn't mean to question this point; I just don't understand it. Can you
give an example where it's useful? I'm sure there is one, I just can't
think of one.


> Why do we want to make this platform specific? If you wanted to match MSVC
> I guess you could just turn them off for windows as an alternate solution?
>

Yes, that's what I meant by the _MSC_VER check.
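
To make that concrete, here is a rough sketch of the shape I have in mind.
It is not the actual patch: the exact condition is an assumption about what
the guard could look like, and SKETCH_WANT_AVX512 and sketch_want_avx512()
are made-up names that exist only so the outcome of the check can be
observed without pulling in the real headers.

```c
/* Sketch only: mirrors the idea of guarding the avx512 includes when
   building against Microsoft headers. Not the real immintrin.h. */

/* Compilers without __has_feature: treat every feature as absent. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Outside MSVC-compatible mode nothing changes (always include).
   In MSVC-compatible mode, include only if modules are in use or the
   target feature macro says AVX-512 is enabled. */
#if !defined(_MSC_VER) || __has_feature(modules) || defined(__AVX512F__)
/* The real header would have `#include <avx512fintrin.h>` etc. here. */
#define SKETCH_WANT_AVX512 1
#else
#define SKETCH_WANT_AVX512 0
#endif

/* Hypothetical helper so the decision is visible at run time. */
int sketch_want_avx512(void) { return SKETCH_WANT_AVX512; }
```

With clang-cl and no AVX-512 target flags, the condition above would
evaluate to false and the large headers would be skipped; everywhere else
the behavior stays as it is today.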


>
> -eric
>
>
>> Going forward, we'll have to teach clang more about at least some
>> intrinsics for `#pragma intrin` (PR19898), which might end up helping for
>> this too.
>>
>> I also reached out to STL at Microsoft; he said he'll try to look into
>> including an "intrin0.h" header in the next major version of MSVC which
>> would only declare a small set of intrinsics instead of all of them (no
>> promises, of course).
>>
>> People working on avx512, I'd be curious to hear your perspective on
>> this, as well as your reply to Chandler's points.
>>
>> Thanks,
>> Nico
>>
>> On Sat, May 14, 2016 at 2:04 AM, Chandler Carruth via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>> A couple of points:
>>>
>>> 1) Definitely agree with Hal that these intrinsics really shouldn't be
>>> mapping to builtins. This is something I'm pretty frustrated about in the
>>> direction of AVX-512 support in Clang and LLVM. We really need generic
>>> vector IR to lower cleanly into these instructions.
>>>
>>> 2) Reid, you specifically advocated for not having the set of intrinsics
>>> available based on particular feature sets. ;] But I agree there seems to
>>> be a scalability problem here.
>>>
>>> 3) I think a lot of the scalability problem is that very basic,
>>> non-vector code patterns require Intrin.h on Windows and pull in *ALL* the
>>> vector intrinsics. =/ It'd be really good to try to fix *that*.
>>>
>>> 4) AVX-512 has made this *incredibly* worse than any previous ISA
>>> extension. It used to be we had the product of (operation * operand-type)
>>> intrinsics. This is already pretty bad. Now we have (operation *
>>> operand-type * 4) because we have 4 masking variants. So it seems Intel has
>>> just made a really unfortunate API choice by forcing every permutation of
>>> these things to get a different name and thus a different intrinsic in a
>>> header file. =/ And sadly that too is probably too late to walk back.
>>>
>>>
>>> I wonder if we could at least initially address this by providing very
>>> limited "builtin" modules for truly builtin headers that don't touch any
>>> system headers, and actually *always* use the modules approach for these
>>> headers, right out of the box.
>>>
>>> On Fri, May 13, 2016 at 7:32 PM C Bergström <cfe-dev at lists.llvm.org>
>>> wrote:
>>>
>>>> This old discussion may cover some of this as well? I also thought I
>>>> remembered something more recent around this.
>>>>
>>>> http://clang-developers.42468.n3.nabble.com/PROPOSAL-Reintroduce-guards-for-Intel-intrinsic-headers-td4046979.html
>>>>
>>>> On Sat, May 14, 2016 at 8:59 AM, Sean Silva via cfe-dev
>>>> <cfe-dev at lists.llvm.org> wrote:
>>>> > Sorry if this is a stupid question, but do the windows intrinsic headers
>>>> > actually contain the same contents as clang's? (e.g. maybe the windows ones
>>>> > don't cover all the ISAs that clang's do).
>>>> >
>>>> > -- Sean Silva
>>>> >
>>>> > On Thu, May 12, 2016 at 9:16 AM, Nico Weber via cfe-dev
>>>> > <cfe-dev at lists.llvm.org> wrote:
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> on Windows, C++ system headers like e.g. <string> end up pulling in
>>>> >> intrin.h. clang's intrinsic headers are very large.
>>>> >>
>>>> >> If you take a cc file containing just `#include <string>` and run that
>>>> >> through the preprocessor with `cl /P test.cc` and `clang-cl /P test.cc`,
>>>> >> the test.I file generated by clang-cl is 1.7MB while the one created by
>>>> >> cl.exe is 0.7MB. This is solely due to clang's intrin.h expanding to way
>>>> >> more stuff.
>>>> >>
>>>> >> The biggest offenders are avx512vlintrin.h, avx512fintrin.h,
>>>> >> avx512vlbwintrin.h which add up to 657kB already. Before r239883, we only
>>>> >> included avx headers if __AVX512F__ etc was defined. This is currently
>>>> >> never the case in practice. Later (r243394 r243402 r243406 and more), the
>>>> >> avx headers got much bigger.
>>>> >>
>>>> >> Parsing all this code takes time -- removing the avx512 includes from
>>>> >> immintrin.h locally makes compiling a file containing just the <string>
>>>> >> header 0.25s faster (!), and building all of v8 gets 6% faster, just from
>>>> >> not including the avx512 headers.
>>>> >>
>>>> >> What can we do about this? Since avx512 is new, maybe they could be not
>>>> >> part of immintrin.h? Or we could re-introduce
>>>> >>
>>>> >>   #if !__has_feature(modules) && defined(__AVX512BW__)
>>>> >>
>>>> >> include guards in immintrin.h. This would give us a speed win immediately
>>>> >> without drawbacks as far as I can see, but in a few years when people
>>>> >> start compiling with /arch:avx512 that'd go away again. (Then again, by
>>>> >> then, modules are hopefully commonly available. cl.exe doesn't have an
>>>> >> /arch:avx512 switch yet, so this is probably several years away from
>>>> >> happening.)
>>>> >>
>>>> >> Comments? Is it feasible to require that people who want to use avx512
>>>> >> include a new header instead of immintrin.h? Else, does anyone have a
>>>> >> better idea other than reintroducing the #ifdefs, augmented with the
>>>> >> module check?
>>>> >>
>>>> >> Thanks,
>>>> >> Nico
>>>> >>
>>>> >> _______________________________________________
>>>> >> cfe-dev mailing list
>>>> >> cfe-dev at lists.llvm.org
>>>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>> >>
>>>> >