[cfe-dev] The intrinsics headers (especially avx512) are too big. What to do about it?

Benyei, Guy via cfe-dev cfe-dev at lists.llvm.org
Mon Jun 20 11:49:10 PDT 2016


As far as I know, the serialized AST is extremely sensitive to compiler changes. You’re right, builtins are introduced and removed with the header updates, but in the worst case you will get a quite clear error message if a builtin is missing. There are also many changes that touch neither the intrinsic headers nor the builtin definitions. On the other hand, relatively small changes may break the serialized AST, and the result may be an unexplained crash.

Thanks

From: James Y Knight [mailto:jyknight at google.com]
Sent: Monday, June 20, 2016 19:04
To: Benyei, Guy <guy.benyei at intel.com>
Cc: Chandler Carruth <chandlerc at google.com>; Eric Christopher <echristo at gmail.com>; Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nico Weber <thakis at chromium.org>; cfe-dev at lists.llvm.org; David Majnemer <majnemer at google.com>; Badouh, Asaf <asaf.badouh at intel.com>; Zuckerman, Michael <michael.zuckerman at intel.com>
Subject: Re: [cfe-dev] The intrinsics headers (especially avx512) are too big. What to do about it?

It's already the case that the compiler needs to be in almost-exact sync with the intrinsics headers, due to the use of internal builtin functions, which seem to be typically introduced or removed at the same time as the header is updated. I don't think modules would make this any worse than it is already?

On Mon, Jun 20, 2016 at 6:52 AM, Benyei, Guy via cfe-dev <cfe-dev at lists.llvm.org> wrote:
Modules could be a good solution for this issue; however, I’m a bit concerned about some technical issues with keeping the modules in sync with the Clang compiler.

Building the actual module files as part of the build system doesn’t seem really hard, but we still need to think about where they should be installed (with the headers?), and we need to make sure that for any Clang we run we can find the matching module files, or fall back to the headers approach.

We could add the module as some kind of resource to the executable. This way it couldn’t get out of sync. I would also add some way to disable this optimization entirely.
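The fallback idea described above can be sketched in C as follows. This is only an illustration under stated assumptions, not Clang's actual logic: the enum, the function choose_intrin_source, and the notion of a version "stamp" string are all hypothetical names introduced here.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical: where the compiler takes intrinsic declarations from. */
enum intrin_source { USE_HEADERS, USE_PREBUILT_MODULE };

/*
 * Hypothetical policy, not Clang's actual logic: use the prebuilt module
 * only if the optimization is enabled, a module file was found
 * (module_stamp != NULL), and the stamp recorded in the module equals
 * the stamp of the running compiler. Otherwise fall back to the headers.
 */
static enum intrin_source
choose_intrin_source(const char *compiler_stamp, const char *module_stamp,
                     bool modules_enabled)
{
    if (!modules_enabled || module_stamp == NULL)
        return USE_HEADERS;
    if (strcmp(compiler_stamp, module_stamp) != 0)
        return USE_HEADERS; /* out of sync: never load a stale module */
    return USE_PREBUILT_MODULE;
}
```

Embedding the module in the executable as a resource would make the stamp comparison trivially true, which is why that variant cannot get out of sync.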

What do you think?


From: Chandler Carruth [mailto:chandlerc at google.com]
Sent: Thursday, June 16, 2016 00:20
To: Benyei, Guy <guy.benyei at intel.com>; Eric Christopher <echristo at gmail.com>; Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nico Weber <thakis at chromium.org>; cfe-dev at lists.llvm.org
Cc: David Majnemer <majnemer at google.com>; Badouh, Asaf <asaf.badouh at intel.com>; Zuckerman, Michael <michael.zuckerman at intel.com>
Subject: Re: [cfe-dev] The intrinsics headers (especially avx512) are too big. What to do about it?

As I said up the thread, I think the *right* way to solve this is with modules.

We have the infrastructure in clang to lazily load things like the intrinsics headers in a very efficient way. All we are missing is:
1) The ability to enable this by default exclusively for the intrinsic headers (or more generally for any subset of the builtin headers where we would like this behavior...).
2) Building the actual module files for the builtin headers (much the way we generate some of them) as part of the build system.

So far I've not seen any suggestions that really seem superior to this...

On Wed, Jun 15, 2016 at 1:23 PM Benyei, Guy via cfe-dev <cfe-dev at lists.llvm.org> wrote:
I agree that it’s not a desirable thing in Clang; however, it seems to me the lesser evil.

The long compilation time will hurt many projects. I’m not sure it’s reasonable even for the ones that use AVX intrinsics intentionally.
Another issue is that the “_mm_” prefixed identifiers are not reserved for the compiler: the C99 spec unconditionally reserves only identifiers that start with two underscores or with an underscore and a capital letter. So Clang should recognize the “_mm_” prefix, check whether the right header was indeed included, and then try to identify the x86 intrinsic. If this fails, the identifier should be treated as an ordinary identifier.
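The reservation rule mentioned above can be sketched as a small C check. The function name is_always_reserved is hypothetical; the rule it encodes is the "reserved for any use" case of C99 7.1.3. Other identifiers that begin with an underscore, such as the _mm_ names, are reserved only as file-scope identifiers, so user code may still use them as, for example, local variables.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Returns true if `name` is reserved in every context under C99 7.1.3:
 * it starts with "__" or with '_' followed by an uppercase letter.
 * Hypothetical helper for illustration only.
 */
static bool is_always_reserved(const char *name)
{
    if (name[0] != '_')
        return false;
    return name[1] == '_' || isupper((unsigned char)name[1]);
}
```

With this check, a name like __builtin_ia32_paddq512_mask is always reserved, while _mm512_mask_add_epi64 is not, which is why the compiler could not claim the latter unconditionally.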

Of course, there is the case of x86 intrinsics that should be compiled to pure LLVM IR, rather than to LLVM intrinsic calls. I see two possible solutions:

1. Make CGBuiltin.cpp/EmitX86BuiltinExpr generate the IR using the IR builder. This approach might be less intuitive, and the function may become very long as we change more and more intrinsics to pure LLVM IR.

2. Leave the intrinsics implemented in C in the header, rather than making these “_mm_” builtins. Then again, as we move more and more intrinsics to a C representation, the header might get big and heavy again.

In the short term I would prefer the 2nd solution for its simplicity. Any other ideas to overcome this issue?
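As a rough illustration of the second option, an intrinsic left in C in the header might look like the sketch below. It relies on the GCC/Clang vector extensions; the type v2du_sketch and the function sketch_add_epi64 are hypothetical names, not the real header contents. The compiler can lower the plain a + b to a generic vector add, with no target-specific builtin involved.

```c
/* 128-bit vector of two unsigned 64-bit lanes (GCC/Clang vector extension). */
typedef unsigned long long v2du_sketch __attribute__((vector_size(16)));

/* Hypothetical stand-in for a header intrinsic written in plain C. */
static inline v2du_sketch sketch_add_epi64(v2du_sketch a, v2du_sketch b)
{
    /* Lane-wise add; unsigned lanes wrap on overflow instead of being UB. */
    return a + b;
}
```

The cost is exactly the one raised in this thread: every such body has to be parsed in each translation unit that includes the header.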

Thanks
     Guy Benyei



From: cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] On Behalf Of Eric Christopher via cfe-dev
Sent: Tuesday, June 14, 2016 23:53
To: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nico Weber <thakis at chromium.org>
Cc: cfe-dev <cfe-dev at lists.llvm.org>; David Majnemer <majnemer at google.com>; Badouh, Asaf <asaf.badouh at intel.com>; Zuckerman, Michael <michael.zuckerman at intel.com>
Subject: Re: [cfe-dev] The intrinsics headers (especially avx512) are too big. What to do about it?

So have clang magically emit the generated code based on the intrinsic header? That'll be a lot of typing, but ultimately shouldn't be terrible. You'll effectively turn the _mm_ interface into __builtin as far as automatic recognition etc. goes, and I'm not sure we'd want to do that sort of thing.

-eric

On Tue, Jun 14, 2016 at 6:43 AM Demikhovsky, Elena via cfe-dev <cfe-dev at lists.llvm.org> wrote:
We are still trying to find a suitable solution.
Keeping declarations only inside header files will save compile time.
In this case the implementation will be hidden inside clang.
Can somebody help me estimate the impact and complexity of this solution?

Thank you.

- Elena

From: thakis at google.com [mailto:thakis at google.com] On Behalf Of Nico Weber
Sent: Tuesday, June 14, 2016 15:50
To: Demikhovsky, Elena <elena.demikhovsky at intel.com>
Cc: Hal Finkel <hfinkel at anl.gov>; Reid Kleckner <rnk at google.com>; cfe-dev <cfe-dev at lists.llvm.org>; Badouh, Asaf <asaf.badouh at intel.com>; Zuckerman, Michael <michael.zuckerman at intel.com>; David Majnemer <majnemer at google.com>; Chandler Carruth <chandlerc at google.com>
Subject: Re: [cfe-dev] The intrinsics headers (especially avx512) are too big. What to do about it?

On Tue, May 17, 2016 at 3:49 PM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
   >Indeed. It is not clear to me, however, that this situation is desirable. We
   >had a general policy that our intrinsics headers should generate generic IR
   >whenever possible, and if we've strayed from that, we should discuss that
   >first.

Let's take a look at this intrinsic:

static __inline__ __m512i __DEFAULT_FN_ATTRS
_mm512_mask_add_epi64 (__m512i __W, __mmask8 __U, __m512i __A, __m512i __B)
{
  return (__m512i) __builtin_ia32_paddq512_mask ((__v8di) __A,
             (__v8di) __B,
             (__v8di) __W,
             (__mmask8) __U);
}

The IR that should be generated:
%C = add <8 x i64> %B, %A
%res = select <8 x i1> %mask, <8 x i64> %C, <8 x i64> %W

If we parse __builtin_ia32_paddq512_mask in CGBuiltin.cpp and generate IR there, will it help?
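For reference, the semantics being discussed (a lane-wise add followed by a per-lane select on the mask) can be written as a scalar model in C. This is a sketch for illustration, not the header implementation: the struct vec8x64 and the function ref_mask_add_epi64 are hypothetical stand-ins, and the lanes are kept unsigned so that the add wraps on overflow.

```c
#include <stdint.h>

#define LANES 8

/* Hypothetical stand-in for a 512-bit vector of eight 64-bit lanes. */
typedef struct { uint64_t lane[LANES]; } vec8x64;

/* Scalar model: lane i gets a+b if mask bit i is set, else keeps w. */
static vec8x64 ref_mask_add_epi64(vec8x64 w, uint8_t u, vec8x64 a, vec8x64 b)
{
    vec8x64 res;
    for (int i = 0; i < LANES; ++i) {
        uint64_t sum = a.lane[i] + b.lane[i];           /* the "add" step */
        res.lane[i] = ((u >> i) & 1) ? sum : w.lane[i]; /* the "select" step */
    }
    return res;
}
```

A model like this is also handy as a test oracle when moving a builtin from a target-specific call to generic IR.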

(Please do not consider my question as a general Intel solution. I just want to understand the problem.)

The bit I care most about is that adding `#include <intrin.h>` shouldn't add megabytes of stuff to my translation unit.

Have you discussed making immintrin.h more modular? It looks like many more avx512 builtins keep landing, making this problem bigger and bigger. It'd be good if I only had to pay for this if I explicitly included an avx512.h, and even then it'd be nice if that wasn't one huge header, but several smaller ones, so I only have to pay compile time for the bits I need.


---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
_______________________________________________
cfe-dev mailing list
cfe-dev at lists.llvm.org<mailto:cfe-dev at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
