[cfe-dev] The intrinsics headers (especially avx512) are too big. What to do about it?

Eric Christopher via cfe-dev cfe-dev at lists.llvm.org
Tue Jun 14 13:53:23 PDT 2016


So have clang magically emit the generated code based on the intrinsic
header? That'll be a lot of typing, but ultimately shouldn't be terrible.
You'd effectively turn the _mm_ interface into __builtin functions as far
as automatic recognition etc. goes, and I'm not sure we'd want to do that
sort of thing.

-eric

On Tue, Jun 14, 2016 at 6:43 AM Demikhovsky, Elena via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> We are still trying to find a suitable solution.
>
> Keeping declarations only inside header files will save compile time.
>
> In this case the implementation will be hidden inside clang.
>
> Can somebody help me to estimate impact and complexity of this solution?
>
>
>
> Thank you.
>
>
>
> - Elena
>
>
>
> From: thakis at google.com [mailto:thakis at google.com] On Behalf Of Nico
> Weber
> Sent: Tuesday, June 14, 2016 15:50
> To: Demikhovsky, Elena <elena.demikhovsky at intel.com>
> Cc: Hal Finkel <hfinkel at anl.gov>; Reid Kleckner <rnk at google.com>;
> cfe-dev <cfe-dev at lists.llvm.org>; Badouh, Asaf <asaf.badouh at intel.com>;
> Zuckerman, Michael <michael.zuckerman at intel.com>; David Majnemer
> <majnemer at google.com>; Chandler Carruth (chandlerc at google.com)
> <chandlerc at google.com>
> Subject: Re: [cfe-dev] The intrinsics headers (especially avx512) are
> too big. What to do about it?
>
>
>
> On Tue, May 17, 2016 at 3:49 PM, Demikhovsky, Elena <
> elena.demikhovsky at intel.com> wrote:
>
>    >Indeed. It is not clear to me, however, that this situation is
>    >desirable. We had a general policy that our intrinsics headers
>    >should generate generic IR whenever possible, and if we've strayed
>    >from that, we should discuss that first.
>
> Let's take a look at this intrinsic:
>
> static __inline__ __m512i __DEFAULT_FN_ATTRS
> _mm512_mask_add_epi64 (__m512i __W, __mmask8 __U, __m512i __A, __m512i __B)
> {
>   return (__m512i) __builtin_ia32_paddq512_mask ((__v8di) __A,
>              (__v8di) __B,
>              (__v8di) __W,
>              (__mmask8) __U);
> }
>
> The IR that should be generated:
> %C = add <8 x i64> %B, %A
> %res = select <8 x i1> %mask, <8 x i64> %C, <8 x i64> %W
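
For readers who want to check the semantics, the add-then-select pattern in
the IR above can be modeled lane by lane in plain C. This is only an
illustrative sketch: the `vec8i64` struct and the `mask_add_epi64_model` and
`mask_add_epi64_model_check` names are hypothetical stand-ins, not part of any
intrinsics header, and the real intrinsic operates on `__m512i` values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 512-bit vector of eight 64-bit lanes. */
typedef struct { int64_t lane[8]; } vec8i64;

/* Lane-wise model of the add-then-select pattern:
   result[i] = (bit i of u is set) ? a[i] + b[i] : w[i]. */
static vec8i64 mask_add_epi64_model(vec8i64 w, uint8_t u, vec8i64 a, vec8i64 b)
{
  vec8i64 r;
  for (int i = 0; i < 8; ++i) {
    /* Add in unsigned arithmetic so overflow wraps, as an integer `add`
       does; the conversion back to int64_t is implementation-defined in C
       but wraps on common compilers. */
    int64_t sum = (int64_t)((uint64_t)a.lane[i] + (uint64_t)b.lane[i]);
    r.lane[i] = ((u >> i) & 1) ? sum : w.lane[i];
  }
  return r;
}

/* Small self-check: with mask 0x05 only lanes 0 and 2 take the sum. */
static void mask_add_epi64_model_check(void)
{
  vec8i64 w, a, b;
  for (int i = 0; i < 8; ++i) {
    w.lane[i] = -1;
    a.lane[i] = i;
    b.lane[i] = 10;
  }
  vec8i64 r = mask_add_epi64_model(w, 0x05, a, b);
  assert(r.lane[0] == 10);  /* selected: 0 + 10 */
  assert(r.lane[1] == -1);  /* not selected: passthrough from w */
  assert(r.lane[2] == 12);  /* selected: 2 + 10 */
}
```

Lanes whose mask bit is clear simply pass through the value from `w`, which is
what the `select` expresses.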
>
> If we parse __builtin_ia32_paddq512_mask in CGBuiltin.cpp and generate IR
> there, will it help?
>
> (Please do not consider my question as a general Intel solution. I just
> want to understand the problem.)
>
>
> The bit I care most about is that adding `#include <intrin.h>` shouldn't
> add megabytes of stuff to my translation unit.
>
>
>
> Have you discussed making immintrin.h more modular? It looks like many more
> avx512 builtins keep landing, making this problem bigger and bigger. It'd
> be good if I only had to pay for this if I explicitly included an avx512.h,
> and even then it'd be nice if that wasn't one huge header, but several
> smaller ones, so I only have to pay compile time for the bits I need.
>
>
>