[cfe-dev] [PROPOSAL] Reintroduce guards for Intel intrinsic headers

Eric Christopher echristo at gmail.com
Mon Aug 3 09:59:32 PDT 2015


Good question, I'm working on the diagnostics and what I think should
happen. It's not directly compatible with the gcc ones and it'll be a bit
tricky, but I think possible.

Responding to Justin inline here:


On Sun, Aug 2, 2015 at 5:09 PM Chandler Carruth <chandlerc at google.com>
wrote:

> Would cherrypicking the diagnostics to the 3.7 branch be better or worse?
> (I'm of two minds, curious what others think...)
>
> On Sun, Aug 2, 2015 at 12:00 PM Justin Bogner <mail at justinbogner.com>
> wrote:
>
>> Eric Christopher <echristo at gmail.com> writes:
>> > On Thu, Jul 30, 2015 at 10:12 AM Reid Kleckner <rnk at google.com> wrote:
>> >
>> >     I'm opposed to this. Going forward, I would really like target
>> intrinsics
>> >     to be available regardless of the current feature set, so users
>> don't need
>> >     hacks like these.
>> >
>> > Agreed.
>> >
>> >
>> >     I see two ways to do this with different tradeoffs:
>> >     1. Diagnose missing target attributes when calling the intel
>> intrinsics. I
>> >     was surprised to find that we don't already do this.
>> >
>> > Sorry. This is on my list of things to do.
>>
>> +hans
>>
>> I agree with the direction of moving to use target attributes instead of
>> relying on flaky ifdefs, but without any errors or warnings here this is
>> a pretty serious diagnostic regression.
>>
>>
As I said in the commit that added the rest of the target support, if we
can't code generate we will error, it's not a good error, but it's not
going to silently generate bad code.



> I think we should revert this on the 3.7 branch. It can stay as is on
>> trunk assuming the diagnostics are coming soon.
>>
>>
The type of warning we generate is an interesting question, I'll handle
this at the bottom.


> Right now we end up in spaces where we get crashes in the backend
>> instead of a sensible error in far too many situations. Notably:
>>
>>   https://llvm.org/bugs/show_bug.cgi?id=24125
>>   https://llvm.org/bugs/show_bug.cgi?id=24087
>>   https://llvm.org/bugs/show_bug.cgi?id=24335
>>
>>
I'll point out that two of these are people from the same company with a
similar testcase. So it's not quite as widespread as you'd think. Basically
they had a diagnostics test go awry. The last one is definitely a
regression in diagnostic quality, but not bad code.


> Additionally, I'm told this causes issues with configure scripts
>> misdetecting available features, as well as strange compatibility issues
>> like the one that led to this thread.
>>
>>
Yeah, this is crap IMO. Anyone that's doing the things you mentioned is
actually doing it wrong. A configure check that's going for a preprocessor
warning rather than an actual compile/execute test? That's asking for some
pain. There are better ways to do it.


> This feature is woefully incomplete. We need the warnings/errors for it
>> to be acceptable quality.
>>
>>
This is a bit hyperbolic, but I agree that it's a regression. I meant to
get the warning in much sooner than this and for that I apologize. Working
through how we do the warning is interesting though.

Let's take gcc as an example:

#include <immintrin.h>

int foo(__m256i a) {
  return _mm256_extract_epi32(a, 3);
}
extern  __m256i a;
int bar() {
  return foo(a);
}

Here, we'll warn in foo for the fact that the builtin (builtin? there's no
builtin in my code here :) needs -m32 (ok, that's just a bug, I should
report that, so -mavx) to compile:

In file included from .../immintrin.h:41:0,
                 from foo.c:1:
foo.c:4:10: error: '__builtin_ia32_vextractf128_si256' needs isa option -m32
   return _mm256_extract_epi32(a, 3);

which is a little weird, but ok.

Oddly enough this doesn't warn:

#include <immintrin.h>

int __attribute__((target("avx"))) foo(__m256i a) {
  return _mm256_extract_epi32(a, 3);
}
extern  __m256i a;
int bar() {
  return foo(a);
}

whereas I'd think it probably should. Essentially gcc only warns in the
case where we'd call a builtin that required certain target specific code
generation, however, as pr24335 shows, that's not necessarily good enough
for us. (It might be? Should those just be code generation bugs? I mean,
theoretically we should be able to generate code for them right?)

So, taking opinions here. Honestly warning in the second case is probably
much easier than warning in the first even if it will take a bit of
refactoring. Warning in the first would require yet another field to the
builtins on required ISA level perhaps? Or maybe something else.

So, thoughts?

-eric


> >
>> >     2. We could support some automatic transfer of the target attribute
>> to the
>> >     caller when calling these intrinsics, but I worry that this is too
>> >     confusing.
>> >
>> > We could, but it's probably better to leave it as is.
>> >
>> > -eric
>> >
>> >
>> >     Implicitly setting a target attribute may block inlining that the
>> user
>> >     expected to happen, for example. Alternatively, there may be a
>> dynamic
>> >     cpuid check in the same function between SSE2 and AVX variants of
>> the same
>> >     algorithm, and now the SSE2 loop will unexpectedly use AVX
>> instructions.
>> >
>> >     So we should probably settle with telling the user to add -msseNN or
>> >     __atribute__((target(("sseNN")))).
>> >
>> >     On Thu, Jul 30, 2015 at 8:53 AM, Vedant Kumar <vsk at apple.com>
>> wrote:
>> >
>> >         I've run into some code which no longer compiles because of two
>> recent
>> >         changes:
>> >
>> >           41885d3 Update the intel intrinsic headers to use the target
>> >         attribute support.
>> >           695aff1 Use a define for per-file function attributes for the
>> Intel
>> >         intrinsic headers.
>> >
>> >         Specifically, one project defines its own SSE4.1 emulation
>> routines
>> >         when the real intrinsics aren't available. This is a problem
>> because
>> >         they've reused the names of the intrinsics. E.g;
>> >
>> >         > #ifndef __SSE4_1__
>> >         > #define _mm_extract_epi8(a_, ndx) ({ ... })
>> >         > static inline __m128i _mm_blendv_epi8(__m128i a, __m128i b,
>> __m128i
>> >         mask) { ... }
>> >         > ...
>> >         > #endif
>> >
>> >         SSE4.1 intrinsics now leak into the project when it's being
>> compiled
>> >         for targets without SSE4.1 support. Compilation fails with
>> "error:
>> >         redefinition ...".
>> >
>> >         When these changes were initially being discussed, I think our
>> stance
>> >         was that we shouldn't support code like this [1]. However, we
>> should
>> >         reconsider for the sake of avoiding breakage. AFAICT, we would
>> need to
>> >         revert just two types of changes:
>> >
>> >         In lib/Headers/__wmmintrin_aes.h:
>> >
>> >         > -#if defined (__SSE4_2__) || defined (__SSE4_1__)
>> >         >  #include <smmintrin.h>
>> >         > -#endif
>> >
>> >         In lib/Headers/smmintrin.h:
>> >
>> >         > -#ifndef __SSE4_1__
>> >         > -#error "SSE4.1 instruction set not enabled"
>> >         > -#else
>> >
>> >         I don't see any downsides to reintroducing these guards. If
>> everyone's
>> >         OK with this, I can mail a patch in. The alternative is to have
>> >         clients rewrite their emulation layers like this:
>> >
>> >         > #ifdef __SSE4_1__
>> >         > #define compat_mm_extract_epi8 _mm_extract_epi8
>> >         > static inline __m128i combat_mm_blendv_epi8(__m128i a,
>> __m128i b,
>> >         __m128i mask) __attribute__((__target__(("sse4.1")))) {
>> >         >   return _mm_blendv_epi8(a, b, mask);
>> >         > }
>> >         > ...
>> >         > #else /* OK, no native SSE 4.1. Define our own. */
>> >         > #define compat_mm_extract_epi8(a_, ndx) ({ ... })
>> >         > static inline __m128i compat_mm_blendv_epi8(__m128i a,
>> __m128i b,
>> >         __m128i mask) { ... }
>> >         > ...
>> >         > #endif
>> >
>> >         ... and then replace all calls to intrinsics with calls to the
>> new
>> >         compatibility routines. This seems like a lot of tedious work,
>> and I'd
>> >         love to help people avoid it :).
>> >
>> >         Let me know what you think!
>> >
>> >         vedant
>> >
>> >         [1] http://lists.cs.uiuc.edu/pipermail/cfe-commits/
>> >         Week-of-Mon-20150615/131192.html
>> >         _______________________________________________
>> >         cfe-dev mailing list
>> >         cfe-dev at cs.uiuc.edu
>> >         http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>> >
>> > _______________________________________________
>> > cfe-dev mailing list
>> > cfe-dev at cs.uiuc.edu
>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150803/49b49001/attachment.html>


More information about the cfe-dev mailing list