[cfe-dev] __attribute__((enable_if(...))) for de-macroifying	the builtin headers
    Sanjay Patel 
    spatel at rotateright.com
       
    Thu Apr 16 11:10:23 PDT 2015
    
    
  
Hi Sean -
Thanks for pushing on this.
Re: the complicated cases - in one of the recent patch reviews in this
area, Andrea pointed out a major shortcoming of the header hacks: we're
optimizing vector intrinsics into generic IR even at -O0. This won't make
for happy debugging.
So I've been optimizing these in InstCombine rather than headers recently:
http://reviews.llvm.org/rL235124
http://reviews.llvm.org/rL232852
This is a much nicer solution IMO, and I haven't heard any objections to
this approach, so I'm hoping we'll clean up all of the complicated macros
eventually.
On Wed, Apr 15, 2015 at 12:18 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
> On Wed, Apr 15, 2015 at 6:46 PM, Nick Lewycky <nlewycky at google.com> wrote:
>
>> On 15 April 2015 at 10:27, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>> In the post-commit of r231792, I suggested the idea of using
>>> __attribute__((enable_if(...))) for avoiding the mess of the macros in the
>>> builtin headers. AFAIK, the macros are currently used to make sure that the
>>> "immediates" are constant expressions, piggybacking on the constant
>>> expression check in the __builtin_* call.
>>>
>>> I've attached a file with a proof-of-concept for using
>>> __attribute__((enable_if(...))) for this purpose. I originally though using
>>> __builtin_constant_p in the enable_if, but that turns out to not be
>>> necessary (see the docs:
>>> http://clang.llvm.org/docs/AttributeReference.html#enable-if ; the
>>> enable_if condition fails for non-constant expressions anyway). The core is:
>>>
>>> // Current builtin headers:
>>> //
>>> //#define _mm256_insertf128_si256(V1, V2, M) __extension__ ({ \
>>> //  (__m256i)__builtin_shufflevector( \
>>> //    (__v4di)(V1), \
>>> //    (__v4di)_mm256_castsi128_si256((__m128i)(V2)), \
>>> //    (((M) & 1) ? 0 : 4), \
>>> //    (((M) & 1) ? 1 : 5), \
>>> //    (((M) & 1) ? 4 : 2), \
>>> //    (((M) & 1) ? 5 : 3) );})
>>>
>>> // A bit cleaner.
>>> static __inline __attribute__((__always_inline__, __nodebug__))
>>> __m256i _mm256_insertf128_si256(__m256i __a, __m128i __b, int __imm8)
>>> __attribute__((enable_if(__imm8, "'__imm8' must be a constant")))
>>> {
>>>   if (__imm8 & 1)
>>>     return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 0,
>>> 1, 4, 5);
>>>   else
>>>     return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 4,
>>> 5, 2, 3);
>>> }
>>>
>>
>> I think:
>>
>> static __inline __attribute__((__always_inline__, __nodebug__))
>> __m256i _mm256_insertf128_si256(__m256i __a, __m128i __b, int __imm8)
>> __attribute__((enable_if(__imm8 & 1 == 1, "'__imm8' must be a constant")))
>> {
>>   return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 0, 1,
>> 4, 5);
>> }
>>
>> static __inline __attribute__((__always_inline__, __nodebug__))
>> __m256i _mm256_insertf128_si256(__m256i __a, __m128i __b, int __imm8)
>> __attribute__((enable_if(__imm8 & 1 == 0, "'__imm8' must be a constant")))
>> {
>>   return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 4, 5,
>> 2, 3);
>> }
>>
>
> Unfortunately this approach doesn't scale to some of the more nontrivial
> ones like _mm256_alignr_epi8 in http://reviews.llvm.org/D8301 .
>
>>
>> Nick, are you okay using enable_if for this? It's sort of a hack but if
>>> we are going to be carrying this attribute around forever (has it reached
>>> that level of compatibility guarantee yet?), we might as well use it to
>>> solve this problem for us.
>>>
>>
>> Attribute enable_if is here to stay, though it may change meaning in
>> corner cases (notably multiple enable_if's on a single function). Using it
>> in the compiler's own header files is fine. Does anyone ever try to take
>> the address of _mm256_insertf128_si256 (I assume not if it's a macro)?
>>
>
> Yeah, I think taking the address of these is a "don't do that" sort of
> thing. "static inline always_inline"
>
> What changes do you foresee in the case of multiple enable_if's? The
> current behavior (in my empirical testing; haven't used the source) seems
> to be a good fit for this use case.
>
> -- Sean Silva
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150416/1bc2fb5c/attachment.html>
    
    
More information about the cfe-dev
mailing list