[cfe-dev] __attribute__((enable_if(...))) for de-macroifying the builtin headers
Richard Smith
richard at metafoo.co.uk
Thu Apr 16 17:05:17 PDT 2015
On Wed, Apr 15, 2015 at 12:08 PM, Sean Silva <chisophugis at gmail.com> wrote:
> On Wed, Apr 15, 2015 at 7:32 PM, Nick Lewycky <nlewycky at google.com> wrote:
>
>> On 15 April 2015 at 11:18, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>>
>>>
>>> On Wed, Apr 15, 2015 at 6:46 PM, Nick Lewycky <nlewycky at google.com>
>>> wrote:
>>>
>>>> On 15 April 2015 at 10:27, Sean Silva <chisophugis at gmail.com> wrote:
>>>>
>>>>> In the post-commit of r231792, I suggested the idea of using
>>>>> __attribute__((enable_if(...))) for avoiding the mess of the macros in the
>>>>> builtin headers. AFAIK, the macros are currently used to make sure that the
>>>>> "immediates" are constant expressions, piggybacking on the constant
>>>>> expression check in the __builtin_* call.
>>>>>
>>>>> I've attached a file with a proof-of-concept for using
>>>>> __attribute__((enable_if(...))) for this purpose. I originally though using
>>>>> __builtin_constant_p in the enable_if, but that turns out to not be
>>>>> necessary (see the docs:
>>>>> http://clang.llvm.org/docs/AttributeReference.html#enable-if ; the
>>>>> enable_if condition fails for non-constant expressions anyway). The core is:
>>>>>
>>>>> // Current builtin headers:
>>>>> //
>>>>> //#define _mm256_insertf128_si256(V1, V2, M) __extension__ ({ \
>>>>> // (__m256i)__builtin_shufflevector( \
>>>>> // (__v4di)(V1), \
>>>>> // (__v4di)_mm256_castsi128_si256((__m128i)(V2)), \
>>>>> // (((M) & 1) ? 0 : 4), \
>>>>> // (((M) & 1) ? 1 : 5), \
>>>>> // (((M) & 1) ? 4 : 2), \
>>>>> // (((M) & 1) ? 5 : 3) );})
>>>>>
>>>>> // A bit cleaner.
>>>>> static __inline __attribute__((__always_inline__, __nodebug__))
>>>>> __m256i _mm256_insertf128_si256(__m256i __a, __m128i __b, int __imm8)
>>>>> __attribute__((enable_if(__imm8, "'__imm8' must be a constant")))
>>>>> {
>>>>> if (__imm8 & 1)
>>>>> return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b),
>>>>> 0, 1, 4, 5);
>>>>> else
>>>>> return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b),
>>>>> 4, 5, 2, 3);
>>>>> }
>>>>>
>>>>
>>>> I think:
>>>>
>>>> static __inline __attribute__((__always_inline__, __nodebug__))
>>>> __m256i _mm256_insertf128_si256(__m256i __a, __m128i __b, int __imm8)
>>>> __attribute__((enable_if(__imm8 & 1 == 1, "'__imm8' must be a
>>>> constant")))
>>>> {
>>>> return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 0,
>>>> 1, 4, 5);
>>>> }
>>>>
>>>> static __inline __attribute__((__always_inline__, __nodebug__))
>>>> __m256i _mm256_insertf128_si256(__m256i __a, __m128i __b, int __imm8)
>>>> __attribute__((enable_if(__imm8 & 1 == 0, "'__imm8' must be a
>>>> constant")))
>>>> {
>>>> return __builtin_shufflevector(__a, _mm256_castsi128_si256(__b), 4,
>>>> 5, 2, 3);
>>>> }
>>>>
>>>
>>> Unfortunately this approach doesn't scale to some of the more nontrivial
>>> ones like _mm256_alignr_epi8 in http://reviews.llvm.org/D8301 .
>>>
>>
>> Fair enough. I would still try to hoist out a simple if-statement though,
>> even if we can't do it for all of the builtins.
>>
>
> What is the advantage of hoisting it?
>
1) We won't emit a redundant branch instruction that the middle-end would
always remove
2) More reasonable code at -O0
3) No need for __builtin_constant_p usage (note that your original version
didn't work if __imm8 was 0).
And disadvantages: we'll produce less good diagnostics (extra notes), and
we'll need __attribute__((overloadable)) for this to work in C.
Nick, are you okay using enable_if for this? It's sort of a hack but if we
>>>>> are going to be carrying this attribute around forever (has it reached that
>>>>> level of compatibility guarantee yet?), we might as well use it to solve
>>>>> this problem for us.
>>>>>
>>>>
>>>> Attribute enable_if is here to stay, though it may change meaning in
>>>> corner cases (notably multiple enable_if's on a single function). Using it
>>>> in the compiler's own header files is fine. Does anyone ever try to take
>>>> the address of _mm256_insertf128_si256 (I assume not if it's a macro)?
>>>>
>>>
>>> Yeah, I think taking the address of these is a "don't do that" sort of
>>> thing. "static inline always_inline"
>>>
>>
>> You can still take the address of a static inline always_inline function
>> (at least with GCC 4.7), but I guess we're okay not allowing that.
>>
>
> Yeah, I sent before finishing my sentence, which I think was supposed to
> be something like: "static inline always_inline" reads to me like "we
> really really really don't want this to end up as a symbol in the object
> file", which implies "shouldn't have an address".
>
>
>>
>> What changes do you foresee in the case of multiple enable_if's? The
>>> current behavior (in my empirical testing; haven't used the source) seems
>>> to be a good fit for this use case.
>>>
>>
>> I don't know what we'd change it to, but we currently have a problem
>> where you end up with a combinatorial explosion in number of overloads
>> necessary to represent the check you want. I don't recall the situation off
>> the top of my head.
>>
>>
> Ok, I can see that. I guess we can cross that bridge when we need to.
>
> -- Sean Silva
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150416/a2f3ae70/attachment.html>
More information about the cfe-dev
mailing list