[cfe-dev] Clang builtins for C++20 STL features

Wed May 1 18:05:52 PDT 2019

On Sat, 27 Apr 2019 at 05:42, Casey Carter <cartec69 at gmail.com> wrote:
> As a heads up: we will likely be shipping __builtin_u8memchr(), __builtin_u8memcmp(), and __builtin_u8strlen() in Visual Studio 2019 16.2. It would be nice if char_traits<char8_t> could use the intrinsics on Clang instead of falling back to the suboptimal hand-rolled loops.

With what types? Does u8memchr take an int for its second argument
like memchr, or does it take a char8_t like wmemchr takes a wchar_t?
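
For reference, the two existing conventions differ as follows. This is only a minimal sketch using the standard library functions; find_byte and find_wide are hypothetical helper names, and nothing here says which convention the u8 builtins follow.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// memchr convention: the needle is an int, which the function converts
// to unsigned char before comparing against each byte.
inline std::ptrdiff_t find_byte(const char* s, std::size_t n, int needle) {
    const void* hit = std::memchr(s, needle, n);
    return hit ? static_cast<const char*>(hit) - s : -1;
}

// wmemchr convention: the needle has the element type itself (wchar_t).
inline std::ptrdiff_t find_wide(const wchar_t* s, std::size_t n, wchar_t needle) {
    const wchar_t* hit = std::wmemchr(s, needle, n);
    return hit ? hit - s : -1;
}
```

With the int convention, an out-of-range needle such as 0x100 + 'l' still matches 'l' because of the narrowing conversion; with the element-type convention that question does not arise.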

> On Thu, Nov 29, 2018 at 7:52 PM Stephan T. Lavavej via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>
>> > Approximately following the naming convention of wcscmp etc, maybe __builtin_u8scmp, __builtin_u8slen, __builtin_u8scpy, ...?
>>
>>
>>
>> Actually, memcmp and memchr are non-null-terminated while strlen is null-terminated, so the names should reflect that. Existing names:
>>
>> __builtin_memcmp()
>> __builtin_strlen()
>> __builtin_char_memchr()
>>
>> __builtin_wmemcmp()
>> __builtin_wcslen()
>> __builtin_wmemchr()
>>
>> How about the following for char8_t, plus char16_t/char32_t if those are implemented? (This differs from my suggestion in LLVM#35165.)
>>
>> __builtin_u8memcmp()
>> __builtin_u8strlen()
>> __builtin_u8memchr()
>>
>> __builtin_u16memcmp()
>> __builtin_u16strlen()
>> __builtin_u16memchr()
>>
>> __builtin_u32memcmp()
>> __builtin_u32strlen()
>> __builtin_u32memchr()
>>
>>
>> Thanks,
>>
>> STL
>>
>>
>>
>> From: Stephan T. Lavavej
>> Sent: Thursday, November 29, 2018 7:22 PM
>> To: 'Richard Smith' <richard at metafoo.co.uk>; 'Erik Pilkington' <erik.pilkington at gmail.com>
>> Cc: Clang Dev <cfe-dev at lists.llvm.org>
>> Subject: RE: [cfe-dev] Clang builtins for C++20 STL features
>>
>>
>>
>> [Erik Pilkington]
>>
>> > Not formally, but I'm working on a patch that uses the __builtin_bit_cast(To, value) spelling, so let's just go with that.
>>
>>
>>
>> [Richard Smith]
>>
>> > (I need to keep reminding myself: this can't be __builtin_bit_cast(&dest, &src) because the To type might not be default-constructible.)
>>
>> > Unless someone wants to provide a counterargument, let's go with __builtin_bit_cast(To, value).
>>
>> > constexpr T __builtin_bit_cast(typename T, const U &src)
>>
>> > Effects: Bit-cast the value of src to type T. Ill-formed if T and U are of different sizes. Only guaranteed to be usable in constant expressions in the conditions specified for std::bit_cast.
>>
>>
>>
>> Great, thanks!
>>
>>
>>
>> > For us (and I'd guess for GCC), __builtin_is_constant_evaluated() would be the most natural choice.
>>
>>
>>
>> Got it.
>>
>>
>>
>> > These seem low-priority given std::is_constant_evaluated(), but I think it might still be nice to have the builtins even if you don't formally need them. Our __builtin_mem* and __builtin_str* are a lot faster to evaluate than the equivalent hand-rolled C++ code would be.
>>
>>
>>
>> Ah, throughput is an excellent reason. (I can already imagine generated code stressing constexpr char_traits<char8_t> with enormous strings.)
>>
>>
>>
>> > (Clang has a __has_builtin builtin macro to allow these to be detected and used if available. Does MSVC have anything similar?)
>>
>>
>>
>> Not at the moment.
>>
>>
>>
>> > Approximately following the naming convention of wcscmp etc, maybe __builtin_u8scmp, __builtin_u8slen, __builtin_u8scpy, ...?
>>
>>
>>
>> Nice and systematic.
>>
>>
>>
>> > (Should we also add __builtin_u16s* and __builtin_u32s* while we're here?)
>>
>>
>>
>> That's https://bugs.llvm.org/show_bug.cgi?id=35165 "Consider providing string builtins for char16_t" which I filed last year.
>>
>>
>>
>> > I think this should be done in-place in memory; producing a copy has the problem that you're passing around a value of type T, and that might permit the padding bits to become undefined again.
>>
>> > void __builtin_clear_padding(T *ptr)
>>
>> > Effects: Set to zero all bits that are padding bits in the representation of every value of type T.
>>
>>
>>
>> Great, I agree with the rationale.
>>
>>
>>
>> > Do we need to allow this to be called in constant expressions?
>>
>>
>>
>> No, because atomic isn't constexpr.
>>
>>
>>
>> > I would generally prefer that we expose traits that exactly match the library requirements with the same name as the library trait with a leading dunder, with an argument list matching the library trait. So:
>>
>> > Add __is_convertible(From, To) and __is_nothrow_convertible(From, To)
>>
>> > Make __is_convertible_to a (deprecated) synonym for __is_convertible.
>>
>>
>>
>> Sounds good (although switching intrinsics is a headache due to getting all compilers updated, we can do it).
>>
>>
>>
>> > Well, Clang already supports __assume() for MSVC compatibility (only in MS mode) and __builtin_assume() (our preferred spelling, available in general). But a general assume intrinsic is probably not the best choice here.
>>
>> > Clang and GCC also already have:
>>
>> > void *__builtin_assume_aligned(const void *p, size_t align, size_t offset_from_aligned = 0)
>>
>> > See https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
>>
>> > That's not ideal because it's not const-correct.
>>
>> > Even for library wording, the spec is weak on specifying when a call to std::assume_aligned is allowed in constant expressions. For Clang at least, we treat a call to __builtin_assume_aligned as non-constant if we cannot prove the object is suitably aligned (that is, if the complete object's alignment and the offset within it don't result in a suitable alignment) even if we happen to know that (for instance) a global variable will typically be aligned to 16 bytes in practice.
>>
>>
>>
>> I'll mark this as "more design needed", then.
>>
>>
>>
>> > How about:
>>
>> > * invocation type traits from Library Fundamentals V1
>>
>> > * source_location from Library Fundamentals V2
>>
>>
>>
>> Ah, good catches. We aren't implementing LibFun at the moment so I don't need intrinsics for these (C++23 at the earliest). If Clang chooses anything, we'll follow that precedent.
>>
>>
>>
>> I'll send these names/interfaces over to the C1XX team (and I'll ask them to comment here if anything is truly unacceptable, although it all sounds good to me).
>>
>>
>>
>> Thanks!
>>
>> STL
>>
>>
>>
>> From: Richard Smith <richard at metafoo.co.uk>
>> Sent: Thursday, November 29, 2018 6:53 PM
>> To: Stephan T. Lavavej <stl at exchange.microsoft.com>
>> Cc: Clang Dev <cfe-dev at lists.llvm.org>
>> Subject: Re: [cfe-dev] Clang builtins for C++20 STL features
>>
>>
>>
>> On Thu, 29 Nov 2018 at 17:55, Stephan T. Lavavej via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>>
>> Hi Clang devs,
>>
>> WG21 has voted in a bunch of C++20 STL features that need compiler support via builtins/intrinsics. As usual, MSVC's STL would like to use identically-named builtins for Clang, C1XX, and EDG, so I wanted to ask if you've chosen any names (and interfaces) yet. Also as usual, I have utterly no opinion about naming - any name that gets the compiler to do my work for me is amazing (as long as all compilers are consistent). :-)
>>
>> * P0595R2 std::is_constant_evaluated()
>> Should this be __is_constant_evaluated() or __builtin_is_constant_evaluated() or something else?
>>
>>
>>
>> For us (and I'd guess for GCC), __builtin_is_constant_evaluated() would be the most natural choice.
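
As an illustration of how a library might use this builtin, here is a minimal sketch that dispatches between a constexpr-friendly loop and the runtime library call. The function name length_sketch is hypothetical, and the __has_builtin fallback macro is an assumption for compilers that lack that detection facility.

```cpp
#include <cstddef>
#include <cstring>

// Assumption: compilers without __has_builtin are treated as having no builtins.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: uses std::strlen at run time and a plain loop
// during constant evaluation (or when the builtin cannot be detected).
constexpr std::size_t length_sketch(const char* s) noexcept {
#if __has_builtin(__builtin_is_constant_evaluated)
    if (!__builtin_is_constant_evaluated()) {
        return std::strlen(s);  // runtime path: optimized library routine
    }
#endif
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

static_assert(length_sketch("abc") == 3, "usable in constant expressions");
```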
>>
>>
>>
>> * P0482R6 char8_t
>> Given std::is_constant_evaluated(), we might not need anything new here. Otherwise, should there be analogues of __builtin_memcmp(), __builtin_strlen(), and __builtin_char_memchr() for constexpr char_traits<char8_t>?
>>
>>
>>
>> These seem low-priority given std::is_constant_evaluated(), but I think it might still be nice to have the builtins even if you don't formally need them. Our __builtin_mem* and __builtin_str* are a lot faster to evaluate than the equivalent hand-rolled C++ code would be. (Clang has a __has_builtin builtin macro to allow these to be detected and used if available. Does MSVC have anything similar?)
>>
>>
>>
>> Approximately following the naming convention of wcscmp etc, maybe __builtin_u8scmp, __builtin_u8slen, __builtin_u8scpy, ...? (Should we also add __builtin_u16s* and __builtin_u32s* while we're here?)
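
The __has_builtin detection mentioned above can be sketched with an existing builtin. The example uses __builtin_wcslen as a stand-in, since the u8/u16/u32 names were still under discussion; wide_length_sketch is a hypothetical name and the fallback macro is an assumption for compilers without __has_builtin.

```cpp
#include <cstddef>

// Assumption: compilers without __has_builtin are treated as having no builtins.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: prefer the compiler builtin when it is detected,
// otherwise fall back to the hand-rolled loop.
inline std::size_t wide_length_sketch(const wchar_t* s) noexcept {
#if __has_builtin(__builtin_wcslen)
    return __builtin_wcslen(s);
#else
    std::size_t n = 0;
    while (s[n] != L'\0') {
        ++n;
    }
    return n;
#endif
}
```

The same guard pattern would apply to a char8_t builtin under whichever name is finally chosen.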
>>
>>
>>
>> * P0476R2 std::bit_cast()
>> This came up a month ago, where Richard Smith suggested __builtin_bit_cast(To, value) or __bit_cast<To>(value), preferring the former (for C friendliness). Was a final name chosen?
>>
>>
>>
>> (I need to keep reminding myself: this can't be __builtin_bit_cast(&dest, &src) because the To type might not be default-constructible.)
>>
>>
>>
>> Unless someone wants to provide a counterargument, let's go with __builtin_bit_cast(To, value).
>>
>>
>>
>> constexpr T __builtin_bit_cast(typename T, const U &src)
>>
>> Effects: Bit-cast the value of src to type T. Ill-formed if T and U are of different sizes. Only guaranteed to be usable in constant expressions in the conditions specified for std::bit_cast.
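
A minimal sketch of a library wrapper over this interface, assuming the __builtin_bit_cast(To, value) spelling above. The name bit_cast_sketch and the memcpy fallback are assumptions for illustration; note that the fallback needs a default-constructible To, which is exactly the limitation the builtin avoids.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Assumption: compilers without __has_builtin are treated as having no builtins.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical wrapper around the builtin, with a memcpy-based fallback.
template <class To, class From>
To bit_cast_sketch(const From& src) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                      std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
#if __has_builtin(__builtin_bit_cast)
    return __builtin_bit_cast(To, src);
#else
    // Fallback only works when To can be default-constructed.
    static_assert(std::is_default_constructible<To>::value,
                  "fallback requires a default-constructible destination");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
#endif
}
```

For example, on a platform with IEEE 754 single-precision float, bit_cast_sketch<std::uint32_t>(1.0f) yields 0x3F800000.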
>>
>>
>>
>> * P0528R3 Atomic Compare-And-Exchange With Padding Bits
>> We need compiler magic here, in some form. Billy O'Neal wrote to the C1XX team: "To implement the new atomic_ref as well as the change to compare the value representation of atomics only, the library needs a way to zero out the padding in arbitrary T, which we can't discover with library tech alone. We would like an intrinsic that accepts a trivially-copyable T and produces a copy with the padding zeroed, or takes a T* and zeros the padding inside that T, or similar."
>>
>>
>>
>> I think this should be done in-place in memory; producing a copy has the problem that you're passing around a value of type T, and that might permit the padding bits to become undefined again.
>>
>>
>>
>> void __builtin_clear_padding(T *ptr)
>>
>> Effects: Set to zero all bits that are padding bits in the representation of every value of type T.
>>
>>
>>
>> Do we need to allow this to be called in constant expressions?
>>
>>
>>
>> * P0758R1 std::is_nothrow_convertible
>> This can be implemented without an intrinsic (std::is_nothrow_invocable_r already demands it; std::is_convertible plus noexcept plus library cleverness works), but an intrinsic is higher throughput (and simpler for third-party libraries that want to imitate the STL without using the STL for whatever reason). MSVC's spelling for the plain trait is __is_convertible_to(From, To); should the new trait be __is_nothrow_convertible_to(From, To) or __is_nothrow_convertible(From, To)?
>>
>>
>>
>> I would generally prefer that we expose traits that exactly match the library requirements with the same name as the library trait with a leading dunder, with an argument list matching the library trait. So:
>>
>> Add __is_convertible(From, To) and __is_nothrow_convertible(From, To)
>>
>> Make __is_convertible_to a (deprecated) synonym for __is_convertible.
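
For comparison, the library-only approach mentioned above (std::is_convertible plus noexcept) can be sketched as follows. All names here (implicit_convert_to, is_nothrow_convertible_sketch, ThrowingTarget, NothrowTarget) are hypothetical, and this is one possible formulation rather than any particular STL's implementation.

```cpp
#include <type_traits>
#include <utility>

// Declared only: calling it with a From argument forces an implicit
// conversion to To in the caller, so noexcept(...) reflects that conversion.
template <class To>
void implicit_convert_to(To) noexcept;

// Primary template: From is not convertible to To at all.
template <class From, class To,
          bool = std::is_convertible<From, To>::value,
          bool = std::is_void<To>::value>
struct is_nothrow_convertible_sketch : std::false_type {};

// Convertible to void (only possible from void): trivially non-throwing.
template <class From, class To>
struct is_nothrow_convertible_sketch<From, To, true, true> : std::true_type {};

// Convertible to a non-void type: ask whether the conversion can throw.
template <class From, class To>
struct is_nothrow_convertible_sketch<From, To, true, false>
    : std::integral_constant<
          bool, noexcept(implicit_convert_to<To>(std::declval<From>()))> {};

// Demo types (hypothetical): one converting constructor may throw, one may not.
struct ThrowingTarget { ThrowingTarget(int); };
struct NothrowTarget { NothrowTarget(int) noexcept; };

static_assert(!is_nothrow_convertible_sketch<int, ThrowingTarget>::value, "");
static_assert(is_nothrow_convertible_sketch<int, NothrowTarget>::value, "");
```

An intrinsic would replace the three templates with a single compiler query, which is where the throughput gain comes from.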
>>
>>
>>
>> * P1007R3 std::assume_aligned()
>> MSVC supports a general __assume() although I'm unsure if it's applicable/desirable here. Should there be a dedicated builtin?
>>
>>
>>
>> Well, Clang already supports __assume() for MSVC compatibility (only in MS mode) and __builtin_assume() (our preferred spelling, available in general). But a general assume intrinsic is probably not the best choice here.
>>
>>
>>
>> Clang and GCC also already have:
>>
>>
>>
>> void *__builtin_assume_aligned(const void *p, size_t align, size_t offset_from_aligned = 0)
>>
>> See https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
>>
>>
>>
>> That's not ideal because it's not const-correct.
>>
>>
>>
>> Even for library wording, the spec is weak on specifying when a call to std::assume_aligned is allowed in constant expressions. For Clang at least, we treat a call to __builtin_assume_aligned as non-constant if we cannot prove the object is suitably aligned (that is, if the complete object's alignment and the offset within it don't result in a suitable alignment) even if we happen to know that (for instance) a global variable will typically be aligned to 16 bytes in practice.
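
The const-correctness point can be illustrated with a thin typed wrapper over the existing builtin. This is a minimal sketch; assume_aligned_sketch, aligned_demo, and sum4_aligned are hypothetical names, and the wrapper simply returns the pointer unchanged when the builtin cannot be detected.

```cpp
#include <cstddef>

// Assumption: compilers without __has_builtin are treated as having no builtins.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical wrapper: the builtin takes const void* and returns void*,
// so the cast back to T* restores the caller's original constness.
template <std::size_t N, class T>
T* assume_aligned_sketch(T* p) noexcept {
    static_assert(N != 0 && (N & (N - 1)) == 0, "alignment must be a power of two");
#if __has_builtin(__builtin_assume_aligned)
    return static_cast<T*>(__builtin_assume_aligned(p, N));
#else
    return p;  // no hint available; behavior is unchanged
#endif
}

// Demo data that really is 16-byte aligned, so the assumption holds.
alignas(16) static const int aligned_demo[4] = {1, 2, 3, 4};

// The caller promises that p is 16-byte aligned.
inline int sum4_aligned(const int* p) noexcept {
    const int* q = assume_aligned_sketch<16>(p);
    return q[0] + q[1] + q[2] + q[3];
}
```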
>>
>>
>>
>> * I think this list is complete but I might be missing some features.
>>
>>
>>
>> How about:
>>
>> * invocation type traits from Library Fundamentals V1
>>
>> * source_location from Library Fundamentals V2
>>