[cfe-dev] Clang builtins for C++20 STL features

Tue Dec 11 17:37:20 PST 2018

FWIW, I've always thought that the simplest solution is just to expose
something like "std::is_trivially_equality_comparable_v<T>" (which would be
true whenever comparing objects of type T for equality could be done as-if
by memcpy), and then constrain "std::atomic<T>::compare_exchange_foo" to be
provided iff "std::is_trivially_equality_comparable_v<T>".  P0528 was
already accepted and merged by the time I started talking about this,
though.
"std::is_trivially_equality_comparable_v<T>" would be similar, but IIRC not
100% identical, to the existing
"std::has_unique_object_representations_v<T>".  As of C++2a, it is
detectable by the compiler even for class types (as long as the programmer
=default's their type's operator==).

Anyway, I don't think programmers have any business trying to
compare-and-swap structs with padding bits. (Or floats.)

–Arthur

On Tue, Dec 11, 2018 at 6:51 PM James Y Knight via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> After considering this a bit, I'm not sure the __builtin_clear_padding
> proposal is workable. Worse, I'm not sure that the standards proposal
> itself is workable, either.
>
> 1. Regarding general implementability of P0528:
>
> Given a call to
>   bool atomic_compare_exchange_strong(std::atomic<T>* obj, T* expected, T
> desired)
> we need to first, compare OBJ to EXPECTED -- ignoring padding. Then:
> a. if they're equal, store DESIRED into the bytes pointed to by OBJ.
> b. if they're not equal, copy the bytes pointed to by OBJ into the bytes
> pointed to by EXPECTED.
>
> The problem is: how to do a comparison ignoring padding? We cannot
> reasonably modify the comparison operation itself (it's either a hardware
> instruction, or else a library routine taking only a size). The obvious
> solution is to clear the padding on both sides of the comparison (thus
> __builtin_clear_padding).
>
> We can clear the padding in EXPECTED, although we'll need to make a copy
> first, as we're not allowed to store into it unless the compare-exchange
> fails. But, we cannot modify OBJ except as part of the "exchange", when the
> comparison has already succeeded.
>
> If we can ensure that the existing value in OBJ has always had its padding
> cleared, we're okay. We can be fully in-control of all stores that occur
> into a std::atomic, and, I believe that's true for C11 _Atomic types as
> well. (it's unclear to me whether using memcpy to non-atomically overwrite
> a C11 "_Atomic struct Foo" is allowable, but I'll assume that it is not.)
>
> However, it seems infeasible to make this guarantee for std::atomic_ref,
> as that can be created pointing to any existing object. Said existing
> object may have its padding in a non-zero state. Of course, <
> http://wg21.link/P0528r3> doesn't mention atomic_ref, but the atomic_ref
> paper, <http://wg21.link/P0019r8> mentions this concern, and simply
> states "We require that the resolution of padding bits follow [P0528r2]."
>
> It's also infeasible to ensure for the GCC __atomic_compare_exchange
> builtin, as it also operates on arbitrary existing objects. (Of course,
> this is irrelevant to the C++ standard.)
>
> I believe a workable alternative would be to generate an extra load of OBJ
> into a temporary memory, copy only the non-padding bytes of EXPECTED on top
> of that, and then do the compare_exchange operation of OBJ with the
> temporary. Finally, copy the temporary back into EXPECTED upon a fail
> result. While that seems like it could work, requiring an extra load before
> compare-exchange operations is rather poor.
>
> Which leads into...
>
> 2. Do we really need to make this change?
>
> The initial paper gives a sample program in http://wg21.link/P0528r0
> asserting that it might infinite loop. That would be quite unfortunate.
>
> Repeating the example here, it's basically:
> ```
>   T desired = ...;
>   T expected;
>   while (
>     !atomic->compare_exchange_strong(
>       expected,
>       desired // Padding bits added and removed here
>   ));
> ```
>
> However, that padding bits in DESIRED may be modified is not super
> problematic. As long as padding bits in EXPECTED (passed by reference) are
> not spuriously-modified, we should be fine.
>
> That is, in practice, I believe the loop should execute at most twice, not
> infinitely. Upon the first failure, all the bits (including padding)
> pointed to by ATOMIC are copied into the object pointed to by EXPECTED (a
> reference parameter). At the next iteration of the loop, the call will
> compare all the bits, including the padding, which was copied as-is from
> OBJ. As nothing writes to EXPECTED between the two calls, the padding bits
> will then remain unchanged, and therefore the comparison will succeed on
> the second iteration.
>
> However, Richard Smith informs me that while the above is true in Clang,
> apparently in theory it is broken -- nothing in the C++ standard prohibits
> padding bits of an object from be discarded or modified arbitrarily, at any
> time! That is, according to the spec (but not Clang), the below function
> "foo" could arbitrarily return non-zero, randomly! (I used "my_mem*" simply
> for demonstrating that even if memcpy/memcmp calls themselves are simply
> function calls and not specially-handled, the problem remains.). One could
> imagine this occurring in a non-pathologically-evil compiler, if it were to
> copy EXPECTED into registers and then back into memory between the two
> calls -- and while doing so, only copies the non-padding parts of the
> object, for efficiency.
>
> ```
> struct Foo { char x; /* padding exists here */ long y; };
>
> int foo(char *mem) {
>   Foo expected;
>   my_memcpy((char*)&expected, mem, sizeof(Foo));
>   return my_memcmp((char*)&expected, mem, sizeof(Foo));
> }
> ```
>
> The C standard does appear to prohibit this behavior (C11 6.2.6.1/6),
> specifying that the padding bytes get set to unspecified values only upon
> writes to the object.
>
> In summary: in C++, it would appear that the motivating example can in
> theory infinite-loop, but in Clang, and the C standard, it'd seem that it
> cannot.
>
> Changing C++ to require the same semantics for padding stability as C
> could be a reasonable alternative resolution to the original issue,
> especially if no implementations actually make use of this allowance today.
> You may still have a single spurious failure, but it would never
> infinite-loop.
>
>
> 3. Finally, if the decision is to go forward with P0528 anyways, and if
> padding bits *can* arbitrarily change, then `__builtin_clear_padding(T*
> ptr)` is not a theoretically sound interface, because the padding bits
> could change again, after the call to clear them returns.
>
> We'd need to either implement the support within the atomic builtins
> themselves (which, as noted, cannot be done within the existing GCC
> builtins), or define some alternative interface.
>
>
> On Fri, Nov 30, 2018 at 4:13 PM JF Bastien <jfbastien at apple.com> wrote:
>
>>
>>
>> On Nov 30, 2018, at 1:10 PM, James Y Knight <jyknight at google.com> wrote:
>>
>>
>>
>> On Fri, Nov 30, 2018 at 4:06 PM JF Bastien <jfbastien at apple.com> wrote:
>>
>>>
>>>
>>> On Nov 30, 2018, at 12:39 PM, Casey Carter via cfe-dev <
>>> cfe-dev at lists.llvm.org> wrote:
>>>
>>> On Fri, Nov 30, 2018 at 12:35 PM James Y Knight via cfe-dev <
>>> cfe-dev at lists.llvm.org> wrote:
>>>
>>>> On Thu, Nov 29, 2018, 9:53 PM Richard Smith via cfe-dev <
>>>> cfe-dev at lists.llvm.org wrote:
>>>>
>>>>> * P0528R3 Atomic Compare-And-Exchange With Padding Bits
>>>>>> We need compiler magic here, in some form. Billy O'Neal wrote to the
>>>>>> C1XX team: "To implement the new atomic_ref as well as the change to
>>>>>> compare the value representation of atomics only, the library needs a way
>>>>>> to zero out the padding in arbitrary T, which we can't discover with
>>>>>> library tech alone. We would like an intrinsic that accepts a
>>>>>> trivially-copyable T and produces a copy with the padding zeroed, or takes
>>>>>> a T* and zeros the padding inside that T, or similar."
>>>>>>
>>>>>
>>>>> I think this should be done in-place in memory; producing a copy has
>>>>> the problem that you're passing around a value of type T, and that might
>>>>> permit the padding bits to become undefined again.
>>>>>
>>>>> void __builtin_clear_padding(T *ptr)
>>>>> Effects: Set to zero all bits that are padding bits in the
>>>>> representation of every value of type T.
>>>>>
>>>>
>>>> Is the intent here that every value stored into a std::atomic (e.g. via
>>>> store, exchange, or compare_exchange) would be passed through
>>>> __builtin_clear_padding first, before being stored into the atomic object?
>>>> And presumably the same would need to occur implicitly for C-style `_Atomic
>>>> T`?
>>>>
>>>
>>> Yes; see P0582R3 "The Curious Case of Padding Bits" (
>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0528r3.html)
>>> for details. The intent is to make compare-exchange work for types with
>>> padding bits by zeroing those padding bits before storing a value in an
>>> atomic or comparing with the value stored in an atomic.
>>>
>>>
>>> I think you rather want to zero out padding bits in the atomic, not in
>>> the value, and you want to do it after storing into the atomic (unless
>>> you’re doing zeroing, and then copying the non-padding elements one at a
>>> time, in which case you could just memset instead of having a builtin).
>>>
>>
>> Errr...certainly you can't do additional stores into an atomic after
>> storing a new value into it, that would kinda ruin the "atomic" part.
>>
>>
>> Haha yeah you’re totally right! Ignore me, no idea what I was thinking.
>>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20181211/6cfaece2/attachment.html>