[cfe-commits] [PATCH] atomic operation builtins, part 1

Wed Oct 12 14:04:52 PDT 2011

On Oct 12, 2011, at 1:08 PM, Andrew MacLeod wrote:
> On 10/12/2011 01:57 PM, John McCall wrote:
>> On Oct 12, 2011, at 9:03 AM, Jeffrey Yasskin wrote:
>>> On Wed, Oct 12, 2011 at 6:31 AM, Andrew MacLeod <amacleod at redhat.com> wrote:
>>>> - language atomic types up to 16 bytes should be padded to an appropriate
>>>> size, and aligned properly.
>>>> - if memory matching one of the 5 'optimized' sizes isn't aligned properly,
>>>> results are undefined.
>>>> - if the size does not match one of the 5 specific routines, then the
>>>> library generic ABI can handle it.  There are no alignment guarantees, so I
>>>> presume it would end up being a locked implementation using hash tables and
>>>> addresses or something.
>>> The ABI library needs to demand alignment guarantees, or have them
>>> passed in, or it won't be able to support larger lock-free sizes on
>>> new architectures.
>> How aggressive are you suggesting we be about this?  If I make this type atomic:
>>   struct { float values[5]; };
>> do we really increase its size and alignment up to 32 bytes in the wild hope that the architecture will add 32-byte atomics someday?  If so, what's the limit?  If not, why is 16 the limit?
>> 
> I gather that's the direction he's leaning...
> 
> My own opinion is that we should probably just allow whatever size is desired, and if it doesn't map to one of the supported lock-free sizes, then the library can do whatever it wants with it, probably a locked implementation.  I would tend to make the compilers round atomic objects up to 4 and 8, and possibly even 16, since those sizes have a decent chance of being lock-free and avoiding any calls to a library.

To be clear, I completely agree that that should be our strategy.

> I doubt we'll see large arrays of large atomic objects.   (famous last words).

Well, we might;  that's certainly not an uncommon pattern with atomics.  I'm also worried about atomic objects being embedded in structs/classes, with the unexpectedly high-alignment object completely disturbing the packing.
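
A rough illustration of the packing concern, assuming a C11 compiler that supports 32-byte extended alignment. The type names are hypothetical, and `_Alignas(32)` merely stands in for whatever padding an atomic type might receive:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload: five floats, naturally aligned like a float. */
struct vec5 { float values[5]; };

/* The same payload over-aligned to 32 bytes, as a padded atomic might be.
   Raising the member's alignment raises the struct's alignment, and
   sizeof is rounded up to a multiple of that alignment. */
struct vec5_padded { _Alignas(32) float values[5]; };

/* Embedding each payload between two small fields. */
struct holder_plain  { char tag; struct vec5 v;        char flag; };
struct holder_padded { char tag; struct vec5_padded v; char flag; };

static_assert(_Alignof(struct vec5_padded) == 32,
              "over-aligned payload should have 32-byte alignment");
static_assert(sizeof(struct holder_padded) > sizeof(struct holder_plain),
              "over-alignment should grow the enclosing struct");
```

On a typical ABI with 4-byte float, holder_plain is 28 bytes while holder_padded is 96: the payload is pushed out to offset 32, and the trailing char forces another full 32-byte unit.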

My other worry about high alignments is that a lot of ecosystems do not support them effectively.  malloc and operator new, for example, only promise to return something sufficiently aligned for all fundamental types;  in practice, I think most implementations only guarantee 16-byte alignment, and I'm sure there are some that only provide 4-byte or 8-byte alignment.  I'd also worry about static objects being adequately aligned, although that might just be my paranoia.

> For objects where it matters, we can probably detect alignment after the fact by looking at the pointer value... you should be able to tell whether a pointer to a 32-byte object points to a 32-byte boundary or not.

Yeah, that's pretty simple.
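
For concreteness, the pointer-value check could look something like this. The helper name is made up for illustration, and it assumes the alignment is a power of two:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if p sits on an `align`-byte boundary.
   `align` must be a nonzero power of two, so the low bits of the address
   can be tested with a simple mask. */
static bool is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

A generic routine could use a test like this to decide between a lock-free path and a locked fallback, at the cost of one extra branch per call.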

> The compiler built-ins can always be called directly with arbitrary sized objects, so it would be good for the generic routine to handle it rather than artificially restrict it.

Are you saying that the generic routine promises co-operation with lock-free atomics when the size parameter is sufficiently small?  That seems unfortunate.  The generic routine is certainly going to be slower than lock-free atomics, but that doesn't mean its performance is unimportant;  a good implementation using striped spin-locks would probably end up on the order of only 2-4 times slower than the lock-free code, so adding a bunch of pre-checks may be quite significant.
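
To make "striped spin-locks" concrete, here is a minimal sketch of a locked generic routine using C11 atomics. Every name, the stripe count, and the address hash are assumptions made for illustration; this is not the actual library ABI, and it makes no attempt to co-operate with lock-free accesses to the same object:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_STRIPES 64  /* assumption: any reasonable table size would do */

/* Static atomics are zero-initialized, which here means "unlocked". */
static atomic_int stripes[NUM_STRIPES];

/* Pick a lock from the object's address. Every access to one object
   passes the same address, so it always lands on the same stripe. */
static atomic_int *stripe_for(const void *obj)
{
    return &stripes[((uintptr_t)obj >> 4) % NUM_STRIPES];
}

static void stripe_lock(atomic_int *l)
{
    while (atomic_exchange_explicit(l, 1, memory_order_acquire) != 0)
        ;  /* spin until the previous holder releases */
}

static void stripe_unlock(atomic_int *l)
{
    atomic_store_explicit(l, 0, memory_order_release);
}

/* Hypothetical generic load: copy `size` bytes out under the stripe lock. */
static void generic_load(size_t size, const void *obj, void *ret)
{
    atomic_int *l = stripe_for(obj);
    stripe_lock(l);
    memcpy(ret, obj, size);
    stripe_unlock(l);
}

/* Hypothetical generic compare-exchange: bytewise compare, then either
   install `desired` or report the current value back through `expected`. */
static bool generic_compare_exchange(size_t size, void *obj,
                                     void *expected, const void *desired)
{
    atomic_int *l = stripe_for(obj);
    stripe_lock(l);
    bool ok = memcmp(obj, expected, size) == 0;
    if (ok)
        memcpy(obj, desired, size);
    else
        memcpy(expected, obj, size);
    stripe_unlock(l);
    return ok;
}
```

Note that the fast path is just a hash, an exchange, a memcpy, and a store, which is why extra size and alignment pre-checks in front of it would be a noticeable fraction of the total cost.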

>> Honestly, I don't think future-proofing against arbitrary new atomic instructions really makes any sense.  Even going up to 16 bytes (on architectures where that can't be done lock-free now) worries me a bit.
>> 
>> Rounding up also worries me, since the user has no control over the padding bytes, but they can still cause spurious failures on, say, compare-and-swap.
> 
> If the padding is under the control of the 'atomic' keyword for the type, then we have complete control over those padding bytes. Regardless of what junk might be in them, they are part of the atomic data structure and the only way to access them is through a full atomic access. It's like adding another user field to the structure and not setting it. It shouldn't cause spurious failures.

Well, we at least have to make sure that our atomic operations always zero-pad their operands.  For example, if we do an atomic store into a 5-byte struct that we've padded to 8 bytes, we have to make sure we store a zero pattern into the pad.  That's feasible, but it's complexity that we should at least acknowledge before committing to it.
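
As a sketch of what that zero-padding might involve, assume the 5-byte struct is stored in an 8-byte atomic container. The type and function names below are hypothetical, and a real compiler would do this in its lowering rather than through helper functions:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct five { unsigned char bytes[5]; };  /* hypothetical 5-byte payload */

/* Widen to the 8-byte container, forcing the 3 pad bytes to zero. */
static uint64_t widen(struct five v)
{
    uint64_t w = 0;
    memcpy(&w, &v, sizeof v);
    return w;
}

/* Recover the payload from the container, discarding the pad bytes. */
static struct five narrow(uint64_t w)
{
    struct five v;
    memcpy(&v, &w, sizeof v);
    return v;
}

static void five_store(_Atomic uint64_t *obj, struct five v)
{
    atomic_store(obj, widen(v));  /* the pad is always written as zero */
}

static struct five five_load(_Atomic uint64_t *obj)
{
    return narrow(atomic_load(obj));
}

/* Both operands are zero-padded before the 8-byte compare, so the pad
   bytes cannot cause a mismatch as long as every store goes through
   five_store or five_cas. */
static bool five_cas(_Atomic uint64_t *obj, struct five *expected,
                     struct five desired)
{
    uint64_t e = widen(*expected);
    bool ok = atomic_compare_exchange_strong(obj, &e, widen(desired));
    if (!ok)
        *expected = narrow(e);
    return ok;
}
```

The invariant only holds if nothing ever writes the container by another route; a memset or memcpy of just the first 5 bytes over an object with a nonzero pad would break it.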

I can also see this exposing lots of what are, admittedly, source bugs, like only zeroing the first sizeof(T) bytes of an _Atomic(T).

John.