[cfe-commits] [PATCH] atomic operation builtins, part 1

Wed Oct 12 14:54:57 PDT 2011

On 10/12/2011 05:04 PM, John McCall wrote:
>
>
>> The compiler built-ins can always be called directly with arbitrary sized objects, so it would be good for the generic routine to handle it rather than artificially restrict it.
> Are you saying that the generic routine promises co-operation with lock-free atomics when the size parameter is sufficiently small?  That seems unfortunate.  The generic routine is certainly going to be slower than lock-free atomics, but that doesn't mean its performance is unimportant;  a good implementation using striped spin-locks would probably end up on the order of only 2-4 times slower than the lock-free code, so adding a bunch of pre-checks may be quite significant.
>
The rule is that on a given target, a given object size is either A) 
lockfree or B) unknown.

so if the compiler always generates lock free instructions for say 8 
byte values, the library will have to as well.  ie,  __atomic_load_8() 
will also be implemented as lock free. it has to be or things 
break.        The interface from the compiler point of view is that you 
simply call the routine __atomic_load(obj), the presence of 
__atomic_load_8 is transparent to the caller.  Normally the compiler 
will turn __atomic_load() into the direct call,  but I don't see any 
reason to force the compiler do that. It should be free to leave the 
original call to __atomic_load () and that code should co-exist with 
stuff that did call __atomic_load_8.   its free to do whatever it wants 
with sizes that don't exactly match the 5 'optimal' sizes.

It does mean that the library entry point for atomic_load() may start 
with a switch on the size of the object to  jump to the "efficient" 
routines if size matches, but thats about it.  Is that what you are 
concerned about?

>>> Honestly, I don't think future-proofing against arbitrary new atomic instructions really makes any sense.  Even going up to 16 bytes (on architectures where that can't be done lock-free now) worries me a bit.
>>>
>>> Rounding up also worries me, since the user has no control over the padding bytes, but they can still cause spurious failures on, say, compare-and-swap.
>> If the padding is under control of the 'atomic' keyword for the type, then we have complete control over those padding bytes. Regardless of what junk might be in them, they are part of the atomic data structure and the only way to access them is through a full atomic access. Its like adding another user field to the structure and not setting it. It shouldn't cause spurious failures.
> Well, we at least have to make sure that our atomic operations always zero-pad their operands.  For example, if we do an atomic store into a 5-byte struct that we've padded to 8 bytes, we have to make sure we store a zero pattern into the pad.  That's feasible, but it's complexity that we should at least acknowledge before committing to it.
>
> I can also see this exposing lots of what are, admittedly, source bugs, like only zero'ing the first sizeof(T) bytes of an _Atomic(T).

I'm still not sure I see how it matters. Nothing should ever access or 
change that padding field in an unfriendly way since it is always part 
of the atomic word that is load/stored, exchanged or compare_exchanged. 
Those are the only 4 ways to access the memory.  All the fetch_op's only 
operate on full integral values, so thats not a concern.   
Compare_exchange is the only one which it could be affected, and it 
requires that the 'expected' value be from an atomic load or 
exchange...   So I'm not sure how it could be affected.  If someone goes 
off and
  accesses atomic memory on their own, well... you reap what you sow. 
Thats undefined behaviour.

Andrew