[cfe-commits] [PATCH] atomic operation builtins, part 1

Fri Oct 14 15:02:07 PDT 2011

Resending due to list auto-discard.

On 10/12/11, Lawrence Crowl <crowl at google.com> wrote:
> On 10/12/11, John McCall <rjmccall at apple.com> wrote:
>> On Oct 12, 2011, at 2:55 PM, Jeffrey Yasskin wrote:
>> > On Oct 12, 2011 John McCall <rjmccall at apple.com> wrote:
>> > > How aggressive are you suggesting we be about this?  If I
>> > > make this type atomic:
>> > >
>> > > struct { float values[5]; };
>> > >
>> > > do we really increase its size and alignment up to 32 bytes
>> > > in the wild hope that the architecture will add 32-byte
>> > > atomics someday?  If so, what's the limit?  If not, why is
>> > > 16 the limit?
>> >
>> > The goal was that architectures could add new atomic
>> > instructions without forcing an ABI change. Changing the size of
>> > atomic<FiveFloats> would be an ABI change, so we should try to
>> > plan ahead to avoid it.  All the existing atomics have required
>> > alignments equal to their sizes, and whole-cacheline cmpxchg
>> > seems like a plausible future instruction and would also require
>> > alignment equal to the size, so that's what I've been suggesting.
>> >
>> > I suspect users won't use really large types with atomic<T>
>> > simply because every access requires a copy of the whole
>> > object. And when they switch to explicitly locked data, that'll
>> > avoid wasted space from the extra alignment on atomic types.
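
A minimal sketch of the "alignment equal to size" idea discussed above, assuming a 4-byte float. The names round_up_pow2 and PaddedFiveFloats are hypothetical and are not taken from the patch; the 32-byte figure is the one from the question, not a statement of what any compiler does today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round a size up to the next power of two. */
size_t round_up_pow2(size_t n) {
    size_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/* The 20-byte payload from the thread. */
struct FiveFloats { float values[5]; };

/* Assumed layout under the proposal: raising the alignment to 32
   also pads sizeof up to 32. */
struct PaddedFiveFloats { _Alignas(32) float values[5]; };

static_assert(sizeof(struct FiveFloats) == 20, "five 4-byte floats");
static_assert(sizeof(struct PaddedFiveFloats) == 32, "size padded to 32");
static_assert(_Alignof(struct PaddedFiveFloats) == 32, "alignment equals size");
```

The 12 bytes of padding per object are the cost being argued over; the benefit only appears if hardware later gains a 32-byte atomic instruction.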
>>
>> So you want to pessimize code using large atomics just in case
>> it's eventually deployed to a theoretical future processor which
>> has burnt hardware on a use case that you've just stipulated that
>> you can't imagine a use for.
>
> What is the problem with pessimizing code that doesn't currently
> have a use?
>
>> I think anyone who urgently needs to take advantage of 64-byte
>> atomics can opt into a lock-free implementation of them by
>> requesting a variant ABI.
>
> Variant ABIs are expensive.  The x32 psABI has obvious and broad
> performance benefits, and yet it is only now getting any traction,
> nearly 10 years after AMD64.  I have confidence that there will be
> no widely used variant ABI for large atomics.
>
>> The practical ABI constraints on atomics are actually fairly weak:
>> generally you will not see a large number of translation units
>> accessing the same atomic object, simply because well-written
>> threaded code demands to be fairly self-contained.
>
> History has shown that the ABI propagates into all kinds of corners.
> It only takes one ABI incompatibility in a thousand source files
> to doom the program to incompatibility.
>
>> > I wouldn't really mind having the compiler produce an error for
>> > types it can't make lock-free. Then users can't use atomics of
>> > a size that would need an external library.
>>
>> Okay, but that's obviously not a real option.  Feel free to
>> implement and pass -Werror=locked-atomics.
>
> Agreed.
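
The -Werror=locked-atomics flag above is hypothetical. For the built-in integer types, a rough per-file approximation is possible with the lock-free macros from <stdatomic.h>, as in this sketch (counter and bump are illustrative names only):

```c
#include <assert.h>
#include <stdatomic.h>

/* ATOMIC_INT_LOCK_FREE is 2 when atomic_int is always lock-free,
   1 when it is sometimes lock-free, and 0 when it never is. */
static_assert(ATOMIC_INT_LOCK_FREE == 2,
              "atomic_int must be lock-free on this target");

/* Static storage: zero-initialization yields a valid atomic state. */
atomic_int counter;

/* Atomically increment the counter and return the new value. */
int bump(void) {
    return atomic_fetch_add(&counter, 1) + 1;
}
```

This only covers types that have a corresponding macro; it does not reject an arbitrary _Atomic(T) that would need a lock.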
>
>> > If we do allow users to use larger atomics, then I don't think
>> > it's a good idea to guarantee the need for an ABI change when
>> > processors increase the maximum atomic size.
>>
>> Suddenly introducing a need for huge alignments into environments
>> that don't support them is much more worrying to me than a small
>> number of programmers needing to manually request an optimization
>> on weird code.
>
> Atomics often tend to be used in environments where performance
> is critical.  It may be weird code, and rare code, but it's the
> code that can make or break system performance.
>
> As for alignment support, the C12 (presumably) standard provides
> facilities for super-aligned allocation.  Systems will need to
> provide that support independent of atomics.
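
As a sketch of the super-aligned allocation facility mentioned above, assuming the aligned_alloc function that was standardized in C11. PaddedFiveFloats and alloc_padded are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical over-aligned type: 20 bytes of data, padded to 32. */
struct PaddedFiveFloats { _Alignas(32) float values[5]; };

/* C11 aligned_alloc requires the size to be a multiple of the alignment. */
static_assert(sizeof(struct PaddedFiveFloats) %
              _Alignof(struct PaddedFiveFloats) == 0,
              "size is a multiple of alignment");

/* Allocate one over-aligned object; the caller releases it with free(). */
struct PaddedFiveFloats *alloc_padded(void) {
    return aligned_alloc(_Alignof(struct PaddedFiveFloats),
                         sizeof(struct PaddedFiveFloats));
}
```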
>
>> > On Oct 12, 2011 John McCall <rjmccall at apple.com> wrote:
>> > > On Oct 12, 2011, at 1:08 PM, Andrew MacLeod wrote:
>> > > > If the padding is under control of the 'atomic' keyword
>> > > > for the type, then we have complete control over those
>> > > > padding bytes. Regardless of what junk might be in them,
>> > > > they are part of the atomic data structure and the only
>> > > > way to access them is through a full atomic access. It's
>> > > > like adding another user field to the structure and not
>> > > > setting it. It shouldn't cause spurious failures.
>> > >
>> > > Well, we at least have to make sure that our atomic operations
>> > > always zero-pad their operands.  For example, if we do an
>> > > atomic store into a 5-byte struct that we've padded to 8 bytes,
>> > > we have to make sure we store a zero pattern into the pad.
>> > > That's feasible, but it's complexity that we should at least
>> > > acknowledge before committing to it.
>> > >
>> > > I can also see this exposing lots of what are, admittedly,
>> > > source bugs, like only zeroing the first sizeof(T) bytes of
>> > > an _Atomic(T).
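
A minimal sketch of the zero-padding concern above, assuming the 5-byte struct is stored in an 8-byte atomic cell. All names here (Five, pack_zero_padded, store_five, cas_five) are hypothetical; this illustrates the technique and is not how Clang implements it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 5-byte payload kept in an 8-byte atomic cell. */
struct Five { unsigned char bytes[5]; };
static_assert(sizeof(struct Five) == 5, "payload is 5 bytes");

/* Widen the payload to 8 bytes, forcing the 3 pad bytes to zero. */
uint64_t pack_zero_padded(struct Five v) {
    uint64_t raw = 0;            /* pad bytes start as zero */
    memcpy(&raw, &v, sizeof v);  /* copy only the 5 value bytes */
    return raw;
}

/* Every store writes a zero pattern into the pad. */
void store_five(_Atomic uint64_t *cell, struct Five v) {
    atomic_store(cell, pack_zero_padded(v));
}

/* Full-width compare-exchange. Because every writer zeroes the pad,
   the comparison depends only on the 5 value bytes. */
bool cas_five(_Atomic uint64_t *cell, struct Five expected,
              struct Five desired) {
    uint64_t exp_raw = pack_zero_padded(expected);
    return atomic_compare_exchange_strong(cell, &exp_raw,
                                          pack_zero_padded(desired));
}
```

If any writer left junk in the pad, a later cas_five with matching value bytes could fail, which is the spurious failure being discussed.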
>> >
>> > You're thinking of code that tries to initialize an _Atomic(T)
>> > with memset(&at, 0, sizeof(T))? I think code's only supposed
>> > to initialize _Atomic types with ATOMIC_VAR_INIT(value) or
>> > atomic_init(&at, value), so the source bug should be in the
>> > memset()'s pointer argument in addition to its size.
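
For comparison with the memset() case above, a short sketch of initialization through atomic_init, assuming a C11 <stdatomic.h>. The function name init_and_read is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* An atomic type may be larger than its value type, which is why
   sizeof(T) is the wrong size to use on an _Atomic(T). */
static_assert(sizeof(_Atomic(int)) >= sizeof(int),
              "atomic is at least as large as its value type");

/* Initialize via the atomic interface instead of writing raw bytes. */
int init_and_read(void) {
    _Atomic(int) at;       /* automatic storage: indeterminate until initialized */
    atomic_init(&at, 42);  /* the implementation sets up the representation */
    return atomic_load(&at);
}
```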
>>
>> Interesting.  Nonetheless.
>
> --
> Lawrence Crowl
>

-- 
Lawrence Crowl