[cfe-commits] [PATCH] atomic operation builtins, part 1
crowl at google.com
Fri Oct 14 15:02:54 PDT 2011
Resending due to list auto-discard.
On 10/12/11, Lawrence Crowl <crowl at google.com> wrote:
> On 10/12/11, John McCall <rjmccall at apple.com> wrote:
>> On Oct 12, 2011, at 6:50 PM, Lawrence Crowl wrote:
>> > On 10/12/11, John McCall <rjmccall at apple.com> wrote:
>> > > So you want to pessimize code using large atomics just in case
>> > > it's eventually deployed to a theoretical future processor
>> > > which has burnt hardware on a use case that you've just
>> > > stipulated that you can't imagine a use for.
>> > What is the problem with pessimizing code that doesn't currently
>> > have a use?
>> It's absurd to optimize based on the unsubstantiated idea that
>> future hardware is going to include instructions for something
>> that we don't have a use for. x86-32 doesn't even have 16-byte
>> atomics. In fact, what significant 32-bit architecture does,
>> or even plans to?
> Applications drive architecture extension, not the reverse.
> What applications will want to use larger atomics now that there
> are languages for specifying them? We have an obvious need for atomic
> types that are twice the size of a pointer. We do not presently
> have a compelling case to implement even larger types efficiently.
> We may have that evidence in the future. However, given ABI
> compatibility constraints, we must choose forever very soon.
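
For readers who want the double-width case made concrete, here is a minimal sketch of the usual pattern: a reference paired with a version tag, updated by one compare-exchange. All names are hypothetical. To keep the sketch linkable without an extra atomics runtime library, it pairs a 32-bit index with a 32-bit tag as a stand-in for a pointer; with a real pointer on a 64-bit target the pair would be 16 bytes.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical pair: a reference plus a version tag, updated as one unit.
struct TaggedIndex {
    std::uint32_t index;  // stand-in for a pointer: slot number in some node pool
    std::uint32_t tag;    // version counter, bumped on every successful update
};

// compare_exchange compares the whole object representation, so the pair
// must not contain padding bytes.
static_assert(sizeof(TaggedIndex) == 2 * sizeof(std::uint32_t),
              "pair must have no padding");

// Replace the index only if neither the index nor the tag has changed since
// `seen` was read. The tag is what detects an intervening update that put
// the same index back (the ABA case).
inline bool replace_index(std::atomic<TaggedIndex>& cell, TaggedIndex seen,
                          std::uint32_t new_index) {
    TaggedIndex desired{new_index, seen.tag + 1};
    return cell.compare_exchange_strong(seen, desired);
}
```

A caller loads the cell, computes a new index, and calls replace_index with the value it loaded; an attempt made with a stale snapshot fails because the tag no longer matches.
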
>> > > I think anyone who urgently needs to take advantage of 64-byte
>> > > atomics can opt into a lock-free implementation of them by
>> > > requesting a variant ABI.
>> > Variant ABIs are expensive. The x32 psABI has obvious and
>> > broad performance benefits, and yet it is only now getting any
>> > traction, nearly 10 years after AMD64. I have confidence that
>> > there will be no widely used variant ABI for large atomics.
>> We're not talking about a radically different ABI. We're talking
>> about basically -msse.
> No, we are not. The -msse flag does not affect the layout of doubles
> in memory. How successful would -msse have been if it required
> changing the size of a double and recompiling every library that
> went into the program? Every device driver? Not very.
> I can implement floating point operations in many different ways
> without affecting the ABI because only the memory layout affects
> the ABI of floats. The issue under discussion is the memory layout
> of atomics. Any decisions made in the near term are likely to
> be irrevocable.
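
A small sketch of why the layout choice is an ABI matter. The two payload types below are hypothetical stand-ins for a 64-byte atomic under two possible ABI decisions, one with byte alignment and one with 64-byte alignment. A record that embeds the payload gets a different member offset and a different size under each choice, so code compiled under one choice cannot share the record with code compiled under the other.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for a 64-byte atomic under two ABI choices.
struct PackedPayload { char bytes[64]; };               // alignment 1
struct alignas(64) AlignedPayload { char bytes[64]; };  // alignment 64

// The same logical record under each choice.
struct RecordPacked  { char flag; PackedPayload  value; };
struct RecordAligned { char flag; AlignedPayload value; };

// The payload lands at a different offset in each record...
static_assert(offsetof(RecordPacked, value) == 1, "packed payload follows flag");
static_assert(offsetof(RecordAligned, value) == 64, "aligned payload is padded out");

// ...and the records differ in total size.
static_assert(sizeof(RecordPacked) == 65, "no padding under alignment 1");
static_assert(sizeof(RecordAligned) == 128, "63 bytes of padding under alignment 64");
```

The same numbers also show the space cost of the aligned choice: 128 bytes instead of 65 for this record.
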
> Atomics are different from floating point in another crucial
> respect. The ABI compatibility depends not only on the layout,
> but on the synchronization protocol used to access them. If two
> atomic operations are implemented with different protocols,
> the synchronization will fail and the application will fail in
> mysterious ways. I have participated in just such a mistake and
> it was not pretty.
> A consequence of avoiding the mistake is that one needs to have
> the system provide a single implementation of all potential
> non-lock-free atomics. Effectively, that means a dynamic library
> provided to the applications by the system.
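
A minimal sketch of what such a single shared implementation might look like, assuming a lock table indexed by the object's address. Every name here (the sketch namespace, lock_for, locked_load, locked_store, locked_compare_exchange, kLockCount) is hypothetical and not taken from any existing runtime. The protocol is only sound if every accessor of an object goes through the same table; a module with its own private table or its own lock would take a different lock for the same object and lose mutual exclusion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>

namespace sketch {

constexpr std::size_t kLockCount = 64;  // arbitrary table size for the sketch

// Map an object's address to one lock in a table shared by all callers.
// Every access to a given object passes the same address and therefore
// takes the same lock.
inline std::mutex& lock_for(const void* addr) {
    static std::mutex table[kLockCount];
    auto bits = reinterpret_cast<std::uintptr_t>(addr);
    return table[(bits >> 6) % kLockCount];
}

// Copy `size` bytes out of the object while holding its lock.
inline void locked_load(const void* object, void* result, std::size_t size) {
    std::lock_guard<std::mutex> guard(lock_for(object));
    std::memcpy(result, object, size);
}

// Copy `size` bytes into the object while holding its lock.
inline void locked_store(void* object, const void* value, std::size_t size) {
    std::lock_guard<std::mutex> guard(lock_for(object));
    std::memcpy(object, value, size);
}

// Bytewise compare-exchange: on a match, write `desired` and return true;
// otherwise copy the current contents into `expected` and return false.
inline bool locked_compare_exchange(void* object, void* expected,
                                    const void* desired, std::size_t size) {
    std::lock_guard<std::mutex> guard(lock_for(object));
    if (std::memcmp(object, expected, size) == 0) {
        std::memcpy(object, desired, size);
        return true;
    }
    std::memcpy(expected, object, size);
    return false;
}

}  // namespace sketch
```

Because the functions take a size at run time, the same entry points serve any object width, which is what lets one system-provided library cover every non-lock-free atomic.
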
>> > > > If we do allow users to use larger atomics, then I don't
>> > > > think it's a good idea to guarantee the need for an ABI
>> > > > change when processors increase the maximum atomic size.
>> > >
>> > > Suddenly introducing a need for huge alignments into
>> > > environments that don't support them is much more worrying
>> > > to me than a small number of programmers needing to manually
>> > > request an optimization on weird code.
>> > Atomics often tend to be used in environments where performance
>> > is critical. It may be weird code, and rare code, but it's
>> > the code that can make or break system performance.
>> Great. They can use -matomic-cachelines or, better yet,
> They can do that only if they coordinate in advance with other
> users of those same structures.
>> Bloating structures to meet 64-byte alignment requirements also
>> has a real impact on system performance.
> True enough. However, I expect the performance tradeoff to weigh
> in favor of reducing synchronization costs over saving space for
> atomics. After all, one uses atomics because one has a run-time
> problem. If not, mutexes are a safer technology.
> Lawrence Crowl