[cfe-commits] [PATCH] atomic operation builtins, part 1
crowl at google.com
Fri Oct 14 15:03:56 PDT 2011
Resending due to list auto-discard.
On 10/13/11, Lawrence Crowl <crowl at google.com> wrote:
> On 10/12/11, John McCall <rjmccall at apple.com> wrote:
>> On Oct 12, 2011, at 7:37 PM, Lawrence Crowl wrote:
>> > Applications drive architecture extension, not the reverse.
>> > What applications will want to use larger atomics now that there
>> > are languages for specifying them? We have obvious need for atomic
>> > types that are twice the size of a pointer. We do not presently
>> > have a compelling case to implement even larger types efficiently.
>> > We may have that evidence in the future. However, given ABI
>> > compatibility constraints, the choice we make very soon is permanent.
>> Yes, I understand. I think you and Jeffrey Yasskin are approaching
>> this from the perspective that nobody will ever actually use these
>> types when they're not lockless, and I am assuming that of course
>> some people will, because they're attracted by the word "atomic",
>> and they'll do so with full knowledge that a cheap lock might be
>> involved, and those people will be very surprised to find out that
>> the implementation is substantially less efficient than possible
>> simply because we wanted to future-proof against a grossly
>> improbable outcome.
> We are weighing the probability of various events differently.
>> > > We're not talking about a radically different ABI. We're talking
>> > > about basically -msse.
>> > No, we are not. The -msse flag does not affect the layout of doubles
>> > in memory. How successful would -msse have been if it required
>> > changing the size of a double and recompiling every library that
>> > went into the program? Every device driver? Not very.
>> You're right; my example was poorly chosen. Of course -msse is
>> purely local to a translation unit.
>> Across GCC and Clang, there are dozens of examples of
>> command-line options which do impact the ABI, all of them much
>> more invasively than this. If I were designing this properly, though,
>> I would suggest some sort of attribute((lockless)) to opt in a specific
>> object to a variant lockless implementation.
> True, but such facilities tend to get used only infrequently,
> so the performance of the defaults matters (both ways).
>> > > Bloating structures to meet 64-byte alignment requirements also
>> > > has a real impact on system performance.
>> > True enough. However, I expect the performance tradeoff to weigh
>> > in favor of reducing synchronization costs over saving space for
>> > atomics.
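To put a rough number on the space side of this tradeoff, here is a minimal C++ sketch comparing a naturally aligned atomic counter with one padded to a 64-byte boundary. The names PackedCounter, PaddedCounter, kAssumedLineSize and array_bytes are hypothetical and do not come from the patch under discussion. The 64-byte figure is simply the alignment quoted above, not a measured cache-line size.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 64 bytes, taken from the alignment figure in the thread.
// Real cache-line sizes vary by target.
constexpr std::size_t kAssumedLineSize = 64;

// Hypothetical example type: an atomic counter with its natural alignment.
struct PackedCounter {
    std::atomic<int> value{0};
};

// Hypothetical example type: the same counter forced onto its own
// 64-byte-aligned slot, so neighbouring counters never share a line.
struct alignas(kAssumedLineSize) PaddedCounter {
    std::atomic<int> value{0};
};

// Bytes occupied by an array of n objects of type T.
template <typename T>
constexpr std::size_t array_bytes(std::size_t n) {
    return n * sizeof(T);
}

// sizeof is always a multiple of the alignment, so the padded type
// occupies a full 64 bytes even though it holds a single int.
static_assert(alignof(PaddedCounter) == kAssumedLineSize, "alignment applied");
static_assert(sizeof(PaddedCounter) == kAssumedLineSize, "padded to one line");
static_assert(sizeof(PackedCounter) < sizeof(PaddedCounter), "padding costs space");
```

Under these assumptions, array_bytes<PaddedCounter>(1000) is 64000 bytes, against 4000 bytes for the packed form on targets where std::atomic<int> is 4 bytes. That is the space cost being weighed against reduced contention.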
>> I'm not sure I would agree even if I saw any likelihood whatsoever
>> of these performance gains actually being realized. We live in an
>> era where memory is precious and CPU cycles are cheap, and
>> that doesn't seem to be changing.
> Actually, the memory itself is cheap. What is expensive is
> the memory cycles. Atomics consume those just as regular
> memory accesses do. The problem we have here is that the number of
> memory cycles spent on contention can be very high. Unfortunately,
> that cost tends not to show up in any static analysis of the code.
> My working assumption is that programmers will move from locks to
> atomics when they have a contention/performance problem. I would
> like to have some performance there for them to get!
> Lawrence Crowl
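For readers who want to see what "atomic types that are twice the size of a pointer" looks like in practice, the following is a minimal sketch, assuming C++17 for std::atomic<T>::is_always_lock_free. TaggedPointer, kPointerIsLockFree and kTaggedIsLockFree are hypothetical names chosen for illustration. Whether the double-width case comes out lock-free depends on the target and compiler flags, so the sketch asserts nothing about that value.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical example type: a pointer plus a version tag, one common
// shape for an atomic that is twice the size of a pointer.
struct TaggedPointer {
    void*          ptr;
    std::uintptr_t tag;
};

// Holds on mainstream targets, where uintptr_t is pointer-sized.
static_assert(sizeof(TaggedPointer) == 2 * sizeof(void*),
              "expected exactly two pointer-sized words");

// Compile-time query: does this target guarantee a lock-free
// implementation for every object of the given atomic type?
constexpr bool kPointerIsLockFree =
    std::atomic<void*>::is_always_lock_free;

// Target- and flag-dependent. When false, the implementation may fall
// back to a lock for operations on this type.
constexpr bool kTaggedIsLockFree =
    std::atomic<TaggedPointer>::is_always_lock_free;
```

A program can branch on kTaggedIsLockFree (for example with if constexpr) to pick a lock-free algorithm only where the target supports it. This is the situation the ABI discussion above is about: the size and alignment chosen for such a type determine whether a lock-free implementation remains possible later.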