[cfe-commits] [PATCH] atomic operation builtins, part 1

Lawrence Crowl crowl at google.com
Mon Feb 13 17:36:59 PST 2012

On 2/11/12, Jeffrey Yasskin <jyasskin at googlers.com> wrote:
> On Wed, Oct 12, 2011 at 11:55 AM, Jeffrey Yasskin <jyasskin at google.com> wrote:
>> [+ Lawrence who's been driving the ABI-compatibility design. Context at
>> http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20111010/047614.html]
>> On Wed, Oct 12, 2011 at 10:57 AM, John McCall <rjmccall at apple.com> wrote:
>>> On Oct 12, 2011, at 9:03 AM, Jeffrey Yasskin wrote:
>>>> On Wed, Oct 12, 2011 at 6:31 AM, Andrew MacLeod <amacleod at redhat.com> wrote:
>>>>> - language atomic types up to 16 bytes should be padded to an
>>>>> appropriate size, and aligned properly.
>>>>> - if memory matching one of the 5 'optimized' sizes isn't aligned
>>>>> properly, results are undefined.
>>>>> - if the size does not match one of the 5 specific routines, then the
>>>>> library generic ABI can handle it.  There's no alignment guarantees,
>>>>> so I presume it would end up being a locked implementation using
>>>>> hash tables and addresses or something.
>>>> The ABI library needs to demand alignment guarantees, or have them
>>>> passed in, or it won't be able to support larger lock-free sizes on
>>>> new architectures.
>>> How aggressive are you suggesting we be about this?  If I make this type
>>> atomic:
>>>  struct { float values[5]; };
>>> do we really increase its size and alignment up to 32 bytes in the wild
>>> hope that the architecture will add 32-byte atomics someday?  If so,
>>> what's the limit?  If not, why is 16 the limit?
>> The goal was that architectures could add new atomic instructions
>> without forcing an ABI change. Changing the size of atomic<FiveFloats>
>> would be an ABI change, so we should try to plan ahead to avoid it.
>> All the existing atomics have required alignments equal to their
>> sizes, and whole-cacheline cmpxchg seems like a plausible future
>> instruction and would also require alignment equal to the size, so
>> that's what I've been suggesting.
> I think the recent announcement at
> http://software.intel.com/en-us/blogs/2012/02/07/transactional-synchronization-in-haswell/,
> that Intel plans to implement hardware transactions by making locked
> regions cheaper, undermines my and Lawrence's position here. If these
> new instructions work like they appear to, it'll be possible to
> implement types with arbitrary sizes and alignments as cheaply as the
> current lock-free operations, and it seems unlikely to me that Intel
> would add larger lock-free operations once they have these
> transactional instructions.

My guess is that they are exploiting cache line ownership.  I expect
there is a limit on the number of lines that can be held at once, but
not one small enough to affect 'reasonable' atomic types.

An object that crosses a cache line boundary will require holding both
lines.  If there is any false sharing on those lines, the performance
could suffer badly.  One advantage of super-aligning is that the
probability of false sharing goes down.
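
To make the straddling point concrete, here is a minimal sketch, not
anything from the patch.  It assumes a 64-byte cache line and a 4-byte
float; kAssumedLineSize, next_pow2, lines_touched, PaddedFiveFloats and
fits_in_one_line are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size, for illustration only; the real value is
// platform-dependent.
constexpr std::size_t kAssumedLineSize = 64;

// Hypothetical helper: smallest power of two >= n (C++11-style constexpr).
constexpr std::size_t next_pow2(std::size_t n, std::size_t p = 1) {
    return p >= n ? p : next_pow2(n, p * 2);
}

// The struct from the quoted question: 20 bytes if float is 4 bytes.
struct FiveFloats { float values[5]; };

// Super-aligned variant: alignment is raised to the next power of two at
// or above the size, and the compiler pads sizeof up to the alignment.
struct alignas(next_pow2(sizeof(FiveFloats))) PaddedFiveFloats {
    float values[5];
};

// Number of cache lines touched by `size` bytes starting at `addr`.
constexpr std::size_t lines_touched(std::uintptr_t addr, std::size_t size,
                                    std::size_t line = kAssumedLineSize) {
    return (addr + size - 1) / line - addr / line + 1;
}

static_assert(sizeof(PaddedFiveFloats) == alignof(PaddedFiveFloats),
              "size equals alignment after padding");
// An unpadded object placed at offset 60 straddles two assumed lines...
static_assert(lines_touched(60, sizeof(FiveFloats)) == 2, "straddles");
// ...while a super-aligned object at any multiple of its size does not.
static_assert(lines_touched(32, sizeof(PaddedFiveFloats)) == 1, "one line");

// Runtime version of the same check for a real object.
inline bool fits_in_one_line(const PaddedFiveFloats& obj) {
    return lines_touched(reinterpret_cast<std::uintptr_t>(&obj),
                         sizeof(obj)) == 1;
}
```

The single-line guarantee only holds when the assumed line size is a
multiple of the padded size; it says nothing about what else shares
that line.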

I suppose we could pass that problem back to the users, who in
general must deal with it anyway.  However, there is presently
no C++ standard mechanism to respect cache line size and alignment.
Forcing users to write a bunch of platform-dependent code to address
the performance doesn't seem like a good thing to do.  Standardizing
cache line size queries seems like a good way to unproductively spend
lots of committee time.  Grumble.
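
As a rough illustration of the platform-dependent code users would end
up writing, here is a sketch under stated assumptions.  The line sizes
below are guesses, not queried values, and kCacheLineGuess,
PaddedCounter and counters are hypothetical names.

```cpp
#include <atomic>
#include <cstddef>

// Hard-coded guesses at the cache line size; nothing here queries the
// hardware, which is exactly the portability problem.
#if defined(__x86_64__) || defined(__i386__)
constexpr std::size_t kCacheLineGuess = 64;
#else
constexpr std::size_t kCacheLineGuess = 128;  // conservative guess
#endif

// Each counter gets a line-sized, line-aligned slot of its own, so two
// counters never share a line (if the guess is right).
struct alignas(kCacheLineGuess) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(sizeof(PaddedCounter) == kCacheLineGuess,
              "padded out to one assumed line");
static_assert(alignof(PaddedCounter) == kCacheLineGuess,
              "aligned to one assumed line");

// Adjacent array elements are one assumed line apart.
PaddedCounter counters[2];
```

If the guess is too small, neighbouring counters can still share a
line; if it is too large, the padding is just wasted space.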

Lawrence Crowl
