[LLVMdev] Atomic Operation and Synchronization Proposal v2

Chandler Carruth chandlerc at gmail.com
Thu Jul 12 10:59:12 PDT 2007


On 7/12/07, David Greene <dag at cray.com> wrote:
> On Thursday 12 July 2007 07:23, Torvald Riegel wrote:
>
> > > The single instruction constraints can, at their most flexible, constrain
> > > any set of possible pairings of loads from memory and stores to memory
> >
> > I'm not sure about this, but can we get issues due to "special" kinds of
> > data transfers (such as vector stuff, DMA, ...?). Memcpy implementations
> > could be a one thing to look at.
> > This kind of breaks down to how universal you want the memory model to be.
>
> Right.  For example, the Cray X1 has a much richer set of memory ordering
> instructions than anything on the commodity micros:
>
> http://tinyurl.com/3agjjn

Thanks for this link! Very interesting to see an architecture which
pays much more attention to its memory ordering.

> The memory ordering intrinsics in the current llvm proposal can't take
> advantage of them because they are too coarse-grained.

From what I can glean, this coarseness comes in two flavors -- global
vs. local memory access, and type-based granularities. Is this a
correct interpretation? (I'm clearly not going to be an expert on the
X1. ;])

>
> Now, I don't expect we'll see an llvm-based X1 code generator, but looking at
> what the HPC vendors are doing in this area will go a long way toward
> informing the kind of operations we may want to include in llvm.  The trend is
> for vendors to include ever more finely targeted semantics to allow scaling to
> machines with millions of cores.

Absolutely! Like I said, it's great to see this kind of information. A
few points about the current proposal:

1) It currently deals only with integers, in order to keep it simple to
implement and representable across all architectures. While this is
limiting, I think it remains a good starting point, and it shouldn't
cause any problems for later expansion to more type-aware
interpretations. (There's a rough sketch of the integer intrinsics
below this list.)

2) The largest assumption made is that all memory is just "memory".
Beyond that, the most fine-grained interpretation of barriers available
was chosen (note that only SPARC can express all the various
combinations... most architectures only have one big fence...). The
only major thing I can see that would increase this granularity is to
treat different types differently, or to treat them as going into
different parts of "memory". I'm really not sure here, but it is
definitely something to look into. However, I think this may require a
much later proposal, once hardware is actively being used at this level
and we can try to find a more fine-grained way of targeting all the
available architectures. For the time being, the current proposal seems
to hit all the architectures very neatly.
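
To make (1) concrete, here is a minimal sketch of what the integer-only
operations look like. The intrinsic name and signature below follow my
reading of the v2 proposal (lcs = load-compare-store) and should be
treated as illustrative rather than final:

    ; Illustrative only -- the exact intrinsic names/signatures may
    ; change before anything lands in the tree.
    declare i32 @llvm.atomic.lcs.i32(i32* %ptr, i32 %cmp, i32 %val)

    define i32 @try_lock(i32* %lock) {
    entry:
      ; Atomically: if *%lock == 0, store 1; in either case return the
      ; old value, so a return of 0 means the lock was acquired.
      %old = call i32 @llvm.atomic.lcs.i32(i32* %lock, i32 0, i32 1)
      ret i32 %old
    }

Only integer types are covered, which is exactly the limitation
described in (1) above.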


> If we can incrementally refine the size of the memory ordering hammers, I'm
> ok with that.  If it's simply a matter of adding finer-grained intrinsics
> later, that's cool.  But I don't want to get us into a situation where llvm
> requires stricter memory ordering than is strictly necessary and we can't get
> out from under the stone.

With the current version you can specify exactly which orderings you
want enforced; the only thing ignored is the type of the various loads
and stores. I think adding that level of granularity on top of the
existing, already quite granular pairing selection would be a smooth
incremental update. Is there another update you foresee that would be
less smooth?
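
For reference, the pairing selection I'm describing is just the four
boolean flags on the barrier intrinsic. Again, the declaration and flag
order (load-load, load-store, store-load, store-store) reflect my
reading of the proposal and are illustrative:

    declare void @llvm.memory.barrier(i1 %ll, i1 %ls, i1 %sl, i1 %ss)

    define void @examples() {
    entry:
      ; Constrain only store->store ordering; every other pairing is
      ; free to reorder around this point.
      call void @llvm.memory.barrier(i1 false, i1 false, i1 false, i1 true)

      ; "Acquire"-flavored: later loads and stores may not move above
      ; earlier loads (load-load + load-store constrained).
      call void @llvm.memory.barrier(i1 true, i1 true, i1 false, i1 false)

      ; "Release"-flavored: earlier loads and stores may not move below
      ; later stores (load-store + store-store constrained).
      call void @llvm.memory.barrier(i1 false, i1 true, i1 false, i1 true)
      ret void
    }

A strongly ordered target like x86 would fold most of these into a
no-op or a single fence, while something like SPARC's membar can honor
each combination individually -- which is the granularity point from
(2).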

Again, thanks for the information on the X1's memory architecture,
very interesting... I'm going to try and get into it a bit more in a
response to Dan Gohman's email below... =]

-Chandler

>
>                                                       -Dave


