[llvm-dev] RFC: Element-atomic memory intrinsics
Daniel Neilson via llvm-dev
llvm-dev at lists.llvm.org
Mon May 8 12:08:25 PDT 2017
Hi Sanjoy,
Responses inlined…
> On May 8, 2017, at 12:49 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> Hi Daniel,
>
> [+CC Mehdi, Vedant for the auto upgrade issue]
>
> On Mon, May 8, 2017 at 7:54 AM, Daniel Neilson via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> **Method**
>>
>> Clearly we are going to have to teach LLVM about unordered memory
>> intrinsics. There are, as I can see it, a couple of ways to accomplish this.
>> I’d like your opinions on which are preferable, or if you can think of any
>> other options. In no particular order…
>>
>> Option 1)
>> Introduce a new unordered/element-atomic version of each of the memory
>> intrinsics.
>> Ex: @llvm.memcpy_element_atomic — work was already started to introduce
>> this one in D27133, but could be backed out and restarted.
>> Intrinsic prototype: @llvm.memcpy_element_atomic.<overload desc>(<ty>*
>> dest, <ty>* src, <ty> len, i32 align, i2 isunordered, i16 element_size)
>> Semantics:
>> * Will do a memcpy of len bytes from src to dest.
>> * len must = k * lcm( #bytes in dest type, #bytes in src type), for
>> some non-negative integer k [note: lcm = least-common multiple]
>> * load/store size given by the constant power-of-2 parameter
>> “element_size”; expected to be the lcm(sizeof(dest_ty), sizeof(src_ty))
>
> I'm not sure if sizeof(dest_ty) and sizeof(src_ty) adds anything here.
>
> LLVM is moving towards "typeless pointers" (i.e. pointers will not
> have pointee types, instead they will just be a "generic pointer" in
> some address space), so working in the types of dest and src into the
> specification seems awkward.
>
Poor choice of wording on my part. By sizeof(<thing>) I meant the element size of the loads/stores that appeared in the original loop from which the memory intrinsic was materialized. Arguably, the source and destination element sizes will be the same in any real case, so it probably doesn’t make sense to mention them separately.
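To make the length constraint concrete, here is a rough C++ model of the intended semantics. It is only a sketch, not LLVM code: the helper names are made up for illustration, and the per-element std::memcpy merely stands in for one unordered-atomic load plus one unordered-atomic store of element_size bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: checks the constraints described in the RFC, i.e.
// element_size is a power of two and len is a whole number of elements.
bool isValidElementAtomicLength(std::uint64_t len, std::uint16_t element_size) {
  bool isPowerOfTwo =
      element_size != 0 && (element_size & (element_size - 1)) == 0;
  return isPowerOfTwo && len % element_size == 0;
}

// Hypothetical reference model: copies len bytes one element at a time, so no
// element is ever split across two copies. In a real lowering each iteration
// would be a single unordered-atomic load and store; memcpy is a stand-in.
void elementwiseCopyModel(unsigned char *dest, const unsigned char *src,
                          std::uint64_t len, std::uint16_t element_size) {
  assert(isValidElementAtomicLength(len, element_size));
  for (std::uint64_t off = 0; off < len; off += element_size)
    std::memcpy(dest + off, src + off, element_size);
}
```

With element_size = 4, a len of 16 is accepted (k = 4) and a len of 10 is rejected, since it would leave a partial element.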
> Also, does the non-overlap restriction of src and dest (as in the
> regular llvm.memcpy) apply here as well?
>
Yes, I would think so.
>> * isunordered param: bit 0 == 1 => stores to dest must be marked
>> ‘unordered’; bit 1 == 1 => loads from src must be marked ‘unordered’
>
> What if the bits are zero -- will the stores / loads (depending on
> which bit) be "ordered" in that case, or something stronger?
>
This is partly why I prefer option 2. An ‘isunordered’ value of 0 is nonsense for the standalone atomic-unordered memory intrinsic. It would imply that neither the source nor dest needs to be loaded/stored via unordered-atomic ops, and so the memory intrinsic is identical to the ordinary non-atomic one.
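For illustration, the proposed i2 encoding could be decoded roughly as follows. This is a sketch only: the struct and function names are hypothetical, and the i2 value is modelled as a small unsigned integer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded form of the proposed 'isunordered' i2 parameter.
struct UnorderedFlags {
  bool storesUnordered; // bit 0: stores to dest must be unordered-atomic
  bool loadsUnordered;  // bit 1: loads from src must be unordered-atomic
};

// Decodes the two bits as described in the RFC text.
UnorderedFlags decodeIsUnordered(std::uint8_t isunordered) {
  assert(isunordered < 4 && "an i2 value only has two bits");
  return {(isunordered & 1) != 0, (isunordered & 2) != 0};
}

// A value of 0 requests no atomicity on either side, so the call would be
// indistinguishable from the ordinary non-atomic intrinsic.
bool behavesLikePlainMemcpy(std::uint8_t isunordered) {
  UnorderedFlags f = decodeIsUnordered(isunordered);
  return !f.storesUnordered && !f.loadsUnordered;
}
```

Under option 2 the zero value is the natural "ordinary memcpy" case, whereas under option 1 it would have to be rejected or treated as redundant.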
>> Option 2)
>> Expand the current/existing memory intrinsics to identify the unordered
>> constraint, if one exists, in much the same way that volatility is expressed
>> — i.e. add an ‘isunordered’ parameter(s) to the intrinsic.
>> This option has the same semantics as option 1; the only difference is,
>> literally, that we expand the existing memcpy/memset/memmove intrinsics to
>> have an ‘isunordered’ parameter and an ‘element_size’ parameter, so the
>> prototype becomes something like:
>> @llvm.memcpy.<overload desc>(<ty>* dest, <ty>* src, <ty> len, i32 align,
>> i1 isvolatile, i2 isunordered, i16 element_size)
>>
>> Pros:
>> * Minimal extra work to handle the new version in existing passes — only
>> need to change passes that create calls to memory intrinsics, expand memory
>> intrinsics, or that need to care about unordered (which none should that are
>> reasoning about memory intrinsic semantics).
>> * New code that’s introduced by others to exploit/handle memory
>> intrinsics should just handle unordered for free — unordered being a part of
>> the memory intrinsic means it’s more likely that the person will realize
>> that they have to think about it (i.e. it raises the profile of unordered
>> memory intrinsics).
>
> I like the second point, but (unfortunately) I suspect in practice
> you'll see new code do:
>
>   if (MCI->isUnordered())
>     return false; // be conservative
>
Yes, that would be an unfortunate reality, but one can hope. :-)
>> Cons:
>> * Breaks backward compatibility of the IR — is there a mechanism for
>> migrating old IR into a new IR version when loading the IR into LLVM?
>
> I think the migration here will be fairly straightforward -- you can
> just auto-upgrade old calls to memcpy to pass in 0 for the isunordered
> argument. But I've CC'd Mehdi and Vedant to help shed some light on
> this.
>
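As a rough picture of what such an upgrade would do under option 2, here is a toy C++ model. It is not the actual LLVM auto-upgrade code: operands are modelled as plain integers, the function name is hypothetical, and the placeholder element_size of 1 is my assumption, since the value is not meaningful when isunordered is 0.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model: an old-style memcpy call carries five operands
// (dest, src, len, align, isvolatile). The upgraded call appends the two
// parameters proposed in option 2.
std::vector<std::int64_t>
upgradeMemcpyOperands(const std::vector<std::int64_t> &oldOps) {
  assert(oldOps.size() == 5 && "expected dest, src, len, align, isvolatile");
  std::vector<std::int64_t> newOps = oldOps;
  newOps.push_back(0); // isunordered = 0: neither loads nor stores are atomic
  newOps.push_back(1); // element_size: assumed placeholder, unused when 0 above
  return newOps;
}
```

The existing operands pass through unchanged, so old IR keeps its current non-atomic meaning.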
> — Sanjoy
Thanks!
-Daniel
---
Daniel Neilson, Ph.D.
Azul Systems