[llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Thu Aug 17 10:56:28 PDT 2017
I'll comment that Daniel and I have discussed this pretty extensively
offline. I am now convinced that option 1 is our best choice out of a
set of not entirely great choices. If no one has any comments, that's
the approach we'll end up pursuing, but we wanted to give the community
a chance to chime in here. Maybe someone can come up with a better API
design than we've managed. :)
Philip
On 08/17/2017 08:32 AM, Daniel Neilson via llvm-dev wrote:
>
> Hi all,
> We somewhat recently created/updated element-wise unordered-atomic
> versions of the memcpy, memmove, and memset memory intrinsics:
> Memcpy: https://reviews.llvm.org/rL305558
> Memmove: https://reviews.llvm.org/rL307796
> Memset: https://reviews.llvm.org/rL307854
>
> These intrinsics are semantically similar to the regular versions.
> The main difference is that the underlying operation is performed
> entirely with unordered atomic loads & stores of a given size.
>
> Our ultimate purpose is to enable exploitation of the memory
> intrinsic idioms and optimizations in languages like Java that make
> extensive use of unordered-atomic loads & stores. To this end, we will
> eventually have to add these new intrinsics to the instruction class
> hierarchy, and teach passes how to deal with them; talking about how
> to do this is the purpose of this RFC.
>
> We have started adding canary tests to some passes, and will continue
> to do so in preparation for adding the element atomic intrinsics to
> the instruction class hierarchy. These canary tests are simply
> placeholders that show that the pass in question currently does
> nothing with the new element atomic memory instrinics, and
> could/should generally start failing once the unordered-atomic memory
> intrinsics are added to the instruction class hierarchy — telling us
> which passes need to be updated. For example:
> https://reviews.llvm.org/rL308247
>
> For adding the new intrinsics to the instruction class hierarchy, the
> plan will be to add them one at a time — add the intrinsic to the
> hierarchy, and flush out all problems before adding the next. We’re
> thinking that it would be best to start with memset, then memcpy, then
> memmove; memset is kind of it’s own thing, and there are some passes
> that change memcpy/memmove into a memset so it would be beneficial to
> have that in place before memcpy/memmove, and there are also passes
> that change memmove into memcpy but I haven’t found any that go the
> other way (I can’t imagine why you’d want to) so it would be good to
> have memcpy squared away before doing memmove.
>
> There are a few options that we have thought of for adding the new
> memory intrinsics to the instruction class hierarchy. We have a
> preference for the first option, but would appreciate
> thoughts/opinions/discussion on the best way to insert the element
> atomic intrinsics into the hierarchy. For background, the relevant
> portion of the current hierarchy looks like:
>
> MemIntrinsic
> * MemSetInst
> * MemTransferInst
> ** MemCpyInst
> ** MemMoveInst
>
> Option 1)
> -------------
> Do not add any new classes to the hierarchy. Instead, teach each
> class to recognize the unordered-atomic variant as belonging to the
> class (ex: element unordered-atomic memcpy is recognized as a
> MemCpyInst, MemTransferInst, and a MemIntrinsic), and add a query
> method like ‘isUnorderedAtomic’ that will tell you whether the
> underlying intrinsic is one of the element atomic variants.
> Alternatively, the method could be a more generic ‘getOrdering()’ that
> would return an AtomicOrdering; though a memory intrinsic that is
> required to be implemented with ordered-atomic instructions strikes me
> as unlikely to ever be desired.
>
> There is precedent for encoding the intrinsic property as a query
> method both within the current memory intrinsic hierarchy itself (the
> isVolatile() method) and the load/store instruction classes.
>
> The way forward with this option would be to:
> (i) add the method to MemInstrinsic (returning false); then
> (ii) update all passes, one at a time, to bail-out if
> isUnorderedAtomic() is true; then
> (iii) add an intrinsic to the class hierarchy; then
> (iv) update passes to exploit isUnorderedAtomic() one at a time
> (v) repeat (iii) & (iv) for each unordered-atomic intrinsic
>
> Pros:
> Significant reduction in copy/paste code over other options; passes
> that use visitors would require some refactoring or some copy/paste to
> mirror behaviour if the unordered-atomic intrinsics are separate classes.
> Lends itself to a clear & simple incremental path-forward; each pass
> can be updated independently of the others which will make reviews
> more achievable due to not requiring people with working knowledge
> much beyond the single pass.
> Easy to leverage new work on memory intrinsic optimization for the
> unordered-atomic variants; even near-zero cost for passes that won’t
> be affected by unordered-atomicity.
> Cons:
> One more attribute to check when implementing a pass that handles a
> memory intrinsic; could lead to some unexpected bugs where we change
> an unordered-atomic intrinsic into a regular intrinsic.
>
> The con here is pretty concerning. We risk making it easy for a
> developer to forget about the element-atomic variants when introducing
> an optimization that converts a memmove/memcpy/memset. Such an
> optimization could, conceivably, result in an unordered-atomic
> intrinsic being erroneously converted to drop its atomicity; such a
> bug would only exhibit as a race condition in the resulting program,
> and could be an absolute beast to debug. The only way to prevent such
> bugs would be very careful code reviews.
>
> Option 2)
> -------------
> Add separate classes to the hierarchy for each intrinsic. There are a
> few different ways that this could be organized (note: UA* =
> unordered-atomic version of intrinsic, to save typing):
>
> a) Completely disjoint hierarchy
> UAMemIntrinsic
> * UAMemSetInst
> * UAMemTransferInst
> ** UAMemCpyInst
> ** UAMemMoveInst
> MemIntrinsic
> * MemSetInst
> * MemTransferInst
> ** MemCpyInst
> ** MemMoveInst
>
> This organization has the benefit of not interacting/interfering with
> the regular memory intrinsics in any way. That’s also a big downside;
> most/many passes are going to be able to function on unordered-atomic
> memory intrinsics the same as they do for the regular memory
> intrinsics. In some cases this will just mean adding another few isa<>
> checks at strategic positions, but in many cases this will mean code
> clones and/or clever refactoring (ex: for passes that incorporate
> visitors).
> Being completely disjoint also makes this option super easy to add in
> incrementally; no pass will know what to do with an unordered-atomic
> intrinsic until explicitly taught. Of course that’s also a downside as
> it means that optimizing the unordered-atomic memory intrinsics always
> has to be explicitly added — nothing comes for free from the regular
> memory intrinsic work.
>
> b) Disjoint sub-hierarchy of MemIntrinsic
> MemIntrinsic
> * MemSetInst
> * UAMemSetInst
> * MemTransferInst
> ** MemCpyInst
> ** MemMoveInst
> * UAMemTransferInst
> ** UAMemCpyInst
> ** UAMemMoveInst
>
> Similar to the (2a) option in a lot of ways, but with some added
> complications for even the regular memory intrinsics. The hierarchy
> immediately below MemIntrinsic becomes much wider, so it is no longer
> sufficient to check isa<MemTransferInst> to narrow down memcpy/memmove
> vs memset. Also, similar to (2a), there’s the potential for a lot of
> code clones in passes that incorporate visitors.
>
> The way forward with this option would be similar to 2a with the
> exception that we’d have to go through the whole codebase to make sure
> that "if (! isa<MemTransferInst>) then Inst is memset” and equivalent
> checks resolve correctly.
>
> c) Child classes of the regular intrinsics
> MemIntrinsic
> * MemSetInst
> ** UAMemSetInst
> * MemTransferInst
> ** MemCpyInst
> *** UAMemCpyInst
> ** MemMoveInst
> *** UAMemMoveInst
>
> There’s not a tonne of difference between this option and option (1),
> above. All of the existing visitors would recognize unordered-atomic
> and regular memory intrinsics, so there wouldn’t need to be any code
> clones to support the new intrinsics.
> However, passes that care about the atomicity of the intrinsic would
> have to add isa<> checks to ensure that they’re not acting on the
> unordered-atomic memory intrinsic when they don’t want to be. It would
> also be trickier than option (1) to check, at the level of
> MemIntrinsic/MemTransferInst, whether the intrinsic we have is
> unordered-atomic; we’d have to explicitly check all of the isa<>’s
> rather than querying the isUnorderedAtomic() property. We could add
> the isUnorderedAtomic() query method to this option, but then it’s not
> much different from option (1).
>
> The way forward for this option would be very similar to that for
> option 1.
>
> d) Direct parent classes of the regular intrinsics
> MemIntrinsic
> * UAMemSetInst
> ** MemSetInst
> * MemTransferInst
> ** UAMemCpyInst
> *** MemCpyInst
> ** UAMemMoveInst
> *** MemMoveInst
>
> This is an idea that came up in our discussions. A benefit is that
> it leaves the existing regular memory intrinsics as leaf classes in
> the hierarchy; so, if we currently check for, say, isa<MemCpyInst>
> then we don’t have to change that code because it can’t be
> unordered-atomic and have that test be true. The benefit is that it
> would make it harder for a pass to accidentally convert an
> unordered-atomic intrinsic into a non-atomic intrinsic. It does
> complicate code that checks whether an intrinsic is a memset by
> checking for not isa<MemTransferInst>, and other code like that.
>
> e) Multiple inheritance
>
> This is a variant on options 2b to 2d. To make it possible to easily
> use isa<> to query for whether or not a memory intrinsic is
> unordered-atomic or not we could add an interface-like class that all
> unordered-atomic memory intrinsic classes would multiply inherit from.
> I’m not sure if there is precedence for this sort of thing in the
> instruction class hierarchy, though; I haven’t reviewed the entire
> hierarchy, but all of the parts that I have looked at appear to be
> single inheritance.
>
> ===
>
> *Note about the align parameter attribute.*
>
> When creating the unordered-atomic memory intrinsics we decided to
> use the align attribute on pointer parameters, rather than supplying a
> single alignment parameter. This adheres to the way that LLVM wants to
> move going forward, but is different from the way that the memcpy,
> memmove, and memset intrinsics are currently implemented. This could,
> potentially, be a complication when integrating the unordered-atomic
> memory intrinsics into the class hierarchy. Ideally, it would be good
> to resurrect the 2.5 yr old work to remove the alignment parameter
> from memcpy, memmove, and memset:
> commit 8b170f7f290843dc3849eaa75b6f74a87a7a2de6
> Author: Pete Cooper <peter_cooper at apple.com
> <mailto:peter_cooper at apple.com>>
> Date: Wed Nov 18 22:17:24 2015 +0000
>
> That was reverted due to bot failures, but the history on exactly what
> those failures are appears, to me, to be lost:
>
> commit 6d024c616a2a4959f8dfe5c64d27f89b394cf042
> Author: Pete Cooper <peter_cooper at apple.com
> <mailto:peter_cooper at apple.com>>
> Date: Thu Nov 19 05:56:52 2015 +0000
>
> Baring that, there is still a way forward for integrating the
> unordered-atomic memory intrinsics into the class hierarchy. The
> getAlignment() method of MemIntrinsic would have to return the minimum
> of the pointer-arg alignment attributes for unordered-atomic
> intrinsics, and the setAlignment() method would have to set both
> parameter attributes to the given value.
>
> ===
>
> Thoughts? Opinions?
>
> Thanks,
> Daniel
>
> ---
> Daniel Neilson, Ph.D.
> Azul Systems
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170817/304e6a02/attachment.html>
More information about the llvm-dev
mailing list