[llvm-dev] Memory scope proposal
Sameer Sahasrabuddhe via llvm-dev
llvm-dev at lists.llvm.org
Fri Oct 7 01:40:29 PDT 2016
On Sat, Sep 3, 2016 at 8:43 AM, Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> The key bit here is that I can describe transformations in terms of these
> abstract domains without knowing anything about how the frontend might be
> using such a domain or how the backend might lower it. In particular, if I
> have the sequence:
>
> %v = load i64, %p atomic scope {domain3 only}
>
> fence seq_cst scope={domain1 only}
>
> %v2 = load i64, %p atomic scope {domain3 only}
>
> I can tell that the two loads aren't ordered with respect to the fence and
> that I can do load forwarding here.
>
>
> I see the current proposal as a stripped-down version of what you describe: the
> optimizer can reason about operations inside a single scope, but can’t
> assume anything cross-scope (they may or may not interact with each other).
>
> What you describe seems like always having non-overlapping domains (from
> the optimizer's point of view), and requiring the frontend to express the
> overlap by attaching a "list" of domains that an atomic operation
> interacts with.
>
There is another way to tackle this, and Chandler had hinted at it in an
old thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-January/080236.html
Quoting from Chandler's email:
"Essentially, I think target-independent optimizations are still
attractive, but we might want to just force them to go through an actual
target-implemented API to interpret the scopes rather than making the
interpretation work from first principles. I just worry that the targets
are going to be too different and we may fail to accurately predict future
targets' needs."
Note that in Philip's example above, the optimization is not really asking
whether the two loads are ordered. It is asking whether the second load can
be reordered to occur before the fence. Whatever the question, it can be
posed to the target as a simple predicate. For example,
"isOrdered(inst1, inst2)" or "canEliminate(store1, store2)". The latter
query applies when the optimizer wants to eliminate a store that is followed
by another store to the same location. The target can interpret the scopes
in whatever way it likes and return true/false.
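
To make the shape of such a query concrete, here is a rough sketch in C++.
Nothing in it is existing LLVM API: MemOp, TargetScopeInfo,
DisjointDomainsTarget and canForwardLoad are hypothetical names, and the
example target simply assumes that operations with different scope IDs never
interact and that only fences impose ordering. The sketch of canEliminate
also ignores any operations between the two stores.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for IR instructions; not LLVM classes.
enum class OpKind { Load, Store, Fence };

struct MemOp {
  OpKind Kind;
  std::uint32_t Scope; // opaque scope ID; only the target interprets it
  const void *Address; // nullptr for fences
};

// Hypothetical target hook. The optimizer calls these predicates and never
// inspects Scope itself.
struct TargetScopeInfo {
  virtual ~TargetScopeInfo() = default;
  // True if the target requires A and B to stay in program order.
  virtual bool isOrdered(const MemOp &A, const MemOp &B) const = 0;
  // True if First may be deleted because Second overwrites it.
  virtual bool canEliminate(const MemOp &First, const MemOp &Second) const = 0;
};

// Example target (an assumption of this sketch): distinct scope IDs never
// interact, and only fences order other operations.
struct DisjointDomainsTarget final : TargetScopeInfo {
  bool isOrdered(const MemOp &A, const MemOp &B) const override {
    bool InvolvesFence = A.Kind == OpKind::Fence || B.Kind == OpKind::Fence;
    return InvolvesFence && A.Scope == B.Scope;
  }
  bool canEliminate(const MemOp &First, const MemOp &Second) const override {
    return First.Kind == OpKind::Store && Second.Kind == OpKind::Store &&
           First.Address == Second.Address && First.Scope == Second.Scope;
  }
};

// Optimizer-side check: may Load2 reuse the value of Load1 even though a
// fence sits between them? Only the target's answer is consulted.
bool canForwardLoad(const TargetScopeInfo &TSI, const MemOp &Load1,
                    const MemOp &Fence, const MemOp &Load2) {
  return Load1.Address == Load2.Address && !TSI.isOrdered(Fence, Load2);
}

// Philip's sequence: load (scope 3), fence (scope 1), load (scope 3).
void checkPhilipExample() {
  int X = 0;
  DisjointDomainsTarget T;
  MemOp L1{OpKind::Load, 3, &X};
  MemOp F{OpKind::Fence, 1, nullptr};
  MemOp L2{OpKind::Load, 3, &X};
  assert(canForwardLoad(T, L1, F, L2));
}
```

A different target could return the opposite answer for the same three
operations without any change on the optimizer side.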
The advantage here is that now the optimizer does not need to know anything
at all about the scopes. For example, in memory models like OpenCL, the
scopes are nested, and it should be sufficient to specify just one bit in
the mask and it could "automatically include" lower bits. The optimizer
does not need to know that. In fact the implementation need not even be a
bitmask. It can just be a set of opaque "sigils" like in the original
design.
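
As an illustration of the nested case, a target could keep the "one bit
implies all lower bits" rule entirely on its own side. The sketch below is
only that: the scope names loosely follow OpenCL 2.x, but the bit values and
the functions expandNested and scopeIncludes are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Scopes as single bits, narrowest first. The bit assignments are an
// assumption of this sketch, not something taken from any specification.
constexpr std::uint32_t WorkGroup = 1u << 0;
constexpr std::uint32_t Device = 1u << 1;
constexpr std::uint32_t AllSVMDevices = 1u << 2;

// Target-side expansion: a single scope bit implies every lower (narrower)
// bit, so the frontend only ever has to set one bit.
constexpr std::uint32_t expandNested(std::uint32_t OneBit) {
  return OneBit | (OneBit - 1);
}

// Hypothetical target predicate: does scope Outer include scope Inner?
// The optimizer would call this and never look at the bits itself.
constexpr bool scopeIncludes(std::uint32_t Outer, std::uint32_t Inner) {
  return (expandNested(Outer) & Inner) != 0;
}

static_assert(scopeIncludes(Device, WorkGroup), "wider includes narrower");
static_assert(!scopeIncludes(WorkGroup, Device), "narrower excludes wider");
```

A target with opaque sigils instead of a bitmask could implement the same
predicate with a table lookup, and the caller would not change.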
In practice, I am wondering how often scopes will really affect
optimizations. At least on targets that have memory models similar to
OpenCL 2.x, it's likely that most queries have answers independent of
scopes.
Sameer.