[LLVMdev] memory scopes in atomic instructions
Sahasrabuddhe, Sameer
sameer.sahasrabuddhe at amd.com
Sun Nov 16 22:13:41 PST 2014
On 11/17/2014 10:51 AM, Owen Anderson wrote:
> It is already the case that address spaces can (potentially) alias.
> As such, the combination of address spaces and memory scopes can
> represent any combination where the sharing properties of memory are
> statically known, simply by having (potentially aliasing) address
> spaces to represent memory pools that are only shared with a specific
> combination of agents. One can imagine a GPU that worked like this,
> and GPU programming models do generally differentiate various
> sharing pools statically.
I am trying to understand this with a concrete example. OpenCL 2.0
allows atomic instructions in the global address space, which is encoded
as "1" in the SPIR target. The possible memory scopes are work_item,
work_group, device and all_svm_devices. We could resolve the global
address space into four statically known "synchronization pools", say
"global_work_item", "global_work_group", etc. They would all alias with
the real global address space, and could be encoded as new address
spaces. Is that correct? Then we wouldn't even need the memory scope
argument on the atomic instruction, right?
Note that "global_work_item" isn't even a real address space, i.e., it
is not a well-defined sequence of addresses that is located somewhere in
the global address space. It's actually the set of all global locations
that can potentially be accessed by atomic instructions using
"work_item" memory scope in a given program. It is not required to be
contiguous, and can alias with the entire global address space in the
worst case.
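To make the "synchronization pool" reading concrete, here is a minimal Python sketch of folding the memory scope into the address space. Only the value 1 for the global address space comes from the SPIR encoding mentioned above; the pool numbers (starting at an arbitrary base of 100) and the helper names are hypothetical and exist purely for illustration.

```python
# Only GLOBAL = 1 comes from the message (SPIR encoding of the global space).
GLOBAL = 1
SCOPES = ["work_item", "work_group", "device", "all_svm_devices"]

# Assumption: derived pools get arbitrary address-space numbers from 100 up.
POOL_BASE = 100
POOLS = {f"global_{s}": POOL_BASE + i for i, s in enumerate(SCOPES)}

# Every pool aliases the real global address space it was derived from.
ALIASES = {pool_id: GLOBAL for pool_id in POOLS.values()}


def may_alias(as_a, as_b):
    """Conservative check: two address spaces may alias if they share a root."""
    root_a = ALIASES.get(as_a, as_a)
    root_b = ALIASES.get(as_b, as_b)
    return root_a == root_b


def scope_of(address_space):
    """Recover the memory scope encoded in a pool, or None for a plain space."""
    for name, pool_id in POOLS.items():
        if pool_id == address_space:
            return name[len("global_"):]
    return None
```

In this encoding the scope is recoverable from the address space alone, so a separate scope argument on the atomic instruction would be redundant. The cost is that every pool must be treated as potentially aliasing both the global address space and every other pool.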
So this is what it looks like to me: The proposal is to encode memory
scopes as a new field that is orthogonal to address spaces. Address
spaces are defined on locations, while memory scopes are defined on
operations. Every combination of an address space and a memory scope
represents a set of instructions synchronizing with a set of agents
through a set of locations in that address space. The first two sets are
statically known (not considering the effect of control flow on the
instructions). But the set of locations is dynamic, and could span the
whole address space in the absence of aliasing information.
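As a rough illustration of the orthogonal encoding, the following Python sketch tags each atomic operation with both an address space and a memory scope, then groups the operations by that pair. The `AtomicOp` class, its field names, and the sample operations are hypothetical; the point is only that the set of instructions for each pair is known statically, while nothing here says which locations they touch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicOp:
    name: str           # hypothetical instruction label
    address_space: int  # property of the location being accessed
    scope: str          # property of the operation itself


def group_by_space_and_scope(ops):
    """Map each (address space, scope) pair to the set of ops that use it."""
    groups = defaultdict(set)
    for op in ops:
        groups[(op.address_space, op.scope)].add(op.name)
    return dict(groups)


# Sample program: three atomics on the global address space (1 in SPIR).
ops = [
    AtomicOp("a0", 1, "work_group"),
    AtomicOp("a1", 1, "work_group"),
    AtomicOp("a2", 1, "device"),
]
groups = group_by_space_and_scope(ops)
```

Here `groups[(1, "work_group")]` contains `a0` and `a1`, and `groups[(1, "device")]` contains `a2`. The locations reached by each group would still have to come from a separate alias analysis.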
> The case that this doesn’t handle is when the sharing properties are
> not known statically. However, I question the utility of designing
> this, since there are no known systems that require it. We should
> design the representation to cover all reasonably anticipated systems,
> not ones that don’t, and have no prospect of, existing.
Sure. But we could just leave this undefined for now, without losing the
ability to express what we need. The idea is to not specify any
semantics on non-zero memory scopes (such as assuming that they have a
nesting order).
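One way to picture "no semantics on non-zero scopes" is an opaque identifier that supports equality but deliberately has no ordering. The Python sketch below is only an illustration under that reading: the `MemoryScope` class is hypothetical, and treating scope 0 as the only value with a fixed meaning is an assumption drawn from the wording above, not a statement of the actual proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # eq is generated; order is deliberately left off
class MemoryScope:
    ident: int

    def has_defined_semantics(self):
        # Assumption: only scope 0 has target-independent meaning.
        return self.ident == 0


def same_scope(a, b):
    """The only target-independent question: are the two scopes identical?"""
    return a == b
```

With this shape, `same_scope(MemoryScope(1), MemoryScope(1))` is true, while an expression such as `MemoryScope(1) < MemoryScope(2)` raises `TypeError`, so target-independent code cannot accidentally rely on a nesting order.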
Sameer.