[LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)

Wed Mar 2 15:14:09 PST 2011

On Wed, Mar 2, 2011 at 9:38 AM, Ken Dyck <kd at kendyck.com> wrote:
>
> You can trace back the origins of the addrspace attribute in the
> mailing list archives to this thread:
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2007-November/011385.html.
> From there, it is pretty clear that addrspace was introduced
> specifically as a mechanism for implementing the 'named address space'
> extensions defined in the Embedded C standard (ISO/IEC TR 18037,
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf).
>
...
>
> If you dig into the Embedded C standard, you'll find that the 'named
> address space' extension is highly target-specific. It is only
> portable insofar as two target processors have similar memory
> organization and use identical names for their address spaces.

Yes, I have read that section of the Embedded C standard.
I agree that named address spaces are target specific.

I read OpenCL's use of address spaces as a subset of the Embedded C
concept of address spaces, i.e. a specific set of them, with specific
names and language-level restrictions.

I disagree with Speziale. OpenCL address spaces are not "scopes"
or scope-like things. OpenCL address spaces describe disjoint storage
locations with incompatible pointer types and with independent address
numbering (which you would see if, for example, you cast a pointer to
an integer).

For example, Clang makes sure that you can't assign a pointer in one
address space to a pointer in another address space. It reports:
"illegal implicit conversion between two pointers with different
address spaces".
OpenCL has that restriction as well: Section 6.8 paragraph a.
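
To make the type-level point concrete, here is a rough C++ model of
"the address space is part of the pointer type". It is only a sketch:
AsPtr, kGlobal and kLocal are names I made up for illustration, the
numbers are arbitrary, and this is not how Clang implements its
address_space attribute.

```cpp
#include <type_traits>

// Hypothetical model: the address space number is a template parameter,
// so pointers into different spaces are unrelated types.
template <unsigned AddrSpace, typename T>
class AsPtr {
public:
    explicit AsPtr(T *raw = nullptr) : raw_(raw) {}
    T *raw() const { return raw_; }

private:
    T *raw_;
};

// Illustrative numbering only; real numbers are defined by the target.
constexpr unsigned kGlobal = 1;
constexpr unsigned kLocal = 2;

using GlobalIntPtr = AsPtr<kGlobal, int>;
using LocalIntPtr = AsPtr<kLocal, int>;

// Same address space: assignment is allowed.
static_assert(std::is_assignable<GlobalIntPtr &, GlobalIntPtr>::value,
              "same-space assignment should compile");

// Different address spaces: there is no implicit conversion, which
// mirrors the Clang diagnostic and the OpenCL restriction.
static_assert(!std::is_assignable<GlobalIntPtr &, LocalIntPtr>::value,
              "cross-space assignment should be rejected");
```

Nothing in this model talks about visibility or scope. The
incompatibility comes purely from the types.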

So I argue you really do want to use Clang address spaces to represent
OpenCL address spaces.

Speziale wrote:
> The OpenCL standard talks about address spaces, but I think they can be
> interpreted as scopes (except __constants):
>
> * __global: globally accessible variables
> * __private: visible only to a work item
> * __local: accessible by all work items in a work group
>
> The address space is the way scoping rules are implemented in hardware,
> e.g __local variables are mapped in the address space X which is a fast
> memory shared by all ALU inside a GPU multiprocessor. Maybe introducing
> such "scopes", it is possible to decouple backends from frontends.
>
> __constant is a corner case: it can be modelled as a global scope that
> contains read only data

But note that OpenCL doesn't even express work items at the language
level: it's all implied by the workings of the runtime. What
Speziale calls scoping in hardware is really just the GPU view of how
the program is run. Yes, it was the original model, but it is not the
only one.
If you run the work items and work groups serially, then it's no
longer "scoping" but rather data lifetime that comes into play: each
__private lives only as long as its work item; each __local lives as
long as the work group; each __global lives as long as the buffer is
in global memory (possibly across multiple kernel executions). (Yes, I
understand that a program with barrier() calls requires some
parallelism or interleaving between different work items.)
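
Here is a small C++ sketch of that serial view. It is a toy, not how
any OpenCL runtime works: run_kernel_serially and the "kernel" it
runs are invented for illustration, and the barrier is handled simply
by splitting the per-item loop in two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy kernel run serially (hypothetical; for illustration only).
// global_buf plays the role of a __global buffer owned by the caller.
void run_kernel_serially(std::vector<int> &global_buf, std::size_t group_size) {
    assert(group_size > 0 && global_buf.size() % group_size == 0);
    const std::size_t num_groups = global_buf.size() / group_size;

    for (std::size_t group = 0; group < num_groups; ++group) {
        // "__local": created when the work group starts, gone when it ends.
        std::vector<int> local_scratch(group_size);

        // Phase before the barrier, one work item at a time.
        for (std::size_t item = 0; item < group_size; ++item) {
            // "__private": lives only for this work item.
            const int private_val = global_buf[group * group_size + item] * 2;
            local_scratch[item] = private_val;
        }

        // Phase after the barrier: each item reads the whole __local array.
        for (std::size_t item = 0; item < group_size; ++item) {
            int private_sum = 0;
            for (int v : local_scratch) private_sum += v;
            global_buf[group * group_size + item] = private_sum;
        }
    }
}
```

Calling it twice on the same vector stands in for two kernel
executions over one __global buffer: the buffer's contents carry over,
while every local_scratch and private_val has already been destroyed.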

But those lifetimes/scoping rules are extra semantics over and above
the baseline (restricted) concept of address spaces from Embedded C.

On Wed, Mar 2, 2011 at 9:38 AM, Ken Dyck <kd at kendyck.com> wrote:
>
> So the reason that there aren't any conventions for the address space
> numbers in clang/llvm is because there aren't any conventions for how
> chip designers incorporate memories into the architectures that they
> design.

Sure.

>
> The one convention that the Embedded C standard does specify is that
> when the address space of a type is unspecified, the type is assumed
> to be in the 'generic' space. Clang currently emits an address space
> of zero in this case. Arguably, LLVM could define a single enum value,
> GENERIC, for use by the code generators.

Similarly, OpenCL says that anything without an address space
qualifier ends up in __private.

>
> In my opinion, any knowledge that front ends have of address spaces
> should be dictated by the target's back end. Perhaps we should add
> some virtual methods to LLVM's TargetMachine interface so front ends
> can query the back end for the names and numbers of the address spaces
> that they recognize, and expose them to end users in a standard way.
> But having front ends impose the requirement on back ends that they
> recognize some arbitrary set of language-specific address spaces seems
> like a great misuse of the feature to me for reasons that Peter has
> already pointed out.

This makes sense to me. So the TargetMachine would advertise which
address spaces it has, and how they map to OpenCL address spaces (if at all).
Then Clang could error out gracefully if the user is compiling OpenCL code and
the target doesn't support OpenCL. That addresses the basic validity issue.

If the target does support OpenCL, then the front end would dynamically adopt
whatever backend numbers were defined by the target.
We should probably keep the convention that address space 0 is the
generic space, always.

This neatly solves the ARM vs. someone-else difference between
numberings of local vs. constant.
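
Roughly, I am picturing something like the C++ sketch below. To be
clear, none of this exists in LLVM: TargetAddressSpaces,
lowerAddressSpace and the two example targets are hypothetical, the
numbers are made up, and mapping __private to address space 0 is my
assumption, not something the spec or any back end dictates.

```cpp
// The four OpenCL address spaces as the front end sees them.
enum class OpenCLSpace { Private, Global, Local, Constant };

// Hypothetical stand-in for what a TargetMachine query might return.
struct TargetAddressSpaces {
    bool supports_opencl;
    unsigned global;
    unsigned local;
    unsigned constant;
};

// Result of asking the target for a number; ok is false when the
// target does not support OpenCL, so the front end can report an error.
struct Lowered {
    bool ok;
    unsigned number;
};

constexpr Lowered lowerAddressSpace(TargetAddressSpaces target,
                                    OpenCLSpace space) {
    return !target.supports_opencl         ? Lowered{false, 0}
           : space == OpenCLSpace::Global   ? Lowered{true, target.global}
           : space == OpenCLSpace::Local    ? Lowered{true, target.local}
           : space == OpenCLSpace::Constant ? Lowered{true, target.constant}
                                            : Lowered{true, 0};  // assumed: __private -> 0
}

// Two made-up targets that disagree on local vs. constant numbering,
// plus one that does not support OpenCL at all.
constexpr TargetAddressSpaces kTargetA{true, 1, 2, 3};
constexpr TargetAddressSpaces kTargetB{true, 1, 3, 2};
constexpr TargetAddressSpaces kNoOpenCL{false, 0, 0, 0};
```

The front end never hard-codes a number: it asks the target, and a
target without OpenCL support yields ok == false instead of bad IR.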

>
> It seems to me, as Speziale already pointed out, that the OpenCL type
> qualifiers aren't address space qualifiers at all (in the Embedded C
> sense). They might be better implemented as a separate set of
> qualifiers in the way that Objective-C defines its garbage-collection
> qualifiers, __strong and __weak. See the Qualifiers class in
> AST/Type.h.

I very much disagree, for reasons I gave above.

Sorry if I've gone on too long.

cheers,
david