[cfe-dev] Language-specific vs target-specific address spaces (was Re: [LLVMdev] [PATCH] OpenCL support - update on keywords)

Mon Feb 28 13:41:00 PST 2011

On Fri, Feb 25, 2011 at 02:55:33PM -0500, Ken Dyck wrote:
> The address space mechanism is used by some code generators to
> differentiate between physical memory spaces. The PIC16 code generator
> uses address spaces 0 and 1 to select between its RAM and ROM spaces.
> And X86 uses address space 256 for GS and 257 for FS. In the back end
> for a dual-harvard DSP that I've been working on, I use address spaces
> 0-3 to designate the various memories on the machine.
> 
> The enum conflicts are easy enough to fix, but this current
> implementation doesn't seem to leave room to specify both language-
> and target-specific options on the same pointer. For example, when
> developing an app for a PIC16, how would a user specify a pointer to a
> CONSTANT variable in the ROM space?
> 
> Perhaps we could reserve separate bitfields within the address space
> number for language- and target-specific options. The OpenCL code
> would then need to shift and OR its constants with any address space
> numbers specified with the __attribute__ syntax.
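
To make the quoted bitfield proposal concrete, here is a minimal
sketch of what the shift-and-OR packing might look like.  Everything
in it is an assumption for illustration: the 16-bit split, the LangAS
values and the helper names are hypothetical and are not existing
Clang or LLVM definitions.  It also assumes PIC16 ROM is address
space 1.

```cpp
#include <cstdint>

// Hypothetical layout: the low 16 bits carry the target address space
// (wide enough for X86's 256/257), the bits above carry the language one.
constexpr unsigned kTargetBits = 16;
constexpr std::uint32_t kTargetMask = (1u << kTargetBits) - 1;

// Hypothetical OpenCL language address spaces; the values are illustrative.
enum LangAS : std::uint32_t {
  LangDefault = 0,
  LangGlobal = 1,
  LangLocal = 2,
  LangConstant = 3
};

// Combine a language address space with a target address space.
constexpr std::uint32_t packAddressSpace(LangAS lang, std::uint32_t target) {
  return (static_cast<std::uint32_t>(lang) << kTargetBits) |
         (target & kTargetMask);
}

// Recover the target-specific part of a packed address space number.
constexpr std::uint32_t targetPart(std::uint32_t packed) {
  return packed & kTargetMask;
}

// Recover the language-specific part of a packed address space number.
constexpr LangAS langPart(std::uint32_t packed) {
  return static_cast<LangAS>(packed >> kTargetBits);
}
```

With this layout, a CONSTANT pointer into PIC16 ROM would be
packAddressSpace(LangConstant, 1), and every target would still have
to mask off and interpret the language bits itself.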

The more I think about it, the more I become uncomfortable with the
concept of language-specific address spaces in LLVM.  These are the
main issues I see with language-specific address spaces:

Firstly, it forces every target to 'know' about each source language,
potentially requiring every target to be modified whenever a new
frontend language that supports multiple targets is added.  This goes
against the LLVM design principle of language independence, and
encourages frontends to reuse (abuse?) address spaces which are meant
for other languages.

Secondly, consider the issue of language interoperability (e.g. a
hypothetical CUDA <-> OpenCL interop layer) -- we either lose the
ability to pass pointers between languages in a type-safe way or end
up giving awkward names to address spaces.

Instead of language-specific address spaces, each target should
concentrate on exposing all of its address spaces as target-specific
address spaces, and frontends should use a language -> target mapping
in target-specific code.  We can continue to expose the target's main
shared writable address space as address space 0 as we do now.

For example, Clang could define a set of internal address space
constants for OpenCL and use TargetCodeGenInfo to provide the mapping
to target address spaces.
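
As a rough sketch of the idea (not actual Clang code), the mapping
could be a small per-target table indexed by a frontend-internal
address space.  The names LangAS, TargetAddressSpaceMap, kFlatTarget
and kGpuLikeTarget are hypothetical, and the target address space
numbers are made up for illustration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical frontend-internal (language) address spaces for OpenCL.
enum class LangAS : std::size_t { Default = 0, Global, Local, Constant, Count };

// Hypothetical per-target table: one target address space number for
// each language address space.  A real implementation would live in
// target-specific code, e.g. behind TargetCodeGenInfo.
struct TargetAddressSpaceMap {
  std::array<unsigned, static_cast<std::size_t>(LangAS::Count)> table;

  // Translate a language address space to this target's numbering.
  unsigned lookup(LangAS as) const {
    return table[static_cast<std::size_t>(as)];
  }
};

// A target with a single flat memory: everything maps to address space 0.
const TargetAddressSpaceMap kFlatTarget{{{0, 0, 0, 0}}};

// A made-up GPU-like target whose private (default) space is nonzero.
// Order: Default, Global, Local, Constant.
const TargetAddressSpaceMap kGpuLikeTarget{{{5, 1, 3, 2}}};
```

The frontend would only ever deal in LangAS values, and a call such
as kGpuLikeTarget.lookup(LangAS::Constant) would yield the number that
ends up in the LLVM pointer type.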

An additional benefit is that this solution would allow AMD and
other backends with non-standard address space orderings [1] to
retain backward compatibility.

In Clang, by default, pointers would be in language address space 0,
which could map to any target address space (normally 0).  This neatly
resolves the "default address space" problem for devices with a
nonzero private address space (although on the LLVM side we would
need an address-space-aware alloca).

Thanks,
-- 
Peter

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-February/038199.html