[LLVMdev] Address space extension

Justin Holewinski justin.holewinski at gmail.com
Thu Aug 8 05:05:33 PDT 2013


On Wed, Aug 7, 2013 at 9:52 PM, Pete Cooper <peter_cooper at apple.com> wrote:

>
> On Aug 7, 2013, at 6:38 PM, Michele Scandale <michele.scandale at gmail.com>
> wrote:
>
> On 08/08/2013 03:16 AM, Pete Cooper wrote:
>
>
> On Aug 7, 2013, at 5:12 PM, Michele Scandale <michele.scandale at gmail.com>
> wrote:
>
> On 08/08/2013 02:02 AM, Justin Holewinski wrote:
>
> This worries me a bit.  This would introduce language-specific
> processing into SelectionDAG.  OpenCL maps address spaces one way, other
> languages map them in other ways.  Currently, it is the job of the
> front-end to map pointers into the correct address space for the target
> (hence the address space map in clang).  With (my understanding of) this
> proposal, there would be a pre-defined set of language-specific address
> spaces that the target would need to know about. IMO it should be the
> job of the front-end to do this mapping.
>
>
> The discussion began with possible ways to represent high-level
> address space information in the IR separately from target address spaces
> (so that the information is orthogonal to the mapping, which also handles
> those targets that have the trivial mapping).
>
> My interpretation of the solution proposed by Pete is that the frontend
> emits metadata that describes the address spaces (overlap information and
> the target-specific mapping). Instruction selection simply applies the
> mapping encoded in the metadata. So there is no pre-defined set; there
> is only a "table-driven" mapping algorithm implemented in the instruction
> selection phase, with the table encoded as metadata.
>
> I think it’s fair to have this be dealt with by targets instead of the
> front-end.  That way the optimizer can remain generic and use only the
> metadata.  CPU targets will just map every address space to 0 as they have
> only a single physical memory space.  GPU targets such as PTX and R600 can
> map to the actual HW spaces they want.
>
>
> Why should a backend be responsible for (meaning have knowledge of) a
> mapping between high-level address spaces and low-level address spaces?
>
> That’s true.  I’m thinking entirely from the perspective of the backend
> doing CL/CUDA.  But actually LLVM is language agnostic.  That is still
> something the metadata could solve.  The front-end could generate the
> metadata I suggested earlier which will tell the backend how to do the
> mapping.  Then the backend only needs to read the metadata.
>
>
> Why should the X86 backend be aware of OpenCL address spaces or any other
> address spaces?
>
> The only reason I can think of is that this allows the address space alias
> analysis to occur, and all of the optimizations you might want to implement
> on top of it.  Otherwise you’ll need the front-end to put everything in
> address space 0 and you’ll have lost some opportunity to optimize in that
> way for x86.
>
>
> As with other aspects, I find it more direct and intuitive to anticipate
> target information in the frontend (this is already done and accepted) than
> to make the middle-end and back-end source-language dependent (no specific
> language knowledge should be required there, because different frontends
> could be built on top of them).
>
> Maybe there is a way to decouple the frontend from the specific target by
> having the target-independent part of the code generator support a set of
> languages with common concepts (like OpenCL/CUDA), but that is still
> language dependent!
>
> Yes, that could work.  Actually the numbers are probably not the important
> thing.  It’s the names that really tell you what the address space is for.
> The backend needs to know what loading from a local means.  It’s almost
> unimportant what specific number a front-end chooses for that address
> space.  We know the front-end is really going to choose 2 (from what you
> said earlier), but the backend just needs to know how to load/store a local.
>
> So perhaps the front-end should really be generating metadata which tells
> the target what address space it chose for a memory space.  That is
>
> !private_memory = metadata !{ i32 0 }
> !global_memory = metadata !{ i32 1 }
> !local_memory = metadata !{ i32 2 }
> !constant_memory = metadata !{ i32 3 }
>

This is specific to an OpenCL front-end.  How would this translate to a
language with a different memory hierarchy?

I would also like to preserve the ability for front-ends to directly assign
address spaces in a target-dependent manner.  Currently, I can write IR
that explicitly assigns global variables to the PTX "shared" address space
(for example).  Under this proposal, I would need to use address space 2
(because that is what has been decreed as OpenCL "local"), and insert
meta-data that tells the PTX back-end to map this to its "shared" address
space.  Is that correct?
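
To make sure I am reading the proposal correctly, here is a rough C++ sketch
of the two-step lookup it seems to imply.  All names here (AddressSpaceTable,
lowerAddressSpace, the two table builders) are hypothetical and not part of
LLVM.  The front-end numbers come from the metadata example above, the GPU
numbers are illustrative only, and falling back to address space 0 is just
one possible policy.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical type: memory-space name -> address space number.
typedef std::map<std::string, std::uint32_t> AddressSpaceTable;

// What the front-end would describe via metadata (numbers from the
// !private_memory / !global_memory / ... example above).
AddressSpaceTable openclFrontendTable() {
  return AddressSpaceTable{{"private_memory", 0},
                           {"global_memory", 1},
                           {"local_memory", 2},
                           {"constant_memory", 3}};
}

// What a GPU back-end might own. These hardware numbers are made up for
// illustration and are not a statement about any real target.
AddressSpaceTable exampleGpuTable() {
  return AddressSpaceTable{{"private_memory", 5},
                           {"global_memory", 1},
                           {"local_memory", 3},
                           {"constant_memory", 4}};
}

// Translate a front-end address space into a target address space:
// number -> name (via the front-end table), then name -> number (via the
// target table). Anything that cannot be resolved falls back to 0.
std::uint32_t lowerAddressSpace(const AddressSpaceTable &frontend,
                                const AddressSpaceTable &target,
                                std::uint32_t frontendAS) {
  for (const auto &entry : frontend) {
    if (entry.second != frontendAS)
      continue;
    AddressSpaceTable::const_iterator hit = target.find(entry.first);
    if (hit != target.end())
      return hit->second;
  }
  return 0; // assumed default for unknown or unmapped address spaces
}
```

With these tables, OpenCL "local" (2) ends up in whatever the GPU table calls
its shared memory, and a CPU back-end with an empty table collapses everything
to 0.  What the scheme cannot express is a front-end that picks the target
address space directly, which is the ability I would like to keep.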


>
> Unfortunately you’d have to essentially reserve those metadata names for
> your use (better names than I chose of course), but this might be
> reasonable.  You could alternatively use the example I first gave, but just
> add a name field to it.
>
> I guess targets would have to either assert or default to address space 0
> when they see an address space without associated metadata.
>
>
> This way you have the target specific information in the backend where I
> believe it should be, and the front-end can be target agnostic (note, I know,
> it’s not really agnostic and already contains target specific information,
> but I just don’t want to add more unless it’s really needed).
>
> On the casting between address spaces topic, “you can cast between the
> generic address space and global/local/private, so there's also that to
> consider.”  This terrifies me.  I don’t know how to generate code for this
> on a system which has disjoint physical memory without branching on every
> memory access to that address space.
>
>
> The OpenCL 2.0 specification says that a runtime resolution to a named
> address space is required in order to use a pointer in the generic address
> space.
>
> Ouch!  I can’t imagine that’s good for performance on some architectures.
> But at least it’s been considered and defined.
>
> Pete
>
>
>
> -Michele
>
>
>


-- 

Thanks,

Justin Holewinski