[LLVMdev] Address space extension
Michele Scandale
michele.scandale at gmail.com
Thu Aug 8 08:15:25 PDT 2013
On 08/08/2013 02:05 PM, Justin Holewinski wrote:
> On Wed, Aug 7, 2013 at 9:52 PM, Pete Cooper <peter_cooper at apple.com
> <mailto:peter_cooper at apple.com>> wrote:
>
>
> On Aug 7, 2013, at 6:38 PM, Michele Scandale
> <michele.scandale at gmail.com <mailto:michele.scandale at gmail.com>> wrote:
>
>> On 08/08/2013 03:16 AM, Pete Cooper wrote:
>>>
>>> On Aug 7, 2013, at 5:12 PM, Michele Scandale
>>> <michele.scandale at gmail.com <mailto:michele.scandale at gmail.com>>
>>> wrote:
>>>
>>>> On 08/08/2013 02:02 AM, Justin Holewinski wrote:
>>>>> This worries me a bit. This would introduce language-specific
>>>>> processing into SelectionDAG. OpenCL maps address spaces one
>>>>> way, other
>>>>> languages map them in other ways. Currently, it is the job of the
>>>>> front-end to map pointers into the correct address space for
>>>>> the target
>>>>> (hence the address space map in clang). With (my understanding
>>>>> of) this
>>>>> proposal, there would be a pre-defined set of language-specific
>>>>> address
>>>>> spaces that the target would need to know about. IMO it should
>>>>> be the
>>>>> job of the front-end to do this mapping.
>>>>
>>>> The discussion began with possible ways to represent high-level
>>>> address space information in the IR, distinct from target address
>>>> spaces (keeping that information orthogonal to the mapping, so
>>>> that targets with a trivial mapping are handled as well).
>>>>
>>>> My interpretation of the solution proposed by Pete is that the
>>>> front-end emits metadata describing the address spaces (their
>>>> overlap information and the target-specific mapping), and
>>>> instruction selection simply applies the mapping encoded in that
>>>> metadata. So there is no pre-defined set; there is only a
>>>> table-driven mapping algorithm implemented in the instruction
>>>> selection phase, with the table encoded as metadata.
>>> I think it's fair to have this be dealt with by targets instead of
>>> the front-end. That way the optimizer can remain generic and use
>>> only the metadata. CPU targets will just map every address space
>>> to 0 as they have only a single physical memory space. GPU
>>> targets such as PTX and R600 can map to the actual HW spaces they
>>> want.
>>
>> Why should a backend be responsible for (meaning have knowledge of)
>> a mapping between high-level address spaces and low-level address
>> spaces?
> That's true. I’m thinking entirely from the perspective of the
> backend doing CL/CUDA. But actually LLVM is language agnostic.
> That is still something the metadata could solve. The front-end
> could generate the metadata I suggested earlier, which will tell the
> backend how to do the mapping. Then the backend only needs to read
> the metadata.
>
>>
>> Why should the X86 backend be aware of OpenCL address spaces, or
>> any other address spaces?
> The only reason I can think of is that this allows the address space
> alias analysis to occur, and all of the optimizations you might want
> to implement on top of it. Otherwise you’ll need the front-end to
> put everything in address space 0 and you’ll have lost some
> opportunity to optimize in that way for x86.
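As a sketch of the kind of opportunity being described here (assuming,
hypothetically, that the metadata declares address spaces 1 and 2 as
non-overlapping; the numbers are arbitrary), alias analysis could let
the second load below be folded into the first, because the store
cannot clobber memory in a disjoint space:

  define i32 @f(i32 addrspace(1)* %p, i32 addrspace(2)* %q) {
    %a = load i32 addrspace(1)* %p
    ; store into a disjoint address space cannot modify what %p points to
    store i32 0, i32 addrspace(2)* %q
    %b = load i32 addrspace(1)* %p
    %r = add i32 %a, %b
    ret i32 %r
  }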
>
>>
>> As with other aspects, I find it more direct and intuitive to
>> anticipate target information in the front-end (this is already
>> done and accepted) so that the middle-end and back-end remain
>> source-language independent (no specific language knowledge is
>> required, because different front-ends can be built on top of this).
>>
>> Maybe it is possible to decouple the front-end from the specific
>> target, so that the target-independent part of the code generator
>> supports a set of languages with common concepts (like OpenCL/CUDA),
>> but that is still language dependent!
> Yes, that could work. Actually the numbers are probably not the
> important thing. It's the names that really tell you what the
> address space is for. The backend needs to know what loading from a
> local means. It's almost unimportant what specific number a
> front-end chooses for that address space. We know the front-end is
> really going to choose 2 (from what you said earlier), but the
> backend just needs to know how to load/store a local.
>
> So perhaps the front-end should really be generating metadata which
> tells the target what address space it chose for a memory space.
> That is
>
> !private_memory = metadata !{ i32 0 }
> !global_memory = metadata !{ i32 1 }
> !local_memory = metadata !{ i32 2 }
> !constant_memory = metadata !{ i32 3 }
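Under that scheme a front-end would, for example, emit accesses to
OpenCL __local data through address-space-2 pointers, and the target
would decide what "2" means when lowering (a hypothetical fragment
using the numbering above):

  ; load of a __local float; the back-end maps address space 2 to
  ; whatever its hardware "local" space is (or to 0 on a CPU target)
  %v = load float addrspace(2)* %ptr, align 4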
>
>
> This is specific to an OpenCL front-end. How would this translate to a
> language with a different memory hierarchy?
>
> I would also like to preserve the ability for front-ends to directly
> assign address spaces in a target-dependent manner. Currently, I can
> write IR that explicitly assigns global variables to the PTX "shared"
> address space (for example). Under this proposal, I would need to use
> address space 2 (because that is what has been decreed as OpenCL
> "local"), and insert meta-data that tells the PTX back-end to map this
> to its "shared" address space. Is that correct?
The representation of address spaces as numbers by the front-end
would, I think, be language dependent: the values used in CUDA may be
different from the ones used in OpenCL.
I understand that it may be better for it to be target-dependent as
well, but that is a decision for the front-end implementation.
-Michele