[LLVMdev] Address space extension
Michele Scandale
michele.scandale at gmail.com
Thu Aug 8 08:15:25 PDT 2013
On 08/08/2013 02:05 PM, Justin Holewinski wrote:
> On Wed, Aug 7, 2013 at 9:52 PM, Pete Cooper <peter_cooper at apple.com
> <mailto:peter_cooper at apple.com>> wrote:
>
>
> On Aug 7, 2013, at 6:38 PM, Michele Scandale
> <michele.scandale at gmail.com <mailto:michele.scandale at gmail.com>> wrote:
>
>> On 08/08/2013 03:16 AM, Pete Cooper wrote:
>>>
>>> On Aug 7, 2013, at 5:12 PM, Michele Scandale
>>> <michele.scandale at gmail.com <mailto:michele.scandale at gmail.com>>
>>> wrote:
>>>
>>>> On 08/08/2013 02:02 AM, Justin Holewinski wrote:
>>>>> This worries me a bit. This would introduce language-specific
>>>>> processing into SelectionDAG. OpenCL maps address spaces one
>>>>> way, other
>>>>> languages map them in other ways. Currently, it is the job of the
>>>>> front-end to map pointers into the correct address space for
>>>>> the target
>>>>> (hence the address space map in clang). With (my understanding
>>>>> of) this
>>>>> proposal, there would be a pre-defined set of language-specific
>>>>> address
>>>>> spaces that the target would need to know about. IMO it should
>>>>> be the
>>>>> job of the front-end to do this mapping.
>>>>
>>>> The discussion began with possible ways to represent high-level
>>>> address space information in the IR, distinct from target address
>>>> spaces (keeping that information orthogonal to the mapping, so
>>>> that targets with a trivial mapping are handled as well).
>>>>
>>>> My interpretation of the solution proposed by Pete is that the
>>>> front-end emits metadata describing the address spaces (their
>>>> overlap information and the target-specific mapping), and
>>>> instruction selection simply applies the mapping encoded in that
>>>> metadata. So there is no pre-defined set; there is only a
>>>> table-driven mapping algorithm implemented in the instruction
>>>> selection phase, with the table encoded as metadata.
>>> I think it's fair to have this be dealt with by targets instead of
>>> the front-end. That way the optimizer can remain generic and use
>>> only the metadata. CPU targets will just map every address space
>>> to 0 as they have only a single physical memory space. GPU
>>> targets such as PTX and R600 can map to the actual HW spaces they
>>> want.
>>
>> Why should a backend be responsible for (meaning have knowledge of)
>> a mapping between high-level address spaces and low-level address
>> spaces?
> That's true. I’m thinking entirely from the perspective of the
> backend doing CL/CUDA. But actually LLVM is language agnostic.
> That is still something the metadata could solve. The front-end
> could generate the metadata I suggested earlier, which will tell the
> backend how to do the mapping. Then the backend only needs to read
> the metadata.
>
>>
>> Why should the X86 backend be aware of OpenCL address spaces, or
>> any other address spaces?
> The only reason I can think of is that this allows the address space
> alias analysis to occur, and all of the optimizations you might want
> to implement on top of it. Otherwise you’ll need the front-end to
> put everything in address space 0 and you’ll have lost some
> opportunity to optimize in that way for x86.
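As a sketch of the kind of opportunity being described here (assuming,
hypothetically, that the metadata declares address spaces 1 and 2 as
non-overlapping; the numbers are arbitrary), alias analysis could let
the second load below be folded into the first, because the store
cannot clobber memory in a disjoint space:

  define i32 @f(i32 addrspace(1)* %p, i32 addrspace(2)* %q) {
    %a = load i32 addrspace(1)* %p
    ; store into a disjoint address space cannot modify what %p points to
    store i32 0, i32 addrspace(2)* %q
    %b = load i32 addrspace(1)* %p
    %r = add i32 %a, %b
    ret i32 %r
  }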
>
>>
>> As with other aspects, I find it more direct and intuitive to
>> anticipate target information in the front-end (this is already
>> done and accepted) so that the middle-end and back-end remain
>> source-language independent (no specific language knowledge is
>> required, because different front-ends can be built on top of this).
>>
>> Maybe it is possible to decouple the front-end from the specific
>> target, so that the target-independent part of the code generator
>> supports a set of languages with common concepts (like OpenCL/CUDA),
>> but that is still language dependent!
> Yes, that could work. Actually the numbers are probably not the
> important thing. It's the names that really tell you what the
> address space is for. The backend needs to know what loading from a
> local means. It's almost unimportant what specific number a
> front-end chooses for that address space. We know the front-end is
> really going to choose 2 (from what you said earlier), but the
> backend just needs to know how to load/store a local.
>
> So perhaps the front-end should really be generating metadata which
> tells the target what address space it chose for a memory space.
> That is
>
> !private_memory = metadata !{ i32 0 }
> !global_memory = metadata !{ i32 1 }
> !local_memory = metadata !{ i32 2 }
> !constant_memory = metadata !{ i32 3 }
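Under that scheme a front-end would, for example, emit accesses to
OpenCL __local data through address-space-2 pointers, and the target
would decide what "2" means when lowering (a hypothetical fragment
using the numbering above):

  ; load of a __local float; the back-end maps address space 2 to
  ; whatever its hardware "local" space is (or to 0 on a CPU target)
  %v = load float addrspace(2)* %ptr, align 4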
>
>
> This is specific to an OpenCL front-end. How would this translate to a
> language with a different memory hierarchy?
>
> I would also like to preserve the ability for front-ends to directly
> assign address spaces in a target-dependent manner. Currently, I can
> write IR that explicitly assigns global variables to the PTX "shared"
> address space (for example). Under this proposal, I would need to use
> address space 2 (because that is what has been decreed as OpenCL
> "local"), and insert meta-data that tells the PTX back-end to map this
> to its "shared" address space. Is that correct?
The representation of address spaces as numbers by the front-end
would, I think, be language dependent: the values used in CUDA may be
different from the ones used in OpenCL.
I understand that it may be better for it to be target-dependent as
well, but that is a decision for the front-end implementation.
-Michele