[PATCH] D31042: Allow DataLayout to specify addrspace for allocas.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 28 14:51:46 PDT 2017


arsenm added a comment.

In https://reviews.llvm.org/D31042#712484, @rjmccall wrote:

> In https://reviews.llvm.org/D31042#712324, @arsenm wrote:
>
> > In https://reviews.llvm.org/D31042#712124, @rjmccall wrote:
> >
> > > In https://reviews.llvm.org/D31042#711913, @arsenm wrote:
> > >
> > > > In https://reviews.llvm.org/D31042#711863, @rjmccall wrote:
> > > >
> > > > > I don't think the assumptions in Clang that the Clang AST address space 0 equals LLVM's address space 0 are nearly as pervasive as you think.  At any rate, since you require stack allocations to be in a different LLVM address space from the generic address space, but (per C) taking the address of a local variable yields a pointer in the generic address space, you're going to have to teach Clang about these differences between address spaces anyway, because it'll have to insert address-space conversions in a number of places.
> > > >
> > > >
> > > > Yes, I expect there will be many addrspacecasts inserted from the alloca, which the InferAddressSpaces pass would then hopefully optimize away.
> > >
> > >
> > > Okay.  So, to summarize:
> > >
> > > In amdgpu OpenCL, at source level, sizeof(void __private *) == sizeof(void {__global,__local,__constant} *) == 8.  However, at an implementation level, stack addresses are actually 32-bit, and you want that to be modeled correctly in the IR.  Therefore, allocas need to return a value in a 32-bit address space, which the frontend will immediately widen to a 64-bit address space in order to match the rules for __private.  In order to maintain performance, you have a pass that recognizes when an address value is the result of that kind of widening and re-narrows it to the 32-bit stack address space.
> >
> >
> > This is mostly accurate except for the detail about the cast. For OpenCL 2.0 or C++, the alloca pointer will most likely need to be addrspacecast to the OpenCL generic address space.
>
>
> It looks to me like the language spec says that the address of a local variable is in the __private address space, but that a pointer in any address space can be promoted into the generic address space.  Clang's AST should call out that second promotion as a separate ImplicitCastExpr.  Of course, you may decide that you want to peephole it into a single addrspacecast instruction.
>
> > Private pointers stored in memory may need to be zero-extended to 64 bits. For OpenCL 1.x there is no generic address space, and the old subtargets don't have the necessary hardware features to support this.
>
> By "this", you mean an efficiently-accessible generic address space?  Presumably because the different memory-access operations work on specific address spaces and so, as you say, you would need creative codegen to figure out which case the pointer originally belonged to.  That makes sense.


Yes

>>> Because LLVM currently makes some unfortunate assumptions about address space 0, this stack address space cannot be address space 0.  These assumptions include:
>>>  (1) As far as debug info is concerned, the size of a pointer is the size of a pointer in address space 0.  (Is this assumption why sizeof(void __private *) == 8 despite __private being clearly intended in the OpenCL spec to be the address space of stack pointers?)
>> 
>> This is a point that the spec is vague on. While the spec explicitly allows different size pointers for different address spaces, there isn't really any detailed description of what this entails.
> 
> I don't see what the problem is.  The spec doesn't allow pointers to be converted between address spaces at all (except for the promotion into the generic address space in OpenCL 2.0).  Pointers into different address spaces are just completely different types.

The problem is mostly what happens when the host and device pointer sizes don't match. Struct layouts etc. still need to be compatible for whatever memory buffer was passed into the kernel. There isn't much practical reason to do it, but you could have a struct with private pointer members in it, which would change the offsets of the other members, even though you couldn't do anything valid with their contents.
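
To illustrate the layout concern, here is a rough sketch in plain C++ rather than OpenCL. The struct and function names are hypothetical, and the pointer member is modeled as a fixed-width integer so that one compiler can show both layouts. It only shows that the offset of a later member depends on the assumed pointer size; it is not how any frontend actually computes layouts.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins: the pointer member is modeled as a plain integer
// so that both layouts can be inspected with a single compiler.
struct BufferWith32BitPtr {   // private pointer assumed to be 32-bit
  std::uint32_t private_ptr;
  std::uint32_t value;
};

struct BufferWith64BitPtr {   // private pointer assumed to be 64-bit
  std::uint64_t private_ptr;
  std::uint32_t value;
};

// Byte offset of `value` under each assumption.
constexpr std::size_t value_offset_32() {
  return offsetof(BufferWith32BitPtr, value);
}
constexpr std::size_t value_offset_64() {
  return offsetof(BufferWith64BitPtr, value);
}

// True only if both sides would read `value` from the same offset.
constexpr bool layouts_agree() {
  return value_offset_32() == value_offset_64();
}

static_assert(!layouts_agree(),
              "pointer width changes the offset of the following member");
```

If one side wrote the buffer with one layout and the other side read it with the other, `value` would be read from the wrong offset, even though nothing ever dereferences `private_ptr`.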


https://reviews.llvm.org/D31042




