[PATCH] D31042: Allow DataLayout to specify addrspace for allocas.

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 28 10:58:30 PDT 2017


arsenm added a comment.

In https://reviews.llvm.org/D31042#712124, @rjmccall wrote:

> In https://reviews.llvm.org/D31042#711913, @arsenm wrote:
>
> > In https://reviews.llvm.org/D31042#711863, @rjmccall wrote:
> >
> > > I don't think the assumptions in Clang that the Clang AST address space 0 equals LLVM's address space 0 are nearly as pervasive as you think.  At any rate, since you require stack allocations to be in a different LLVM address space from the generic address space, but (per C) taking the address of a local variable yields a pointer in the generic address space, you're going to have to teach Clang about these differences between address spaces anyway, because it'll have to insert address-space conversions in a number of places.
> >
> >
> > Yes, I expect there will be many addrspacecasts inserted from the alloca, which then the InferAddressSpaces pass would hopefully optimize away.
>
>
> Okay.  So, to summarize:
>
> In amdgpu OpenCL, at source level, sizeof(void __private *) == sizeof(void {__global,__local,__constant} *) == 8.  However, at an implementation level, stack addresses are actually 32-bit, and you want that to be modeled correctly in the IR.  Therefore, allocas need to return a value in a 32-bit address space, which the frontend will immediately widen to a 64-bit address space in order to match the rules for __private.  In order to maintain performance, you have a pass that recognizes when an address value is the result of that kind of widening and re-narrows it to the 32-bit stack address space.


This is mostly accurate except for the detail about the cast. For OpenCL 2.0 or C++ the alloca pointer will most likely need to be addrspacecasted to the OpenCL generic address space. Private pointers in memory may need to be zero extended to 64-bit in memory. For OpenCL 1.x there is no generic address space, and the old subtargets don't have the necessary hardware features to support this.This is one of reasons I don't want to use the same workaround NVPTX uses for this problem. There would be a pointer that if not optimized out would require creative codegen work to artificially support it on those targets.

> Because LLVM currently makes some unfortunate assumptions about address space 0, this stack address space cannot be address space 0.  These assumptions include:
>  (1) As far as debug info is concerned, the size of a pointer is the size of a pointer in address space 0.  (Is this assumption why sizeof(void __private *) == 8 despite __private being clearly intended in the OpenCL spec to be the address space of stack pointers?)

This is a point that the spec is vague on. While the spec explicitly allows different size pointers for different address spaces, there isn't really any detailed description of what this entails. This wasn't addressed at all by the SPIR spec, which just defined a single 64-bit datalayout string. sizeof really indicates the in-memory representation's size, so by using a DataLayout with 32-bit pointers with an ABI alignment of 64-bit, we get the necessary codegen properties of a 32-bit pointer while maintaining the required ABI and single sizeof().

> (2) Functions are currently always defined in address space 0.  A function pointer in your implementation must be in a 64-bit address space.  (My understanding was that OpenCL generally forbade function pointers, so I'm not sure if this is actually important?)

It is true that OpenCL doesn't allow function pointers at the source level. I'm working on implementing calls without inlining everything now, and to be able to correctly lower LLVM IR for them requires the DAG picking the right MVT when emitting the global address so it still comes up there. Function pointer support is also helpful for implementing other languages being worked on like C++ and python. I believe recent version of CUDA support them as well.

> (3) Some optimizations assume that 0 is an invalid address in address space 0, but in fact it's a valid stack address.
> 
> It's entirely possible that we should, indeed, change all of these assumptions.  For the time being, you feel that's not a reasonable fight, and so you would like to allow IR to use an explicitly non-0 address space for the stack.

Yes. Some day I would like to be able to apply some of these assumptions to other non-0 address spaces as well (i.e. 0 is null in some address spaces, but not others) but I think that is a larger task.

> I think I'm prepared to accept your argument.
> 
>>>> I don't think adding this DataLayout argument is very invasive, and was less work than I anticipated
>>> 
>>> That's because you aren't changing any of the frontends, most of which are not in-tree.  I understand that LLVM does not promise to keep the C++ API stable, but there's surely *some* responsibility to not break every frontend for features they don't care about.  At the very least, yes, you should be picking up this value from the Module instead of requiring it to be passed in.
>> 
>> So your objection is just to the API change? I can change that if you insist. The actual mechanical adding of the argument isn't the source of the work. As someone who has maintained out of tree uses, this class of change is the least irritating. Most of the clang IR gen work is in updating the downstream users which use hardcoded pointer types, which users not using this don't need to care about it.
> 
> Yes, I would like you to avoid changing the IRBuilder API if you can.  I can see why you need to change the AllocaInst constructors, though.
> 
> Do you actually need this to be printed/parsed in .ll files?  It's globally consistent across all alloca instructions, and the data layout is always parsed before anything else in the file.  It's not the end of the world, just seems odd to imply that this is actually something that can be different on different instructions.

No, this was just a request in the RFC thread to make it more clear. I can maybe envision a use case in the future for having a different address space on individual allocas, but that is not a concern I have today.


https://reviews.llvm.org/D31042





More information about the llvm-commits mailing list