[LLVMdev] Address space extension
Micah Villmow
micah.villmow at smachines.com
Thu Aug 8 03:06:29 PDT 2013
My view is that modules with different data layouts should be considered incompatible. Data layouts are inherently target/language specific, and I don't view linking such modules as any different from combining IR modules compiled for different architectures.
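As a rough illustration of that rule, here is a minimal sketch in C, assuming a module can be summarised by its data layout string. The struct, the function name, and the layout strings used with it are hypothetical stand-ins and not LLVM API; an exact string comparison is just the simplest possible reading of "different data layouts".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for an IR module: only what the check needs. */
struct module_info {
    const char *name;
    const char *data_layout; /* the module's data layout string */
};

/* Treat two modules as linkable only if their data layout strings
 * match exactly (an assumption made for this sketch). */
static bool layouts_compatible(const struct module_info *a,
                               const struct module_info *b)
{
    assert(a && b && a->data_layout && b->data_layout);
    return strcmp(a->data_layout, b->data_layout) == 0;
}
```

A linker front end following this rule would call layouts_compatible() on each pair of inputs and refuse to link when it returns false, the same way it would refuse mismatched target architectures.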
Micah
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of David Chisnall
> Sent: Thursday, August 08, 2013 2:04 AM
> To: Pete Cooper
> Cc: LLVM Developers Mailing List
> Subject: Re: [LLVMdev] Address space extension
>
> On 8 Aug 2013, at 04:23, Pete Cooper <peter_cooper at apple.com> wrote:
>
> >
> > On Aug 7, 2013, at 7:23 PM, Michele Scandale
> <michele.scandale at gmail.com> wrote:
> >
> >> On 08/08/2013 03:52 AM, Pete Cooper wrote:
> >>
> >> From here I understand that in the IR there are addrspace(N) where
> N=0,1,2,3,... according to the target independent mapping done by the
> frontend to represent different address spaces (for OpenCL 1.2 0 = private, 1
> = global, 2 = local, 3 = constant).
> >>
> >> Then the frontend emits metadata that contains the map from "language
> address spaces" to "target address spaces" (for X86 would be 0->0 1->0 2->0
> 3->0).
> >>
> >> Finally the instruction selection will use this information to perform the
> instruction selection correctly and tag the machine instructions with both
> logical and physical address spaces.
> > Sounds good.
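The mapping described above can be sketched as a per-target lookup table. This is only an illustration: the enum, table, and function names are hypothetical and not part of LLVM or clang. Only the OpenCL 1.2 numbering and the all-zero X86 mapping come from the text.

```c
#include <assert.h>

/* Language-level (logical) address spaces, using the OpenCL 1.2
 * numbering quoted above. */
enum ocl_addrspace {
    OCL_PRIVATE  = 0,
    OCL_GLOBAL   = 1,
    OCL_LOCAL    = 2,
    OCL_CONSTANT = 3,
    OCL_NUM_SPACES
};

/* Hypothetical per-target table: index = logical address space,
 * value = target address space. On X86 everything maps to 0. */
static const unsigned x86_addrspace_map[OCL_NUM_SPACES] = { 0, 0, 0, 0 };

/* Look up the target address space for a logical one. */
static unsigned to_target_addrspace(const unsigned *map, unsigned logical)
{
    assert(map && logical < OCL_NUM_SPACES);
    return map[logical];
}
```

In this model the logical number stays in the IR and the table is consulted only when a target address space is needed, so both values remain available.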
>
> What happens when I link together two IR modules from different front
> ends that have different language-specific address spaces?
>
> I would be very hesitant about using address spaces until we've fixed their
> semantics to disallow bitcasts between different address spaces and require
> an explicit address space cast. To illustrate the problem, consider the
> following trivial example:
>
> typedef __attribute__((address_space(256))) int* gsptr;
>
> int *toglobal(gsptr foo)
> {
>   return (int*)foo;
> }
>
> int load(int *foo)
> {
>   return *foo;
> }
>
> int loadgs(gsptr foo)
> {
>   return *foo;
> }
>
> int loadgs2(gsptr foo)
> {
>   return *toglobal(foo);
> }
>
> When we compile this to LLVM IR with clang (disabling asynchronous unwind
> tables for clarity), at -O2 we get this:
>
> define i32* @toglobal(i32 addrspace(256)* %foo) nounwind readnone ssp {
>   %1 = bitcast i32 addrspace(256)* %foo to i32*
>   ret i32* %1
> }
>
> define i32 @load(i32* nocapture %foo) nounwind readonly ssp {
>   %1 = load i32* %foo, align 4, !tbaa !0
>   ret i32 %1
> }
>
> define i32 @loadgs(i32 addrspace(256)* nocapture %foo) nounwind readonly ssp {
>   %1 = load i32 addrspace(256)* %foo, align 4, !tbaa !0
>   ret i32 %1
> }
>
> define i32 @loadgs2(i32 addrspace(256)* nocapture %foo) nounwind readonly ssp {
>   %1 = bitcast i32 addrspace(256)* %foo to i32*
>   %2 = load i32* %1, align 4, !tbaa !0
>   ret i32 %2
> }
>
> Note that in loadgs2, the call to toglobal has been inlined and so the back end
> will just see a bitcast, which SelectionDAG treats as a no-op. The assembly
> we get from this is:
>
> _toglobal:                              ## @toglobal
> ## BB#0:
>         pushq   %rbp
>         movq    %rsp, %rbp
>         movq    %rdi, %rax
>         popq    %rbp
>         ret
>
> _load:                                  ## @load
> ## BB#0:
>         pushq   %rbp
>         movq    %rsp, %rbp
>         movl    (%rdi), %eax
>         popq    %rbp
>         ret
>
>         .globl  _loadgs
>         .align  4, 0x90
> _loadgs:                                ## @loadgs
> ## BB#0:
>         pushq   %rbp
>         movq    %rsp, %rbp
>         movl    %gs:(%rdi), %eax
>         popq    %rbp
>         ret
>
>         .globl  _loadgs2
>         .align  4, 0x90
> _loadgs2:                               ## @loadgs2
> ## BB#0:
>         pushq   %rbp
>         movq    %rsp, %rbp
>         movl    (%rdi), %eax
>         popq    %rbp
>         ret
>
> loadgs() has been compiled correctly. It uses the parameter as a gs-relative
> address and performs the load. The assembly for load() and loadgs2(),
> however, is identical: both treat the parameter as a linear (not
> gs-relative) address. The cast has been lost. This is even clearer when you
> look at toglobal(), which has just become a no-op. The correct code for this should
> be (I believe):
>
> _toglobal:                              ## @toglobal
> ## BB#0:
>         pushq   %rbp
>         movq    %rsp, %rbp
>         lea     %gs:(%rdi), %rax
>         popq    %rbp
>         ret
>
> In the inlined version, the lea and movl should be combined into a single
> gs-relative movl.
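To make the lost cast concrete, here is a small model in C, assuming a gs-relative pointer is just an offset from a segment base. The type, the function names, and any base value passed in are hypothetical. This models the address arithmetic only and says nothing about how the backend is or should be implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a gs-relative pointer is an offset from the
 * segment base, not a linear address. */
typedef struct { uint64_t offset; } gs_ptr;

/* What treating the cast as a bitcast amounts to: the bits are
 * reused unchanged and the segment base is ignored. */
static uint64_t cast_as_bitcast(gs_ptr p)
{
    return p.offset;
}

/* What an explicit address space cast would have to compute in this
 * model: the linear address, i.e. segment base plus offset. */
static uint64_t cast_with_base(gs_ptr p, uint64_t gs_base)
{
    assert(gs_base <= UINT64_MAX - p.offset); /* no wraparound here */
    return gs_base + p.offset;
}
```

Whenever the segment base is non-zero, the two functions return different addresses for the same pointer, which is the difference between load() and the intended behaviour of loadgs2() in the example above.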
>
> Until we can generate correct code from IR containing address spaces,
> discussion of how to optimise this IR seems premature.
>
> David
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev