[cfe-commits] [llvm-commits] Patch: Add get_global_id builtin/intrinsic
Villmow, Micah
Micah.Villmow at amd.com
Tue Feb 14 08:42:42 PST 2012
> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-
> bounces at cs.uiuc.edu] On Behalf Of Tom Stellard
> Sent: Tuesday, February 07, 2012 12:53 PM
> To: Peter Collingbourne
> Cc: llvm-commits at cs.uiuc.edu; cfe-commits at cs.uiuc.edu
> Subject: Re: [llvm-commits] [cfe-commits] Patch: Add get_global_id
> builtin/intrinsic
>
> On Tue, Feb 07, 2012 at 08:16:23PM +0000, Peter Collingbourne wrote:
> > On Fri, Feb 03, 2012 at 05:24:38PM -0500, Tom Stellard wrote:
> > > On Fri, Feb 03, 2012 at 10:07:03PM +0000, Rotem, Nadav wrote:
> > > > Tom,
> > > >
> > > > Our OpenCL implementation of get_global_id is not target specific
> and we don't resolve it in the backend. I think that get_global_id
> should be implemented as a simple library call. However, this is
> something that needs to be discussed with Tanya Lattner, Peter
> Collingbourne, Anton Lokhmotov, etc.
> > > >
> > > > Nadav
> > >
> > > Nadav,
> > >
> > > Sorry, I guess I should be more clear. When I say target specific
> I'm
> > > talking about GPU targets. The get_global_id() implementation
> > > on the GPUs we've written a backend for (Evergreen, Northern
> Islands)
> > > requires reading values from special registers that are preloaded by
> the
> > > hardware. I'm guessing other GPUs do something similar, so I do
> think
> > > it is something that would need to be resolved in the backend.
> >
>
> Hi Peter,
>
> > Hi Tom,
> >
> > I disagree. There is no need to resolve these functions in the
> > backend, as the register reads can simply be made part of the
> > get_global_id implementation. The libclc OpenCL C standard library
> > already targets NVIDIA GPUs, and it implements get_global_id and the
> > other work-item functions in exactly this way:
> >
> > http://git.pcc.me.uk/?p=~peter/libclc.git;a=blob;f=ptx-
> nvidiacl/include/clc/workitem/get_global_id.h
> >
>
> I looked at this before and it seems like a good approach, but the one
> thing I couldn't figure out was: If I have clang embedded in my OpenCL
> implementation and it is using an out of tree backend, how do I get
> clang to recognize my target's builtins as valid and then map them to
> the appropriate target specific intrinsic?
>
[Villmow, Micah] Tom, this is the job of the math/language library. It should be the glue that connects the frontend language specific intrinsics and the backend target specific intrinsics.
For example, with the OpenSource AMDIL backend, we expose the function v4i32 __amdil_get_global_id_int() and OpenCL has size_t get_global_id(int i).
So the implementation is in the library and would be:
size_t get_global_id(int i)
{
v4i32 val = __amdil_get_global_id_int();
switch(i) {
default: return 1;
case 1: return val.y;
case 2: return val.z;
case 0: return val.x;
}
}
The goal of this aspect for the AMDIL backend has always been to expose what the target exposes and let the library/frontend map the language to the target.
Hope this helps,
Micah
> Thanks,
> Tom
>
> > Your way, we leak language-specific details into the backends, details
> > which can be dealt with by the frontend and standard library, so that
> > the backends can be kept language independent.
> >
> > Also, there is no one way of implementing functions like get_global_id
> > on (say) PTX. One thing that is missing from the get_global_id
> > implementation in libclc is global offsets. The precise details
> > of how those offsets are passed to the kernel have varied over
> > time (for example, at one point a global array was used, and now
> > special registers are used). If we encode these sorts of details
> > in the backend we reduce overall flexibility and deny optimisation
> > opportunities to the optimisers. For example, say the kernel
> computes:
> >
> > get_global_id(0) - get_global_offset(0)
> >
> > On PTX the optimisers should be able to eliminate the global offset
> > access altogether, but they might not be able to if get_global_id
> > and get_global_offset are opaque intrinsic calls.
> >
> > Thanks,
> > --
> > Peter
> >
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
More information about the cfe-commits
mailing list