[cfe-dev] OpenCL support - using metadata

David Neto dneto.llvm at gmail.com
Mon Mar 7 08:01:41 PST 2011


2011/3/4 Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi>:
> OK,
>
> I think I got the basic idea...
>
> On 03/04/2011 04:51 PM, David Neto wrote:
>>
>> relocatable section.  Accesses are generated as offsets from a base
>> pointer.  (You can discard the address space number at this point!)
>> When running multiple work groups in parallel, the different work
>> groups are given different values for the base pointer.  That is what
>> keeps the work groups from stomping on each other's data.
>
> Going into practical details a bit to ensure I understood this
> correctly. Say, one implements an OpenCL kernel launcher that scales
> to the number of cores at runtime and implements the actual threading,
> for example, with pthreads or some other lower level threading API.
>
> Thus, when launching N parallel work group (WG) executions to
> utilize N cores, one must
>
> 1) allocate space for the local variables for each
> parallel WG thread and
> 2) somehow pass a pointer to this space as a base value to
> the launched kernel so the WG threads do not overwrite each other's
> data.
>
> Part 2) is a bit unclear to me. Is the base pointer added as an
> additional parameter to the kernel function which the "launcher" can
> use? Or do you assume the kernel is loaded to the host program as
> a runtime lib and assume the linker does the allocation via
> its relocation functionality? Thus, to launch N WG threads one needs
> to load the dynlib N times?

I haven't gone through the details of an implementation for (2), but to me
the clearest approach is to add an implicit parameter to each kernel that
passes the base pointer.

But it's an implementation choice for the backend/runtime system.
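
For what it's worth, here is a minimal sketch of the implicit-parameter idea
in plain C with pthreads.  It is only an illustration, not what any existing
backend does: kernel_fn, wg_launch, launch_work_groups and example_kernel are
hypothetical names, and I'm assuming the compiler has already lowered each
__local variable to an offset from local_base.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical lowered kernel signature: local_base is the implicit
 * parameter; former __local variables are offsets from it. */
typedef void (*kernel_fn)(void *args, size_t group_id, char *local_base);

struct wg_launch {
    kernel_fn kernel;
    void *args;
    size_t group_id;
    char *local_base;   /* private to this work group */
};

static void *wg_thread(void *p)
{
    struct wg_launch *l = p;
    l->kernel(l->args, l->group_id, l->local_base);
    return NULL;
}

/* Run num_groups work groups, one thread each, every group with its own
 * local_bytes of storage.  Returns 0 on success, -1 on failure. */
int launch_work_groups(kernel_fn kernel, void *args,
                       size_t num_groups, size_t local_bytes)
{
    assert(kernel != NULL && num_groups > 0);
    pthread_t *threads = malloc(num_groups * sizeof *threads);
    struct wg_launch *launches = calloc(num_groups, sizeof *launches);
    if (!threads || !launches) {
        free(threads);
        free(launches);
        return -1;
    }

    size_t started = 0;
    int status = 0;
    for (size_t g = 0; g < num_groups; g++) {
        launches[g].kernel = kernel;
        launches[g].args = args;
        launches[g].group_id = g;
        /* A different base pointer per work group. */
        launches[g].local_base = malloc(local_bytes ? local_bytes : 1);
        if (!launches[g].local_base ||
            pthread_create(&threads[g], NULL, wg_thread, &launches[g]) != 0) {
            free(launches[g].local_base);
            status = -1;
            break;
        }
        started++;
    }
    for (size_t g = 0; g < started; g++) {
        pthread_join(threads[g], NULL);
        free(launches[g].local_base);
    }
    free(threads);
    free(launches);
    return status;
}

/* Example lowered kernel: "__local int scratch[4]" lives at offset 0. */
void example_kernel(void *args, size_t group_id, char *local_base)
{
    int *out = args;
    int *scratch = (int *)(local_base + 0);
    int sum = 0;
    for (int i = 0; i < 4; i++)
        scratch[i] = (int)group_id;
    for (int i = 0; i < 4; i++)
        sum += scratch[i];
    out[group_id] = sum;
}
```

Calling launch_work_groups(example_kernel, out, 8, 4 * sizeof(int)) runs
eight groups concurrently.  Each group only ever touches its own scratch
area, so out[g] depends on group g alone.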

And just to be clear:  The __local storage is shared among all the work item
threads in a work group, of which there may be more than one.  For example,
if your work group is of size 16, then those 16 threads will simultaneously
use the same base pointer for their __local storage.  (That is also why
barrier(CLK_LOCAL_MEM_FENCE) is meaningful.)

>
> Thanks,
> --
> Pekka
>

cheers,
david



