[cfe-dev] [LLVMdev] SPIR provisional specification is now available in the Khronos website

Pekka Jääskeläinen pekka.jaaskelainen at tut.fi
Mon Sep 24 04:41:18 PDT 2012

Hi all,

Another OpenCL C implementation issue I'm currently fighting with is how
best to implement automatic __local variables. It seems SPIR mandates
the current Clang implementation, which converts automatic locals to
C function static variables (in practice, global variables).

Clearly, this is not a thread-safe "final implementation": it works as is
only when multiple work groups of the same kernel are not executed in
parallel. Therefore, some later compiler pass is assumed to convert those
function statics (module globals) to some other storage where the
local buffers are allocated separately for each work-group thread.
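
To make the hazard concrete, here is a minimal plain-C sketch (not actual
Clang output; the kernel, its two phases and all names are made up for
illustration). The function static stands in for a lowered automatic __local
array, and the "parallel" execution is simulated deterministically by letting
every work group reach the barrier before any of them continues:

```c
#include <stddef.h>

#define LOCAL_SIZE 2 /* work-items per work group (illustrative value) */

/* Stand-in for an automatic __local array after it has been lowered to a
   function static: a single buffer shared by the whole program. */
static int scratch[LOCAL_SIZE];

/* Kernel part before the barrier: each work-item stores its input element. */
static void phase_store(const int *in, size_t group_id, size_t local_id)
{
    scratch[local_id] = in[group_id * LOCAL_SIZE + local_id];
}

/* Kernel part after the barrier: each work-item reads its neighbour's slot. */
static void phase_load(int *out, size_t group_id, size_t local_id)
{
    out[group_id * LOCAL_SIZE + local_id] = scratch[(local_id + 1) % LOCAL_SIZE];
}

/* One work group at a time: the shared static is harmless. */
void run_sequential(const int *in, int *out, size_t num_groups)
{
    for (size_t g = 0; g < num_groups; ++g) {
        for (size_t l = 0; l < LOCAL_SIZE; ++l) phase_store(in, g, l);
        for (size_t l = 0; l < LOCAL_SIZE; ++l) phase_load(out, g, l);
    }
}

/* All work groups in flight at once (simulated): every group finishes the
   store phase before any group runs the load phase, so later groups
   overwrite the data of earlier ones. */
void run_overlapped(const int *in, int *out, size_t num_groups)
{
    for (size_t g = 0; g < num_groups; ++g)
        for (size_t l = 0; l < LOCAL_SIZE; ++l) phase_store(in, g, l);
    for (size_t g = 0; g < num_groups; ++g)
        for (size_t l = 0; l < LOCAL_SIZE; ++l) phase_load(out, g, l);
}
```

With the input {1, 2, 3, 4} and two work groups, run_sequential swaps the
neighbours inside each group, while run_overlapped makes group 0 read the
values stored by group 1.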

The pocl implementation is what was suggested some time ago on this list:
the locals are converted to local arguments of the kernel function, which
are then passed per-thread buffers when the work group is executed. Thus,
pocl needs to redirect the references to these dummy globals to the local
buffer pointers appended to the end of the kernel function argument list.
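
Roughly, the result of that rewrite looks like the following plain-C sketch.
This is only an illustration under my own assumptions, not pocl's actual code:
the kernel body, the launcher run_work_groups and the argument layout are all
hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

#define LOCAL_SIZE 4 /* work-items per work group (illustrative value) */

/* Kernel after the rewrite: the former dummy global is gone and the local
   buffer arrives as an extra pointer at the end of the argument list. */
void kernel_rewritten(const int *in, int *out,
                      size_t group_id, size_t local_id, int *scratch)
{
    size_t gid = group_id * LOCAL_SIZE + local_id;
    scratch[local_id] = in[gid];
    out[gid] = 2 * scratch[local_id];
}

/* Hypothetical launcher: allocates a fresh local buffer for each work group
   and passes it as the trailing argument. Returns 0 on success, -1 if the
   allocation fails. */
int run_work_groups(const int *in, int *out, size_t num_groups)
{
    for (size_t g = 0; g < num_groups; ++g) {
        int *scratch = malloc(LOCAL_SIZE * sizeof *scratch);
        if (scratch == NULL)
            return -1;
        for (size_t l = 0; l < LOCAL_SIZE; ++l)
            kernel_rewritten(in, out, g, l, scratch);
        free(scratch);
    }
    return 0;
}
```

Because the buffer address is now an ordinary function argument, nothing in
the kernel can assume where the local data lives.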

The problem with using the "semantically inadequate" 'function static'
variables for the local buffers is that LLVM/Clang treats them as
buffers with a constant base address, which they eventually won't have in
a parallel WG implementation. This triggers an issue I'm currently working on
in pocl: https://bugs.launchpad.net/pocl/+bug/1032203. Clang generates
constant GEPs for the local buffer accesses, even though in a parallel
thread-safe implementation the local variables cannot be stored at
constant locations.
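
The constant-base assumption can be shown in plain C as well. This is a
sketch with made-up names, not the code from the bug report: the address of
an element of a static array is an address constant (which Clang would
express as a constant GEP in the IR), whereas the same access through a
pointer argument has to be computed at run time.

```c
/* Stand-in for a local buffer lowered to a function static. */
static int local_buf[8];

/* &local_buf[2] is an address constant, so it is allowed in a static
   initializer. The compiler is free to fold it into every use. */
static int *const third_elem = &local_buf[2];

/* Stores through the static buffer, as the lowered kernel would. */
void write_static(int value)
{
    local_buf[2] = value;
}

/* Access through the pre-computed constant address. */
int read_static(void)
{
    return *third_elem;
}

/* Same access after the buffer has become a kernel argument: the address
   depends on the caller and cannot be a compile-time constant. */
int read_from_arg(const int *local_buf_arg)
{
    return local_buf_arg[2];
}
```

Any pass that redirects the static to a per-work-group buffer has to find
and rewrite uses like third_elem too, not only direct references to
local_buf.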

So, I wonder whether this part of the SPIR spec might cause other similar
problems (LLVM optimizing incorrectly due to the slightly wrong semantics)
in the future and should be improved. The minimal fix would be
to add some kind of attribute to the function static global that prevents
Clang/LLVM from assuming the address is constant and applying optimizations
that rely on that. Semantically the local buffer is actually a thread-local
variable. Are thread locals somehow supported in LLVM IR?

On 09/13/2012 12:19 PM, Pekka Jääskeläinen wrote:
> For what it's worth, this issue manifests itself in an unsolved pocl
> bug: https://bugs.launchpad.net/pocl/+bug/987905
> It would be simpler to implement a portable implementation for calling the
> kernel from the host if we could assume the kernel calling convention mapped
> each OpenCL setArg arg to a single kernel function arg (and preferably all
> arg data in memory). For the non-kernel functions it should not matter and
> could be target-specific.

