[Libclc-dev] Integration Question

Pete Couperus pjcoup at gmail.com
Sat Oct 22 11:14:16 PDT 2011


Hello,

I see; that gives me a better idea.  I was working on the builtin
function support, so these issues sound familiar :).

On Fri, Oct 21, 2011 at 10:04 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:

>
> Firstly, the declarations of builtin functions.  Currently these live
> in header files in libclc's include directory, with target specific
> overrides possible by arranging the order of -I flags, and I intend to
> keep it this way.  Optionally, libclc may, as part of its compilation
> process, produce a precompiled header (.pch) file for each target for
> efficiency (reading one large serialised file is more efficient than
> reading and parsing several small files).
>
>
When you say "libclc...may produce a precompiled header...", do you
mean "one of the artifacts built with libclc is a .pch file"? (Just
clarifying.)
This seems like a good idea, though I haven't looked at how clang
supports .pch files.  As a preliminary approach, I was creating a
monolithic "builtin.h" header with all of the prototypes, which got
inserted before compiling the .cl files.  All of the tinkering I had
done was with clang embedded as a library, rather than executed as a
separate process.  At a glance, pocl executes clang as a separate
process, yes?



> Secondly, the implementation of builtin functions.  This is a tricky
> issue, mainly because we must support a wide variety of targets,
> some of which have space restrictions and cannot support a large
> runtime library contained in each executable, and we must support
> inlining for efficiency and because many targets (especially GPUs)
> require it.  Initially I thought that the solution to this would be
> to provide "static inline" function definitions in the header files.
> Unfortunately I have since realised that the situation is more
> complicated than that.  Some builtin implementations must be written
> in pure LLVM IR, because Clang currently lacks support for emitting
> the necessary instructions.  Some builtins use data, such as cosine
> tables, which we should not duplicate in every translation unit.
> As a consequence of this, the implementations of the builtins cannot
> live in the header file.
>
> Instead, the solution shall be to provide a .bc file providing all
> of the builtin function implementations (similar to how you suggest
> above).  Clang's frontend will be modified to include support for
> lazily linking bitcode modules (so that only used functions will be
> loaded from the .bc and linked) before performing optimisations.
> Each global in the .bc providing the builtins (this includes the
> builtins themselves, plus any data they use) will use linkonce_odr
> linkage.  This linkage provides the same semantics as C++ "inline" --
> it permits inlining, and at most one copy of the global will appear
> in the final executable.
>
>
When you say clang's frontend will be modified, does llvm-ld already
have support for this?  I'm less familiar with some of the link-time
optimization work that has been done.  It seems that the bitcode
modules could be linked normally, and then a pass could be run to
remove uncalled functions.
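
In principle, lazy linking and "link everything, then strip uncalled
functions" should keep the same set of globals: those reachable from what
the kernel module references.  A minimal C++ sketch of that reachability
computation follows.  It is a toy model only, not how Clang or llvm-ld
does it; the Library type, liveGlobals, and the symbol names are all
hypothetical.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of a builtin library: each global (function or data table)
// maps to the globals it references.  Hypothetical, for illustration only.
using Library = std::map<std::string, std::vector<std::string>>;

// Returns the library globals needed to satisfy `roots`, i.e. the symbols
// the kernel module references, plus everything they depend on.
std::set<std::string> liveGlobals(const Library &lib,
                                  const std::vector<std::string> &roots) {
  std::set<std::string> live;
  std::vector<std::string> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::string name = worklist.back();
    worklist.pop_back();
    auto it = lib.find(name);
    if (it == lib.end())            // not provided by this library
      continue;
    if (!live.insert(name).second)  // already visited
      continue;
    for (const std::string &dep : it->second)
      worklist.push_back(dep);
  }
  return live;
}
```

For a library where "cos" references "cos_table", a kernel that only calls
cos would pull in those two globals and nothing else.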


> You mentioned overloaded functions.  This is already handled by Clang's
> IR generator.  Any function marked with __attribute__((overloadable))
> will have its name mangled according to the Itanium C++ ABI name
> mangling rules.
>
>
Right, the overloaded functions are mangled.  What I meant is that in
Clover, some builtins are not linked in, so when the LLVM JIT references
an unknown function, it calls an optional function resolver, which
Clover also provides.  I believe that this resolver needs to understand
the mangled name, rather than the bare name.  If you look at the
resolver, it currently doesn't deal with the overloaded builtins.
http://cgit.freedesktop.org/~steckdenis/clover/tree/src/core/cpu/builtins.cpp:416
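
To illustrate what "understanding the mangled name" could look like, here
is a minimal C++ sketch of a resolver keyed on Itanium-mangled names.  It
is not Clover's code: resolveBuiltin, fabs_f32 and fabs_f64 are
hypothetical, and the resolver signature is an assumption.  The mangled
names themselves follow the Itanium rules (_Z, then the length-prefixed
name, then the parameter types).

```cpp
#include <cmath>
#include <string>
#include <unordered_map>

// Hypothetical host-side implementations of two overloads of one builtin.
static float fabs_f32(float x) { return std::fabs(x); }
static double fabs_f64(double x) { return std::fabs(x); }

// Resolver keyed on the mangled name rather than the bare name.
// "_Z4fabsf" is fabs(float); "_Z4fabsd" is fabs(double).
// Returns nullptr when the symbol is unknown.
void *resolveBuiltin(const std::string &mangledName) {
  static const std::unordered_map<std::string, void *> table = {
      {"_Z4fabsf", reinterpret_cast<void *>(&fabs_f32)},
      {"_Z4fabsd", reinterpret_cast<void *>(&fabs_f64)},
  };
  auto it = table.find(mangledName);
  return it == table.end() ? nullptr : it->second;
}
```

A lookup of the bare name "fabs" finds nothing here, which is the
situation described above for the overloaded builtins.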

> Some targets, as part of their ABI, require a specific set of external
> symbols to be present in every object file, and those symbols must
> appear exactly once (an example being the _global_block_offset and
> other symbols used by NVIDIA's OpenCL implementation).  The solution
> to this would be to provide those symbols in a separate .bc file.
> That file would serve a similar role to glibc's crt0.o, and would be
> linked into every final executable during the final link step.
>
>
Could you explain this a bit further?  I understand that some targets
may need other symbols.  That's ok.
I'm unclear as to what you mean by final executable.  If I have a
file.cl with a number of kernels and support functions, the OpenCL
runtime needs to be able to execute the kernels.  What executable
is coming into the picture?  Or do you mean "program"?


> How would clients use these artifacts?  Another feature of libclc
> will be that clients will not need to worry about any of this.
> The Clang driver will be taught to pass the necessary flags to the
> Clang frontend, and the intention is that a command line such as this:
>
> $ clang -target ptx32--nvidiacl -o file.ptx file.cl
>
>
So, I'm a little unclear as to what exactly this is going to produce.
Will file.ptx have all of the .ptx assembly for all of the referenced
builtins, so that it can be assembled into the executable referenced
above?

> would just work -- the semantics of such a command line driver
> invocation would be equivalent to the invocation of a program which
> uses the OpenCL platform layer and runtime APIs to build an OpenCL C
> program with the given flags (excluding -target, -o and input files)
> using clCreateProgramWithSource and clBuildProgram, and then uses
> clGetProgramInfo to dump the binaries.  As a side effect of this, the
> implementation of clBuildProgram would be very simple -- it would only
> need to invoke the driver with a few command line options in addition
> to the flags provided by the user as a parameter to clBuildProgram.
>
>
Ok, this gives me more of an idea where you're headed.  Thanks for the
explanation.  Sounds great!
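
As a rough picture of how little clBuildProgram would need to do under
this scheme, here is a minimal C++ sketch that assembles a driver command
line from the user's build options.  The helper name driverArgs and the
naive whitespace splitting are assumptions for illustration; the -target
and -o usage mirrors the command line quoted above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Builds the argument list a clBuildProgram implementation might hand to
// the Clang driver: a few fixed options plus the user's build options.
// Hypothetical helper; options are split on whitespace only, so quoted
// arguments containing spaces are not handled.
std::vector<std::string> driverArgs(const std::string &target,
                                    const std::string &userOptions,
                                    const std::string &input,
                                    const std::string &output) {
  std::vector<std::string> args = {"clang", "-target", target};
  std::istringstream in(userOptions);
  for (std::string opt; in >> opt;)
    args.push_back(opt);
  args.push_back("-o");
  args.push_back(output);
  args.push_back(input);
  return args;
}
```

Called with the target "ptx32--nvidiacl", options "-cl-fast-relaxed-math",
input "file.cl" and output "file.ptx", this yields the quoted command line
with the one user option added.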


> Clang provides an API for invoking its driver (see the
> clang::createInvocationFromCommandLine function).  There may also be
> a small wrapper library for clBuildProgram implementations to use,
> to simplify the entire process.  This could be part of libclc or
> perhaps a separate project.
>
> Thanks,
> --
> Peter
>

Thank you for the detailed explanation.

Pete