[Libclc-dev] Any plan for OpenCL 1.2?

Jan Vesely via Libclc-dev libclc-dev at lists.llvm.org
Mon Jul 20 10:52:54 PDT 2020

On Mon, 2020-07-20 at 09:24 -0500, Aaron Watry via Libclc-dev wrote:
> On Sat, Jul 18, 2020, 11:53 PM DING, Yang via Libclc-dev <
> libclc-dev at lists.llvm.org> wrote:
> > Hi,
> > 
> > It seems libclc currently implements the library requirements of the
> > OpenCL C programming language, as specified by the OpenCL 1.1
> > Specification.
> > 
> > I am wondering if there is any active development or plan to upgrade
> > it to OpenCL 1.2? If not, what are the biggest challenges?
> > 
> I haven't checked in a while, but I think the biggest blocker at this point
> is that we still don't have a printf implementation in libclc.  Most/all of
> the other functions required to expose 1.2 are already implemented.
> I had started on a pure-C printf implementation a while back that would in
> theory be portable to devices printing to a local/global buffer, but I
> stalled out on it when I got to printing vector arguments and hex-float
> formats.  There's also the fact that global atomics in CL aren't guaranteed
> to be synchronized across all work groups executing a kernel (just within a
> given work group for a given global buffer).

I don't think we need to worry about that. Since both the AMD and
NVPTX atomics are atomic across all work groups, we can just use that
behaviour. The actual atomic op would be target-specific, and anyone
who wants to add another target can provide their own implementation
(SPIR-V can just use an atomic with the right scope).
AMD targets can be switched to use GDS as an optimization later.

At least CL 1.2 printf only prints to stdout, so we only need to
consider global memory.

> If someone wants to take a peek or keep going with it, I've uploaded my WIP
> code for the printf implementation here: https://github.com/awatry/printf

I'm not sure parsing the format string on the device is the best
approach, as it will introduce quite a lot of divergence. It might be
easier/faster to just copy the format string and input data to the
buffer and let the host parse/print everything.
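To make the copy-everything idea concrete, the buffer could hold
self-describing records that the host replays later. Here is a minimal
host-side C sketch; the record layout and all names are made up for
illustration, not anything libclc actually defines:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout: each device-side printf call appends one
 * length-prefixed record holding the raw format string followed by the
 * raw argument bytes; the host walks the buffer afterwards and does the
 * real parsing/printing, so the device never touches the format string. */
struct record_header {
    uint32_t total_len; /* bytes in this record, header included */
    uint32_t fmt_len;   /* length of the format string, incl. NUL */
};

/* Host side: walk the buffer and recover each format string. A real
 * implementation would decode the argument bytes according to the
 * conversion specifiers; this only demonstrates the traversal.
 * Returns the number of records seen. */
static size_t replay(const unsigned char *buf, size_t used)
{
    size_t off = 0, count = 0;
    while (off + sizeof(struct record_header) <= used) {
        struct record_header h;
        memcpy(&h, buf + off, sizeof h);
        if (h.total_len < sizeof h + h.fmt_len)
            break; /* corrupt record; stop walking */
        const char *fmt = (const char *)(buf + off + sizeof h);
        printf("record: fmt=\"%s\", %u arg bytes\n", fmt,
               h.total_len - (uint32_t)sizeof h - h.fmt_len);
        off += h.total_len;
        count++;
    }
    return count;
}
```

The device-side work then shrinks to measuring the record and copying
bytes, which is uniform across the sub-group and avoids the divergence
of per-specifier parsing on the GPU.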

Was the plan to:
1.) parse the input once to get the number of bytes,
2.) atomically move the write pointer,
3.) parse the input a second time and write the characters to the buffer,

or did you have anything more specialized in mind?
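The three-step scheme above boils down to a single fetch-and-add that
reserves space before the second pass writes into it. A host-side C11
sketch of that reservation, simulating the device behaviour (all names
are made up; on the device the counter would be a __global offset
bumped with atomic_add()):

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Host-side C11 simulation of reserve-then-write: the write pointer
 * lives next to the buffer in what would be global memory on the
 * device, and one fetch-and-add claims space for a whole record.
 * This is why cross-work-group atomicity of global atomics matters. */
enum { BUF_SIZE = 1024 };
static unsigned char buf[BUF_SIZE];
static _Atomic uint32_t write_ptr;

/* Step 2: atomically bump the write pointer to claim len bytes.
 * Returns the claimed offset, or UINT32_MAX if the buffer is full. */
static uint32_t reserve(uint32_t len)
{
    uint32_t off = atomic_fetch_add(&write_ptr, len);
    if (off + len > BUF_SIZE)
        return UINT32_MAX; /* out of space; the record is dropped */
    return off;
}

/* Steps 1 and 3: the caller has already measured the record (step 1),
 * so claim space and copy the bytes into the claimed region (step 3). */
static int emit(const void *data, uint32_t len)
{
    uint32_t off = reserve(len);
    if (off == UINT32_MAX)
        return 0;
    memcpy(buf + off, data, len);
    return 1;
}
```

Because each work-item owns the offset range it claimed, concurrent
writers never overlap even when the copies themselves are unordered.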


> It's probably horrible, and may have to be re-written from scratch to
> actually work on a GPU, but it may be a start :)
> Thanks,
> Aaron
> > Thanks,
> > Yang
> > _______________________________________________
> > Libclc-dev mailing list
> > Libclc-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
> > 
