[Libclc-dev] printf (Re: [PATCH 1/2] Make image builtins r600/llvm-3.9 only)
Jan Vesely via Libclc-dev
libclc-dev at lists.llvm.org
Tue Oct 10 15:00:04 PDT 2017
On Tue, 2017-10-10 at 22:58 +0200, Jeroen Ketema via Libclc-dev wrote:
> > On 10 Oct 2017, at 20:57, Aaron Watry via Libclc-dev <libclc-dev at lists.llvm.org> wrote:
> >
> > On Tue, Oct 10, 2017 at 1:16 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> > > On Tue, 2017-10-10 at 10:58 -0500, Aaron Watry via Libclc-dev wrote:
> > > > On Tue, Oct 10, 2017 at 9:25 AM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> > > > > On Mon, 2017-10-09 at 21:28 -0500, Aaron Watry via Libclc-dev wrote:
> > > > > > On Sun, Oct 8, 2017 at 1:23 PM, Jan Vesely via Libclc-dev
> > > > > > <libclc-dev at lists.llvm.org> wrote:
> > > > > > > On Sun, 2017-10-01 at 14:15 -0400, Jan Vesely wrote:
> > > > > > > > The implementation uses r600 specific intrinsics
> > > > > > > > LLVM-4 generates calls to functions using _ro_t and _rw_t image types
> > > > > > > > Portions of the code can be moved back as more targets/llvm versions add image support
> > > > > > > >
> > > > > > > > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > > > > > > > ---
> > > > > > > > AMDGPUOpenCLImageTypeLoweringPass relies on old style of kernel metadata
> > > > > > > > passing, so I'm pretty sure this does not work in 3.9 either, but it at
> > > > > > > > least does not generate external calls.
> > > > > > > > I'd like to keep the code around as a reference when someone starts
> > > > > > > > resurrecting image builtins.
> > > > > > >
> > > > > > > ping. this is just a code cleanup. No change in functionality.
> > > > > >
> > > > > > Looks ok to me. I agree it would be nice to resurrect image support
> > > > > > in newer llvm, so lets keep this around for now.
> > > > >
> > > > > thanks. Does it apply to 2/2 as well?
> > > >
> > > > Yes, it does. I just manually tested the check_external_calls.sh
> > > > script against both r600 and pitcairn's built libs/aliases on LLVM
> > > > 6.0, and the image calls are now gone. I don't have a LLVM 3.9 install
> > > > around at the moment to test my BARTS with, unfortunately. Maybe I'll
> > > > build that release at some point and install it in its own prefix.
> > >
> > > oh, I should have mentioned that it's in the images branch here:
> > > https://travis-ci.org/jvesely/libclc/branches
> > >
> > > I was considering to ask the github llvm-mirror person to enable travis
> > > an forward email to the list, or maybe setup our own mirror with CI
> > > enabled. or maybe add libclc to official llvm build farm.
> > >
> > > >
> > > > >
> > > > > >
> > > > > > If nothing else, it might give me some ideas on how to make an
> > > > > > implicit global buffer available for the printf implementation.
> > > > >
> > > > > I don't think images are a good model for that. It'd be nicer to use
> > > > > standard implicit parameter passing on clover side and
> > > > > __buitltin_implicitarg_ptr on libclc side, without any llvm changes.
> > > > > (It'd be even nicer if NDRange arguments were converted first, so they
> > > > > are all together in one place)
> > > > >
> > > >
> > > > Yeah, I have been considering the implicit parameter passing that
> > > > clover does, and had been leaning towards attempting to do that with a
> > > > > =1MB buffer for printf output. The other thing that will probably be
> > > >
> > > > needed is an atomic counter to keep track of our current location in
> > > > that buffer when printing.
> > > >
> > > > I'm a fair bit of the way done with having a clean-slate pure-C
> > > > implementation of printf done that matches the CLC requirements that
> > > > I've just been testing for identical output against my system printf.
> > > > I'll probably post that as an RFC in its pure C form before converting
> > > > it to CLC code just to save myself a bit of refactoring effort later
> > > > if needed. Once that's done, I'll take a stab at converting it to
> > > > CLC, probably have to bug-fix clang to enable varargs function calls
> > > > specifically for printf (CL 2.0 spec, section 6.9 part e specifies
> > > > that variadic functions/macros are only allowed for
> > > > printf/enqueue_kernel), which I don't believe it allows yet, and then
> > > > I'll get the implicit parameter passing set up for clover (hopefully).
> > >
> > > I don't think you need to implement full printf in libclc. It should be
> > > enough to dump format string and arguments
>
> There’s not even a reason to dump the format string in the buffer, as per
> the OpenCL specification it’s known at compile time what the format
> string is going to be.
we could get away with a pointer to the format string. but it might be
more difficult for clover the figure out the pointers if we want to use
device pointers.
>
> > > into the buffer and let
> > > clover to the ugly part. printf code doesn't sound like something that
> > > would run efficiently on GPUs anyway.
> >
> > I considered forwarding the format spec and arguments to clover, but I
> > figured that we would need a custom printf implementation given that
> > CLC's printf supports the vector length specifier (and hl length) that
> > standard printf doesn't anyway, and that CLC doesn't support the full
> > set of printf specifiers that normal C allows.
agreed. it'd just be more efficient to implement custom printf in
clover than libclc (+you can use all the C++ help)
> >
> > The same comment from my previous mail about enabling varargs calls in
> > CLC for printf still would apply, at which point the real question is
>
> I don’t get the argument about varargs. Contrary to the situation on a host
> system both the compiler and printf library function are under your control
> here. Hence, you can use the compiler to get rid of any vararg function
> calls before they ever reach the hardware.
varargs should be handled by clang. based on
test/SemaOpenCL/builtin.cl, I'd expect that setting cl-1.2 should be
enough to get it working to get rid of clang complaints (we currently
build libclc using cl-1.0)
>
> > performance, and how we'd deal with sending variable length arguments
> > back to clover in a buffer (probably just a giant byte array?). I'm
> > not really convinced that printf is a performance critical function.
> > There may also be complexities involved in forwarding the printf
> > argument list back to clover in the cases where there are more
> > arguments supplied to the printf call than the format calls for.
> >
> > In this case, I figured that we at least needed a global buffer (and
> > output size variable) that was passed implicitly by clover to libclc
> > to dump the print output in, at which point I think we can just get
> > away with an atomically incremented/forwarded position variable for
> > where in the buffer the cursor currently is so that multiple threads
> > can output to the buffer in parallel.
>
> Be careful here: OpenCL 1.2 does not guarantee that atomic
> operations work across work-groups. Hence, you’re going
> to need a buffer per work-group.
Do you have a reference for this? CLC2 introduced scoped atomic
operations, but cl-1.0 says that "These transactions are atomic for the
device executing these atomic functions." which, I think can be
interpreted as device scope.
GCN atomics are definitely global (performed in L2 cache). afaik the
same is true for EG/CM, but I'd need to double check that color buffer
works as expected. I don't know about nvidia.
>
> Jeroen
>
> >
> > It honestly seems easier to do this in raw CLC than to pass the
> > contents back to clover to run through a custom printf function
> > anyway. But what I have is currently in C, so I guess we could
> > consider it...
you can even implement both a keep the one that performs better :)
Jan
> >
> > --Aaron
> >
> > >
> > > Jan
> > >
> > > >
> > > > --Aaron
> > > >
> > > > > Jan
> > > > >
> > > > > >
> > > > > > --Aaron
> > > > > >
> > > > > > >
> > > > > > > Jan
> > > > > > >
> > > > > > > >
> > > > > > > > Jan
>
> _______________________________________________
> Libclc-dev mailing list
> Libclc-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://lists.llvm.org/pipermail/libclc-dev/attachments/20171010/8eb0f47f/attachment.sig>
More information about the Libclc-dev
mailing list