[Libclc-dev] [PATCH 1/2] Make image builtins r600/llvm-3.9 only
Aaron Watry via Libclc-dev
libclc-dev at lists.llvm.org
Tue Oct 10 11:57:25 PDT 2017
On Tue, Oct 10, 2017 at 1:16 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> On Tue, 2017-10-10 at 10:58 -0500, Aaron Watry via Libclc-dev wrote:
>> On Tue, Oct 10, 2017 at 9:25 AM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
>> > On Mon, 2017-10-09 at 21:28 -0500, Aaron Watry via Libclc-dev wrote:
>> > > On Sun, Oct 8, 2017 at 1:23 PM, Jan Vesely via Libclc-dev
>> > > <libclc-dev at lists.llvm.org> wrote:
>> > > > On Sun, 2017-10-01 at 14:15 -0400, Jan Vesely wrote:
>> > > > > The implementation uses r600 specific intrinsics
>> > > > > LLVM-4 generates calls to functions using _ro_t and _rw_t image types
>> > > > > Portions of the code can be moved back as more targets/llvm versions add image support
>> > > > >
>> > > > > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
>> > > > > ---
>> > > > > AMDGPUOpenCLImageTypeLoweringPass relies on old style of kernel metadata
>> > > > > passing, so I'm pretty sure this does not work in 3.9 either, but it at
>> > > > > least does not generate external calls.
>> > > > > I'd like to keep the code around as a reference when someone starts
>> > > > > resurrecting image builtins.
>> > > >
>> > > > ping. this is just a code cleanup. No change in functionality.
>> > >
>> > > Looks ok to me. I agree it would be nice to resurrect image support
>> > > in newer llvm, so lets keep this around for now.
>> >
>> > thanks. Does it apply to 2/2 as well?
>>
>> Yes, it does. I just manually tested the check_external_calls.sh
>> script against both r600 and pitcairn's built libs/aliases on LLVM
>> 6.0, and the image calls are now gone. I don't have a LLVM 3.9 install
>> around at the moment to test my BARTS with, unfortunately. Maybe I'll
>> build that release at some point and install it in its own prefix.
>
> oh, I should have mentioned that it's in the images branch here:
> https://travis-ci.org/jvesely/libclc/branches
>
> I was considering to ask the github llvm-mirror person to enable travis
> an forward email to the list, or maybe setup our own mirror with CI
> enabled. or maybe add libclc to official llvm build farm.
>
>>
>> >
>> > >
>> > > If nothing else, it might give me some ideas on how to make an
>> > > implicit global buffer available for the printf implementation.
>> >
>> > I don't think images are a good model for that. It'd be nicer to use
>> > standard implicit parameter passing on clover side and
>> > __buitltin_implicitarg_ptr on libclc side, without any llvm changes.
>> > (It'd be even nicer if NDRange arguments were converted first, so they
>> > are all together in one place)
>> >
>>
>> Yeah, I have been considering the implicit parameter passing that
>> clover does, and had been leaning towards attempting to do that with a
>> > =1MB buffer for printf output. The other thing that will probably be
>>
>> needed is an atomic counter to keep track of our current location in
>> that buffer when printing.
>>
>> I'm a fair bit of the way done with having a clean-slate pure-C
>> implementation of printf done that matches the CLC requirements that
>> I've just been testing for identical output against my system printf.
>> I'll probably post that as an RFC in its pure C form before converting
>> it to CLC code just to save myself a bit of refactoring effort later
>> if needed. Once that's done, I'll take a stab at converting it to
>> CLC, probably have to bug-fix clang to enable varargs function calls
>> specifically for printf (CL 2.0 spec, section 6.9 part e specifies
>> that variadic functions/macros are only allowed for
>> printf/enqueue_kernel), which I don't believe it allows yet, and then
>> I'll get the implicit parameter passing set up for clover (hopefully).
>
> I don't think you need to implement full printf in libclc. It should be
> enough to dump format string and arguments into the buffer and let
> clover to the ugly part. printf code doesn't sound like something that
> would run efficiently on GPUs anyway.
I considered forwarding the format spec and arguments to clover, but I
figured that we would need a custom printf implementation given that
CLC's printf supports the vector length specifier (and hl length) that
standard printf doesn't anyway, and that CLC doesn't support the full
set of printf specifiers that normal C allows.
The same comment from my previous mail about enabling varargs calls in
CLC for printf still would apply, at which point the real question is
performance, and how we'd deal with sending variable length arguments
back to clover in a buffer (probably just a giant byte array?). I'm
not really convinced that printf is a performance critical function.
There may also be complexities involved in forwarding the printf
argument list back to clover in the cases where there are more
arguments supplied to the printf call than the format calls for.
In this case, I figured that we at least needed a global buffer (and
output size variable) that was passed implicitly by clover to libclc
to dump the print output in, at which point I think we can just get
away with an atomically incremented/forwarded position variable for
where in the buffer the cursor currently is so that multiple threads
can output to the buffer in parallel.
It honestly seems easier to do this in raw CLC than to pass the
contents back to clover to run through a custom printf function
anyway. But what I have is currently in C, so I guess we could
consider it...
--Aaron
>
> Jan
>
>>
>> --Aaron
>>
>> > Jan
>> >
>> > >
>> > > --Aaron
>> > >
>> > > >
>> > > > Jan
>> > > >
>> > > > >
>> > > > > Jan
>> > > > >
>> > > > > amdgpu/lib/SOURCES | 14 --------------
>> > > > > generic/lib/SOURCES | 1 -
>> > > > > r600/lib/SOURCES_3.9 | 15 +++++++++++++++
>> > > > > {amdgpu => r600}/lib/image/get_image_attributes_impl.ll | 0
>> > > > > {amdgpu => r600}/lib/image/get_image_channel_data_type.cl | 0
>> > > > > {amdgpu => r600}/lib/image/get_image_channel_order.cl | 0
>> > > > > {amdgpu => r600}/lib/image/get_image_depth.cl | 0
>> > > > > {generic => r600}/lib/image/get_image_dim.cl | 0
>> > > > > {amdgpu => r600}/lib/image/get_image_height.cl | 0
>> > > > > {amdgpu => r600}/lib/image/get_image_width.cl | 0
>> > > > > {amdgpu => r600}/lib/image/read_image_impl.ll | 0
>> > > > > {amdgpu => r600}/lib/image/read_imagef.cl | 0
>> > > > > {amdgpu => r600}/lib/image/read_imagei.cl | 0
>> > > > > {amdgpu => r600}/lib/image/read_imageui.cl | 0
>> > > > > {amdgpu => r600}/lib/image/write_image_impl.ll | 0
>> > > > > {amdgpu => r600}/lib/image/write_imagef.cl | 0
>> > > > > {amdgpu => r600}/lib/image/write_imagei.cl | 0
>> > > > > {amdgpu => r600}/lib/image/write_imageui.cl | 0
>> > > > > 18 files changed, 15 insertions(+), 15 deletions(-)
>> > > > > create mode 100644 r600/lib/SOURCES_3.9
>> > > > > rename {amdgpu => r600}/lib/image/get_image_attributes_impl.ll (100%)
>> > > > > rename {amdgpu => r600}/lib/image/get_image_channel_data_type.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/get_image_channel_order.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/get_image_depth.cl (100%)
>> > > > > rename {generic => r600}/lib/image/get_image_dim.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/get_image_height.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/get_image_width.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/read_image_impl.ll (100%)
>> > > > > rename {amdgpu => r600}/lib/image/read_imagef.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/read_imagei.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/read_imageui.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/write_image_impl.ll (100%)
>> > > > > rename {amdgpu => r600}/lib/image/write_imagef.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/write_imagei.cl (100%)
>> > > > > rename {amdgpu => r600}/lib/image/write_imageui.cl (100%)
>> > > > >
>> > > > > diff --git a/amdgpu/lib/SOURCES b/amdgpu/lib/SOURCES
>> > > > > index 4414621..ce5fe66 100644
>> > > > > --- a/amdgpu/lib/SOURCES
>> > > > > +++ b/amdgpu/lib/SOURCES
>> > > > > @@ -1,16 +1,2 @@
>> > > > > math/nextafter.cl
>> > > > > math/sqrt.cl
>> > > > > -image/get_image_width.cl
>> > > > > -image/get_image_height.cl
>> > > > > -image/get_image_depth.cl
>> > > > > -image/get_image_channel_data_type.cl
>> > > > > -image/get_image_channel_order.cl
>> > > > > -image/get_image_attributes_impl.ll
>> > > > > -image/read_imagef.cl
>> > > > > -image/read_imagei.cl
>> > > > > -image/read_imageui.cl
>> > > > > -image/read_image_impl.ll
>> > > > > -image/write_imagef.cl
>> > > > > -image/write_imagei.cl
>> > > > > -image/write_imageui.cl
>> > > > > -image/write_image_impl.ll
>> > > > > diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES
>> > > > > index f919bc7..12f1271 100644
>> > > > > --- a/generic/lib/SOURCES
>> > > > > +++ b/generic/lib/SOURCES
>> > > > > @@ -167,4 +167,3 @@ shared/vload.cl
>> > > > > shared/vstore.cl
>> > > > > workitem/get_global_id.cl
>> > > > > workitem/get_global_size.cl
>> > > > > -image/get_image_dim.cl
>> > > > > diff --git a/r600/lib/SOURCES_3.9 b/r600/lib/SOURCES_3.9
>> > > > > new file mode 100644
>> > > > > index 0000000..a44a9ce
>> > > > > --- /dev/null
>> > > > > +++ b/r600/lib/SOURCES_3.9
>> > > > > @@ -0,0 +1,15 @@
>> > > > > +image/get_image_dim.cl
>> > > > > +image/get_image_width.cl
>> > > > > +image/get_image_height.cl
>> > > > > +image/get_image_depth.cl
>> > > > > +image/get_image_channel_data_type.cl
>> > > > > +image/get_image_channel_order.cl
>> > > > > +image/get_image_attributes_impl.ll
>> > > > > +image/read_imagef.cl
>> > > > > +image/read_imagei.cl
>> > > > > +image/read_imageui.cl
>> > > > > +image/read_image_impl.ll
>> > > > > +image/write_imagef.cl
>> > > > > +image/write_imagei.cl
>> > > > > +image/write_imageui.cl
>> > > > > +image/write_image_impl.ll
>> > > > > diff --git a/amdgpu/lib/image/get_image_attributes_impl.ll b/r600/lib/image/get_image_attributes_impl.ll
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/get_image_attributes_impl.ll
>> > > > > rename to r600/lib/image/get_image_attributes_impl.ll
>> > > > > diff --git a/amdgpu/lib/image/get_image_channel_data_type.cl b/r600/lib/image/get_image_channel_data_type.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/get_image_channel_data_type.cl
>> > > > > rename to r600/lib/image/get_image_channel_data_type.cl
>> > > > > diff --git a/amdgpu/lib/image/get_image_channel_order.cl b/r600/lib/image/get_image_channel_order.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/get_image_channel_order.cl
>> > > > > rename to r600/lib/image/get_image_channel_order.cl
>> > > > > diff --git a/amdgpu/lib/image/get_image_depth.cl b/r600/lib/image/get_image_depth.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/get_image_depth.cl
>> > > > > rename to r600/lib/image/get_image_depth.cl
>> > > > > diff --git a/generic/lib/image/get_image_dim.cl b/r600/lib/image/get_image_dim.cl
>> > > > > similarity index 100%
>> > > > > rename from generic/lib/image/get_image_dim.cl
>> > > > > rename to r600/lib/image/get_image_dim.cl
>> > > > > diff --git a/amdgpu/lib/image/get_image_height.cl b/r600/lib/image/get_image_height.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/get_image_height.cl
>> > > > > rename to r600/lib/image/get_image_height.cl
>> > > > > diff --git a/amdgpu/lib/image/get_image_width.cl b/r600/lib/image/get_image_width.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/get_image_width.cl
>> > > > > rename to r600/lib/image/get_image_width.cl
>> > > > > diff --git a/amdgpu/lib/image/read_image_impl.ll b/r600/lib/image/read_image_impl.ll
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/read_image_impl.ll
>> > > > > rename to r600/lib/image/read_image_impl.ll
>> > > > > diff --git a/amdgpu/lib/image/read_imagef.cl b/r600/lib/image/read_imagef.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/read_imagef.cl
>> > > > > rename to r600/lib/image/read_imagef.cl
>> > > > > diff --git a/amdgpu/lib/image/read_imagei.cl b/r600/lib/image/read_imagei.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/read_imagei.cl
>> > > > > rename to r600/lib/image/read_imagei.cl
>> > > > > diff --git a/amdgpu/lib/image/read_imageui.cl b/r600/lib/image/read_imageui.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/read_imageui.cl
>> > > > > rename to r600/lib/image/read_imageui.cl
>> > > > > diff --git a/amdgpu/lib/image/write_image_impl.ll b/r600/lib/image/write_image_impl.ll
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/write_image_impl.ll
>> > > > > rename to r600/lib/image/write_image_impl.ll
>> > > > > diff --git a/amdgpu/lib/image/write_imagef.cl b/r600/lib/image/write_imagef.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/write_imagef.cl
>> > > > > rename to r600/lib/image/write_imagef.cl
>> > > > > diff --git a/amdgpu/lib/image/write_imagei.cl b/r600/lib/image/write_imagei.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/write_imagei.cl
>> > > > > rename to r600/lib/image/write_imagei.cl
>> > > > > diff --git a/amdgpu/lib/image/write_imageui.cl b/r600/lib/image/write_imageui.cl
>> > > > > similarity index 100%
>> > > > > rename from amdgpu/lib/image/write_imageui.cl
>> > > > > rename to r600/lib/image/write_imageui.cl
>> > > >
>> > > > --
>> > > > Jan Vesely <jan.vesely at rutgers.edu>
>> > > > _______________________________________________
>> > > > Libclc-dev mailing list
>> > > > Libclc-dev at lists.llvm.org
>> > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
>> > > >
>> > >
>> > > _______________________________________________
>> > > Libclc-dev mailing list
>> > > Libclc-dev at lists.llvm.org
>> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
>>
>> _______________________________________________
>> Libclc-dev mailing list
>> Libclc-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
More information about the Libclc-dev
mailing list