[Libclc-dev] [PATCH] Add vstore_half helpers for ptx
Jan Vesely via Libclc-dev
libclc-dev at lists.llvm.org
Tue Oct 3 22:56:53 PDT 2017
On Tue, 2017-10-03 at 22:37 +0200, Jeroen Ketema via Libclc-dev wrote:
> > On 3 Oct 2017, at 22:20, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> >
> > On Tue, 2017-10-03 at 22:04 +0200, Jeroen Ketema via Libclc-dev wrote:
> > > Hi Jan,
> > >
> > > > On 3 Oct 2017, at 21:57, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> > > >
> > > > On Tue, 2017-10-03 at 20:26 +0200, Jeroen Ketema via Libclc-dev wrote:
> > > > > Index: ptx/lib/SOURCES_4.0
> > > > > ===================================================================
> > > > > --- ptx/lib/SOURCES_4.0 (nonexistent)
> > > > > +++ ptx/lib/SOURCES_4.0 (working copy)
> > > > > @@ -0,0 +1 @@
> > > > > +shared/vstore_half_helpers.ll
> > > > > Index: ptx/lib/SOURCES_5.0
> > > > > ===================================================================
> > > > > --- ptx/lib/SOURCES_5.0 (nonexistent)
> > > > > +++ ptx/lib/SOURCES_5.0 (working copy)
> > > >
> > > > you probably need SOURCES_3.9 as well.
> > > > or add a comment why it's not needed.
> > >
> > > Yes, if we’re supporting that. I don’t know what the current policy or status is.
> >
> > llvm-3.9 support was restored last week. I don't have a testing setup
> > but the generated library is sane (there's even a travis CI to check
> > that).
>
> Ok, I’ll add that. Was there any particular reason it was restored? In the past
> libclc generally only supported the tip of the Clang/LLVM tree.
A couple of reasons:
 * I think it has always been mostly 'whatever is convenient', and adding
   back one version was easy enough.
 * clover recently decided to support llvm-3.9+ (to match mesa's amdgpu
   llvm requirements).
 * The last libclc version that worked with llvm-3.9 was rather
   incomplete, especially now that we are within 30 functions of complete
   OpenCL 1.1 support (and, afaik, only printf is missing for 1.2).
What llvm versions we want to support going forward is Tom's decision.
I think it'd be nice to stay aligned with clover.
>
> >
> > >
> > > >
> > > > > @@ -0,0 +1 @@
> > > > > +shared/vstore_half_helpers.ll
> > > > > Index: ptx/lib/shared/vstore_half_helpers.ll
> > > > > ===================================================================
> > > > > --- ptx/lib/shared/vstore_half_helpers.ll (nonexistent)
> > > > > +++ ptx/lib/shared/vstore_half_helpers.ll (working copy)
> > > >
> > > > can you add datalayout, or would it prevent sharing the file between
> > > > ptx and ptx64?
> > >
> > > That would prevent sharing. I wonder if we might come up with some kind of solution
> > > that adds the data layout at configuration or compilation time?
> >
> > I was considering passing .ll files through the C preprocessor (clang -E
> > with the same options as .cl compilation), so that you can use things
> > like #ifdef and #include.
> > It should not be a lot of work, and it'd allow us to share .ll files
> > without linker complaints.
>
> I’m a bit hesitant about (ab)using the preprocessor. Although this is a simple
> use case, I’ve seen more advanced uses lead to problems in the past, e.g.,
> GHC at some point stopped working on macOS, because
> clang’s preprocessor diverged from gcc’s.
We'd need some kind of preprocessing if we want to share the same file.
Using something already available is the easiest way out.
The alternative is to have separate files per target.
I don't think it's worth spending much effort on, especially if .ll
files are mainly used to provide support for older llvm.
At any rate, this patch does not need to wait for any such solution.
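
For illustration only (not part of this patch), a minimal sketch of what a
preprocessed, shared .ll file might look like. The macro name CLC_PTX64 is
hypothetical and would have to be passed by the build (e.g. with -D). The
datalayout strings are assumed to match what the NVPTX backend uses for these
llvm versions and should be checked against clang's own output for each target:

```llvm
; Hypothetical sketch: run through the C preprocessor (for example
; clang -E -P -x c) before assembling. CLC_PTX64 is an assumed macro
; supplied by the build, not an existing one.
#ifdef CLC_PTX64
target datalayout = "e-i64:64-v16:16-v32:32-n16:32:64"
#else
target datalayout = "e-p:32:32-i64:64-v16:16-v32:32-n16:32:64"
#endif

; One of the helpers from the patch, unchanged.
define void @__clc_vstore_half_float_helper__global(float %data, half addrspace(1)* nocapture %ptr) nounwind alwaysinline {
  %res = fptrunc float %data to half
  store half %res, half addrspace(1)* %ptr
  ret void
}
```

The helper bodies stay identical for ptx and ptx64; only the datalayout line
differs, which is why a single conditional at the top of the file would be
enough in this scheme.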
>
> >
> > >
> > > > I assume GENERIC == private and SHARED == local?
> > >
> > > Correct.
> >
> > just out of curiosity. what's your use case for nvptx libclc?
>
> GPUVerify: http://multicore.doc.ic.ac.uk/tools/GPUVerify/
>
> We use clang to compile CUDA and OpenCL down to bitcode for the NVPTX
> target (linking with libclc in the latter case) before feeding it to the tool proper.
Interesting, nice to see the nvptx part being useful.
Thanks for sharing.
I skipped nvptx when cleaning up the barrier builtin;
ptx-nvidiacl/lib/synchronization/barrier.cl looks broken in a way that
would interfere with at least barrier divergence analysis.
regards,
Jan
>
> Jeroen
>
> >
> > Jan
> >
> > >
> > > Thanks for the review,
> > >
> > > Jeroen
> > >
> > > >
> > > > other than that
> > > > Reviewed-by: Jan Vesely <jan.vesely at rutgers.edu>
> > > >
> > > > Jan
> > > >
> > > > > @@ -0,0 +1,35 @@
> > > > > +define void @__clc_vstore_half_float_helper__private(float %data, half addrspace(0)* nocapture %ptr) nounwind alwaysinline {
> > > > > + %res = fptrunc float %data to half
> > > > > + store half %res, half addrspace(0)* %ptr
> > > > > + ret void
> > > > > +}
> > > > > +
> > > > > +define void @__clc_vstore_half_float_helper__global(float %data, half addrspace(1)* nocapture %ptr) nounwind alwaysinline {
> > > > > + %res = fptrunc float %data to half
> > > > > + store half %res, half addrspace(1)* %ptr
> > > > > + ret void
> > > > > +}
> > > > > +
> > > > > +define void @__clc_vstore_half_float_helper__local(float %data, half addrspace(3)* nocapture %ptr) nounwind alwaysinline {
> > > > > + %res = fptrunc float %data to half
> > > > > + store half %res, half addrspace(3)* %ptr
> > > > > + ret void
> > > > > +}
> > > > > +
> > > > > +define void @__clc_vstore_half_double_helper__private(double %data, half addrspace(0)* nocapture %ptr) nounwind alwaysinline {
> > > > > + %res = fptrunc double %data to half
> > > > > + store half %res, half addrspace(0)* %ptr
> > > > > + ret void
> > > > > +}
> > > > > +
> > > > > +define void @__clc_vstore_half_double_helper__global(double %data, half addrspace(1)* nocapture %ptr) nounwind alwaysinline {
> > > > > + %res = fptrunc double %data to half
> > > > > + store half %res, half addrspace(1)* %ptr
> > > > > + ret void
> > > > > +}
> > > > > +
> > > > > +define void @__clc_vstore_half_double_helper__local(double %data, half addrspace(3)* nocapture %ptr) nounwind alwaysinline {
> > > > > + %res = fptrunc double %data to half
> > > > > + store half %res, half addrspace(3)* %ptr
> > > > > + ret void
> > > > > +}
> > > > >
> > > > > _______________________________________________
> > > > > Libclc-dev mailing list
> > > > > Libclc-dev at lists.llvm.org
> > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
> > > >
> > > > --
> > > > Jan Vesely <jan.vesely at rutgers.edu>
> > >
> >
> > --
> > Jan Vesely <jan.vesely at rutgers.edu>
>
--
Jan Vesely <jan.vesely at rutgers.edu>