[Openmp-dev] Experimental SPIR-V back-end using OpenCL 2.1

Fri Oct 20 14:19:25 PDT 2017

Sorry, "Intel OpenCL runtime" was a little imprecise.
I meant the Intel® SDK for OpenCL™ Applications 2017<https://software.intel.com/en-us/whats-new-code-builder-2017-r1> for CPU only. (I just noticed that a newer version has been released.)
For integrated GPUs, there is Beignet with, I think, OpenCL 2.0 support. I don’t know when OpenCL 2.1 will be supported,
but once it is, this should probably work. I don’t have an Apple computer, but it seems they are stuck with OpenCL 1.2.

Kind regards,
Daniel

From: Jeff Hammond<mailto:jeff.science at gmail.com>
Sent: Friday, October 20, 2017 22:05
To: Schürmann, Daniel<mailto:daniel.schuermann at campus.tu-berlin.de>
Cc: Jonas Hahnfeld<mailto:hahnjo at hahnjo.de>; openmp-dev at lists.llvm.org<mailto:openmp-dev at lists.llvm.org>
Subject: Re: [Openmp-dev] Experimental SPIR-V back-end using OpenCL 2.1

"Unfortunately, the Intel OpenCL runtime started to segfault with the change to generic address space."

Does this mean that it works (or at least worked) prior to this change?  Lots of folks would love to be able to use OpenMP 4.5 with Intel OpenCL for the integrated GPUs in Intel processors.

Have you tried Apple OpenCL?  That's another very popular implementation where OpenMP 4.5 would be highly desirable.

Jeff

On Fri, Oct 20, 2017 at 9:40 AM, Daniel Schürmann via Openmp-dev <openmp-dev at lists.llvm.org<mailto:openmp-dev at lists.llvm.org>> wrote:
Hi Jonas,

yes, other OpenCL runtimes are blocked out at the moment.
I would like to just test for "OpenCL 2.1", but that would block the AMD runtime. (I hope this will change in the future.)
It is the proprietary AMD OpenCL SDK, part of the AMDGPU-Pro driver. ROCm states that it only supports an OpenCL 1.2 runtime, but it might be worth a test.
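
As a rough illustration of the version test discussed here, the sketch below parses a platform version string of the form "OpenCL <major>.<minor> <vendor info>" (the layout OpenCL specifies for CL_PLATFORM_VERSION) instead of comparing against one fixed vendor string. The function name opencl_version_at_least is hypothetical and is not taken from the plugin. Note that, as described in this thread, the AMD runtime reports 2.0, so a strict 2.1 check would reject it even though it accepts SPIR-V.

```c
#include <stdio.h>

/* Hypothetical helper (not part of libomptarget): returns 1 if a
 * CL_PLATFORM_VERSION-style string reports at least the requested
 * OpenCL version, 0 otherwise or if the string cannot be parsed. */
static int opencl_version_at_least(const char *version,
                                   int want_major, int want_minor)
{
    int major = 0, minor = 0;

    if (version == NULL ||
        sscanf(version, "OpenCL %d.%d", &major, &minor) != 2)
        return 0;

    return major > want_major ||
           (major == want_major && minor >= want_minor);
}
```

With this helper, opencl_version_at_least("OpenCL 2.0 AMD", 2, 1) is false while opencl_version_at_least("OpenCL 2.0 AMD", 2, 0) is true, so a runtime like AMD's would still need a separate exception.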

Part of SPIRV-Tools is a validator, but this may be too much overhead. You mean checking the magic number? Good idea!
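
A minimal sketch of the magic-number check mentioned here, assuming the device image is handed over as a raw byte buffer. SPIR-V modules are not ELF files, so there is no machine id to inspect; instead, every module begins with the 32-bit word 0x07230203. The helper name looks_like_spirv is hypothetical, and this is only a cheap plausibility test, not a replacement for the SPIRV-Tools validator.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SPIR-V magic number: the first 32-bit word of every module. */
#define SPIRV_MAGIC 0x07230203u

/* Hypothetical helper (not part of libomptarget): returns 1 if the
 * image starts with the SPIR-V magic number in either byte order,
 * 0 otherwise. */
static int looks_like_spirv(const void *image, size_t size)
{
    uint32_t word;

    if (image == NULL || size < sizeof word)
        return 0;

    /* memcpy avoids an unaligned 32-bit load from the buffer. */
    memcpy(&word, image, sizeof word);
    if (word == SPIRV_MAGIC)
        return 1;

    /* The module may have been written with the opposite endianness. */
    uint32_t swapped = ((word & 0x000000FFu) << 24) |
                       ((word & 0x0000FF00u) << 8)  |
                       ((word & 0x00FF0000u) >> 8)  |
                       ((word & 0xFF000000u) >> 24);
    return swapped == SPIRV_MAGIC;
}
```

Something like this could be called from __tgt_rtl_is_valid_binary with the start address and size of the image; how those are obtained from the image descriptor is left out here.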

Thanks,
Daniel

On 10/20/2017 06:23 PM, Jonas Hahnfeld wrote:
Hi Daniel,

interesting work!

Two questions:
 - With the latest commit, you perform a strncmp with "OpenCL 2.0 AMD". Does this mean that all other OpenCL implementations are effectively blocked out?
 - Is this the proprietary AMD OpenCL SDK or the "new" ROCm stack? https://rocm.github.io/

One remark: For __tgt_rtl_is_valid_binary: Does SPIR-V have its own machine id? That's how the CUDA plugin detects compatible binaries...

Cheers,
Jonas

On 2017-10-20 11:24, Daniel Schürmann via Openmp-dev wrote:
Thanks for all the help!

These benchmarks, especially lulesh, showed me that my approach of
setting the address spaces manually according to their scope doesn't
work(*) and that I have to use the generic address space
like the nvptx backend does.

Now with this much more robust version, I decided to make my project
public:

My clang fork is available at
https://github.com/daniel-schuermann/clang

and the openmp runtime fork at
https://github.com/daniel-schuermann/openmp

and the necessary llvm fork (for generating SPIR-V) is from
https://github.com/thewilsonator/llvm/tree/compute

All 3 of them might need a pull from upstream as they are not always
synced.

libomptarget-spir needs an OpenCL runtime which supports SPIRV kernels
(with OpenCL 2.1 headers).
Unfortunately, the Intel OpenCL runtime started to segfault with the
change to generic address space.
Therefore, the only working OpenCL runtime I know of is AMDGPU-Pro.

The following pragmas should work for now:
#target (enter/exit data)
#teams
#distribute / parallel for
#master
#barrier

as well as the clauses:
map, shared, private, firstprivate, lastprivate, schedule
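
To illustrate how the constructs and clauses listed above combine, here is a small sketch of an offloaded loop. The function saxpy is a made-up example and has not been tested against this back-end; it only uses directives and clauses from the lists above. Built without OpenMP support, the pragma is ignored and the loop runs on the host.

```c
#include <stddef.h>

/* Illustrative example: y[i] = a * x[i] + y[i].
 * x is copied to the device, y is copied in both directions,
 * and the scalar a is passed as firstprivate. */
static void saxpy(float a, const float *x, float *y, int n)
{
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n]) firstprivate(a) schedule(static)
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```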

A device runtime is not planned for the moment (I also don't know how
to compile and link OpenCL sources into libomptarget), but the
generated code works independently of runtime functions.

For those more adventurous, you can (try to) build your sources with
-fopenmp -fopenmp-targets=spir64-unknown-unknown
For C++, I recommend adding -fno-exceptions and -O0, as there seem to
be optimizer passes enabled
that don't work for spir.

I would be very thankful for some feedback (I hope it won't get too
depressing).
Although I am not able to accept pull requests at the moment for legal
reasons,
I would welcome any hint to make the implementation more robust and
complete
as well as statements about (not) working programs.
(no comments on code style please, this can be fixed later °°)

Kind regards,
Daniel

(*) While this is a valid program snippet,

#pragma omp target map(to:a[0:n])
{ int * b = a; }

it doesn't work if |a| is a pointer to addrSpace(1) (cl_global) and
|b| gets allocated as pointer to cl_private.
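
A slightly fuller version of the snippet above, as a sketch: the function sum_via_alias is a made-up example in which a local pointer declared inside the target region aliases the mapped array. The variable b itself is private to the region, but the memory it points to is the mapped (global) buffer, which is why a generic pointee address space is needed for the assignment to be well-typed on the device.

```c
/* Illustrative example: sums a[0..n-1] through a local alias that is
 * declared inside the target region. */
static int sum_via_alias(const int *a, int n)
{
    int sum = 0;

    #pragma omp target map(to: a[0:n]) map(tofrom: sum)
    {
        /* b lives in private storage but points into mapped memory. */
        const int *b = a;
        for (int i = 0; i < n; ++i)
            sum += b[i];
    }
    return sum;
}
```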

_______________________________________________
Openmp-dev mailing list
Openmp-dev at lists.llvm.org<mailto:Openmp-dev at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev

--
Jeff Hammond
jeff.science at gmail.com<mailto:jeff.science at gmail.com>
http://jeffhammond.github.io/