[PATCH] D60455: [SYCL] Add support for SYCL device attributes

Fri Apr 19 05:38:52 PDT 2019

Anastasia added a comment.

> SYCL is similar to OpenMP 5 for C++, where you use only C++ classes instead of #pragma. So it is quite C++-friendlier than OpenMP.

I am not sure what you mean by friendlier. The OpenMP concept is very clear to me: a program written in C/C++ or Fortran can be complemented with simple compiler directives that instruct the compiler about the parallelization. Hence exactly the same program can be used on sequential or parallel architectures. However, I can't imagine anyone using a SYCL program on a non-parallel architecture. To me this is where SYCL is a fundamentally different concept from C++, which has a very different execution model (using very explicit language constructs for parallelism, btw!).
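To illustrate the OpenMP point, here is a minimal sketch (the function name `sum_elements` is made up for this example). The directive is only a hint, so a compiler without OpenMP enabled ignores it and the same source runs sequentially.

```cpp
#include <cstddef>
#include <vector>

// Sum the elements of a vector (hypothetical example function).
// With OpenMP enabled, the loop iterations are shared between threads and
// the partial sums are combined by the reduction clause. Without OpenMP,
// the pragma is ignored and this is an ordinary sequential loop.
long long sum_elements(const std::vector<int>& values) {
    long long total = 0;
    const long long count = static_cast<long long>(values.size());
    #pragma omp parallel for reduction(+ : total)
    for (long long i = 0; i < count; ++i) {
        total += values[static_cast<std::size_t>(i)];
    }
    return total;
}
```

The result is the same whether or not the compiler honors the directive, which is the property I am referring to.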

To me SYCL dictates how a program is to be written, with explicit parallelism constructs, using a special language. The fact that the language doesn't currently use syntax different from standard C++ doesn't mean that it's not there, at least implicitly. If you were able to just reuse C++, it would be a perfectly library-style language, but since you need language extensions in the compiler, it isn't just a pure C++ library to me.

> But that means also there is not the same concept of explicit kernel like in OpenCL or CUDA. In OpenCL or CUDA, when there is a function with a specific attribute, you know it is a kernel and you can compile as such.

I am very confused: if you don't need an explicit kernel construct, why are you adding it here at all? The fact that you don't document it in the spec, yet add it as an explicit attribute in the language to allow implementing the feature, does show that it is actually explicitly required. It is just well hidden behind the C++ library syntax, which however requires activating features that aren't part of ISO C++. Perhaps I am still missing something, but I am worried that we are going to end up with a language that pretends to be a C++ library. I can certainly see that CUDA or OpenCL could just add a layer of C++ libraries on top of their language extensions to provide the same functionality. So I still feel SYCL is closer to CUDA than to pure C++.

> In SYCL or OpenMP you need an outliner that will estimate what should be executed as an heterogeneous kernel, split the code between the host side and the device side, add some glue/stub to implement an RPC between the host and the device, manage potentially some allocation/memory transfers, etc.

But in SYCL this is requested explicitly in the source code using language constructs, isn't it?

> This is quite more complex than compiling OpenCL, CUDA or other graphics shader languages.

I think CUDA still does a fair bit of logic similar to what you describe above, though.

> This is also why, while SYCL is technically pure standard C++, you need some specific compiler magic to do the code massaging to have everything working well between a host and some devices.

This patch is actually extending pure standard C++ to make it less pure. There is nothing magic about it.

> The attribute we discuss here is just an implementation detail to help the coordination between the compiler and the SYCL frontend classes to mark some area to outline, without relying to do some precise pattern matching, allowing more flexibility in the runtime without changing the compiler every time. So while it defines a zone to be outlined as a kernel, it is not really a kernel in the sense of OpenCL.

Can you give some concrete examples of why device outlined functions can't be an OpenCL kernel or functions? What functionality (apart from kernel templates) wouldn't be applicable?
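
As a reference point for this question, here is a minimal sketch of the kind of marked entry point being discussed. All names (`SKETCH_DEVICE_ENTRY`, `kernel_single_task`, `increment_kernel`, `run_increment`) are hypothetical and not taken from the patch. The marker expands to nothing, so this shows only the host-side view, in which the region to outline is an ordinary function template call.

```cpp
// Hypothetical marker. It expands to nothing here, so the code compiles as
// plain ISO C++. A device compiler would presumably map it to a real
// attribute that identifies the function body as the region to outline.
#define SKETCH_DEVICE_ENTRY

// Hypothetical library-side entry point. KernelName is a type used only as
// a tag to give the outlined region a unique name.
template <typename KernelName, typename Func>
SKETCH_DEVICE_ENTRY void kernel_single_task(const Func& kernel_func) {
    kernel_func();  // On the host this is just a direct call.
}

// Tag type for the kernel name; it is declared but never defined.
class increment_kernel;

// Hypothetical user code: the lambda body is what would be outlined.
int run_increment(int input) {
    int result = 0;
    kernel_single_task<increment_kernel>([&] { result = input + 1; });
    return result;
}
```

In this form the marked template looks very much like a templated OpenCL kernel whose arguments are the lambda captures, which is why I am asking what else differs.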

> In triSYCL I made some completely different choices, using late outlining in LLVM and detecting some specific functions such as `cl::sycl::detail::instantiate_kernel<KernelName>()` that defines some stuff I want to outline https://github.com/triSYCL/triSYCL/blob/master/doc/architecture.rst#low-level-view-of-the-device-compiler-workflow
>  For me an attribute was not an option because I wanted to change Clang as little as possible.

Why not continue with this approach? What limitations does it have? Is there something that demonstrates that the language extension is the real solution to this?

> But at the end, I think it is quite more brittle than doing early outlining in Clang as discussed here, which also requires quite more knowledge of Clang than I have. :-)
> 
> So at the end, I think we should use a different keyword from OpenCL or CUDA because the semantics is different.

Overall, I see that this work is now going in a very different direction from what was written in the original RFC.
https://lists.llvm.org/pipermail/cfe-dev/2019-January/060811.html

It was suggested that SYCL builds on top of OpenCL and that therefore most of the functionality can be reused. Maybe the best approach is to restart the RFC, making the new intent and the overall concept very clear, especially the fact that you are going to add a number of extensions to the C++ language. I think C++ developers should be aware that this is going to happen, and they can also help you further with guidance and with aligning on the general C++ development flow.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60455/new/

https://reviews.llvm.org/D60455