[PATCH] D60455: [SYCL] Implement SYCL device code outlining

Tue May 28 03:41:33 PDT 2019

Anastasia added a comment.

In D60455#1518178 <https://reviews.llvm.org/D60455#1518178>, @bader wrote:

> In D60455#1513499 <https://reviews.llvm.org/D60455#1513499>, @Anastasia wrote:
>
> > //**Design question**//: since you are not aware what functions are to be run on a device while parsing them, at what point do you plan to diagnose the invalid behavior from the standard C++ i.e. using function pointers in kernel code is likely to cause issues?
>
>
> We are going to use DeviceDiagBuilder and related infrastructure implemented in Clang to diagnose device side code errors in OpenMP/CUDA modes.
>  More details are in the comments here:
>  https://clang.llvm.org/doxygen/classclang_1_1Sema_1_1DeviceDiagBuilder.html#details

Just a thought, if you parse host code first and provide the device outlining information to the device compilation phase would you then be able to reuse more parsing functionality from OpenCL?

> 
> 
>> Also do you need to outline the data structures too? For example classes used on device are not allowed to have virtual function.
> 
> Yes. This restriction is already implemented in our code base on GitHub.

Cool, is it implemented in `SemaSYCL.cpp` too?

================
Comment at: clang/include/clang/Basic/Attr.td:1017
+  let LangOpts = [SYCL];
+  let Documentation = [Undocumented];
+}
----------------
bader wrote:
> Anastasia wrote:
> > Anastasia wrote:
> > > Fznamznon wrote:
> > > > bader wrote:
> > > > > Anastasia wrote:
> > > > > > Ok, I thought the earlier request was not to add undocumented attributes with the spelling?
> > > > > > 
> > > > > > Also did `__kernel` attribute not work at the end?
> > > > > > 
> > > > > > I can't quite get where the current disconnect comes from but I find it extremely unhelpful.
> > > > > Hi @Anastasia, let me try to help.
> > > > > 
> > > > > > Ok, I thought the earlier request was not to add undocumented attributes with the spelling?
> > > > > 
> > > > > Right. @Fznamznon, could you document `sycl_kernel` attribute, please?
> > > > > 
> > > > > > Also did __kernel attribute not work at the end?
> > > > > 
> > > > > Maria left a comment with the summary of our experiment: https://reviews.llvm.org/D60455#1472705. There is a link to pull request, where @keryell and @agozillon expressed preference to have separate SYCL attributes. Let me copy their feedback here:
> > > > > 
> > > > > @keryell :
> > > > > 
> > > > > > Thank you for the experiment.
> > > > > > That looks like a straight forward change.
> > > > > > The interesting part is that it does not expose any advantage from reusing OpenCL __kernel marker.... So I am not more convinced that it is the way to go, because we would use any other keyword or attribute and it would be the same...
> > > > > > 
> > > > > 
> > > > > @agozillon :
> > > > > 
> > > > > > Just my two cents, I think a separation of concerns and having separate attributes will simplify things long-term.
> > > > > > 
> > > > > > While possibly similar just now, the SYCL specification is evolving and may end up targeting more than just OpenCL. So the semantics of the attributes may end up being quite different, even if at the moment the SYCL attribute is there mostly just to mark something for outlining.
> > > > > > 
> > > > > > If it doesn't then the case for refactoring and merging them in a future patch could be brought up again.
> > > > > 
> > > > > To summarize: we don't have good arguments to justify re-use of OpenCL `__kernel` keyword for SYCL mode requested by @aaron.ballman here https://reviews.llvm.org/D60455#1469150.
> > > > > 
> > > > > > I can't quite get where the current disconnect comes from but I find it extremely unhelpful.
> > > > > 
> > > > > Let me know how I can help here.
> > > > > 
> > > > > Additional note. I've submitted initial version of SYCL compiler design document to the GItHub: https://github.com/intel/llvm/blob/sycl/sycl/doc/SYCL_compiler_and_runtime_design.md. Please, take a look and let me know if you have questions.
> > > > >> Ok, I thought the earlier request was not to add undocumented attributes with the spelling?
> > > > > Right. @Fznamznon, could you document sycl_kernel attribute, please?
> > > > 
> > > > Do we really need add documentation for attribute which is not presented in SYCL spec and used for internal implementation details only because it has spelling?
> > > > 
> > > >         Ok, I thought the earlier request was not to add undocumented attributes with the spelling?
> > > > 
> > > >     Right. @Fznamznon, could you document sycl_kernel attribute, please?
> > > > 
> > > > Do we really need add documentation for attribute which is not presented in SYCL spec and used for internal implementation details only because it has spelling?
> > > 
> > > 
> > > 
> > > You are adding an attribute that is exposed to the programmers that use clang to compile their code, so unless you come up with some way to reject it in the non-toolchain mode it has to be documented. And for clang it will become "hidden" SYCL dialect so absolutely not different to `__kernel`.
> > > 
> > > Another aspect to consider is that clang used `TypePrinter` in diagnostics and even though printing entire function signature is rare it might appear in diagnostics and the programmer should have a way to understand what the "alien" construct is. This is where clang documentation will help.
> > > @keryell :
> > > 
> > >     Thank you for the experiment.
> > >     That looks like a straight forward change.
> > >     The interesting part is that it does not expose any advantage from reusing OpenCL __kernel marker.... So I am not more convinced that it is the way to go, because we would use any other keyword or attribute and it would be the same...
> > 
> > I don't understand how this conclusions are made on incomplete implementation or even just an initial patch.
> > 
> > The kind of analysis I am missing at the moment is whether you would need to add similar logic for `sycl_kernel` as we have now for `__kernel` i.e. did anyone look at the occurrences of kernel handling in the code base to see if it's going to need the same logic or not:
> > 
> > ```
> > include/clang/Basic/Attr.td:    : SubsetSubject<Function, [{S->hasAttr<OpenCLKernelAttr>()}],
> > include/clang/Parse/Parser.h:  void ParseOpenCLKernelAttributes(ParsedAttributes &attrs);
> > lib/AST/Decl.cpp:  if (hasAttr<OpenCLKernelAttr>())
> > lib/AST/Decl.cpp:  if (hasAttr<OpenCLKernelAttr>())
> > lib/AST/Decl.cpp:  if (hasAttr<OpenCLKernelAttr>())
> > lib/CodeGen/CGCall.cpp:  if (TargetDecl && TargetDecl->hasAttr<OpenCLKernelAttr>()) {
> > lib/CodeGen/CodeGenFunction.cpp:  if (!FD->hasAttr<OpenCLKernelAttr>())
> > lib/CodeGen/TargetInfo.cpp:    if (FD->hasAttr<OpenCLKernelAttr>()) {
> > lib/CodeGen/TargetInfo.cpp:    if (FD->hasAttr<OpenCLKernelAttr>()) {
> > lib/CodeGen/TargetInfo.cpp:  return D->hasAttr<OpenCLKernelAttr>() ||
> > lib/CodeGen/TargetInfo.cpp:  if (M.getLangOpts().OpenCL && FD->hasAttr<OpenCLKernelAttr>() &&
> > lib/Parse/ParseDecl.cpp:void Parser::ParseOpenCLKernelAttributes(ParsedAttributes &attrs) {
> > lib/Parse/ParseDecl.cpp:      ParseOpenCLKernelAttributes(DS.getAttributes());
> > lib/Sema/SemaDecl.cpp:        if (FD && !FD->hasAttr<OpenCLKernelAttr>()) {
> > lib/Sema/SemaDecl.cpp:        if (FD && FD->hasAttr<OpenCLKernelAttr>()) {
> > lib/Sema/SemaDecl.cpp:  if (getLangOpts().OpenCL && NewFD->hasAttr<OpenCLKernelAttr>()) {
> > lib/Sema/SemaDecl.cpp:        << FD->hasAttr<OpenCLKernelAttr>();
> > lib/Sema/SemaDecl.cpp:  if (FD->hasAttr<OpenCLKernelAttr>())
> > lib/Sema/SemaDeclAttr.cpp:    handleSimpleAttribute<OpenCLKernelAttr>(S, D, AL);
> > lib/Sema/SemaDeclAttr.cpp:  if (!D->hasAttr<OpenCLKernelAttr>()) {
> > ```
> > I don't mind either way but I would like the decision to be based on the analysis of clang code base please!
> > 
> > 
> > > @agozillon :
> > > 
> > >     Just my two cents, I think a separation of concerns and having separate attributes will simplify things long-term.
> > 
> > This can potentially be a fair point!
> > 
> > > 
> > >     While possibly similar just now, the SYCL specification is evolving and may end up targeting more than just OpenCL. So the semantics of the attributes may end up being quite different, even if at the moment the SYCL attribute is there mostly just to mark something for outlining.
> > 
> > This is really great! But unless you provide concrete information what the evolution is and what exactly you are trying to achieve and how it affect compiler design there is no way to review your patches.
> > 
> > 
> > > 
> > >  Let me know how I can help here.
> > > 
> > > Additional note. I've submitted initial version of SYCL compiler design document to the GItHub: https://github.com/intel/llvm/blob/sycl/sycl/doc/SYCL_compiler_and_runtime_design.md. Please, take a look and let me know if you have questions.
> > 
> > Thanks for sharing! I will try to find time to look into this and provide my feedback if any.
> > 
> > 
> > 
> > 
> @Anastasia, I've looked at the occurrences of OpenCLKernelAttr attribute and it looks like the only part useful for SYCL is lib/CodeGen/CodeGenFunction.cpp, which emits OpenCL specific metadata required for SPIR-V translation.
> 
> Sema part is mostly not relevant for SYCL mode because SYCL API do not allow the cases currently detected by clang (e.g. constant address space variable declaration in OpenCL kernel scope, naming OpenCL kernel `main`, etc). 
> A couple of check that might be useful are:
> - `void` return type for kernel functions
> - kernel can't be static function
> and some of the checks are harmful for proposed implementation (e.g. kernels can't be template functions). 
> 
> @Anastasia, @keryell, @agozillon and @aaron.ballman need to agree if this sufficient to justify the re-use of OpenCL kernel attribute.
> Let me know if you need any additional information to make a decision.
> Sema part is mostly not relevant for SYCL mode because SYCL API do not allow the cases currently detected by clang (e.g. constant address space variable declaration in OpenCL kernel scope, naming OpenCL kernel main, etc). 
Would you mind pointing me to your impl of those?

> A couple of check that might be useful are:
> 
>     void return type for kernel functions
>     kernel can't be static function
> 
> and some of the checks are harmful for proposed implementation (e.g. kernels can't be template functions).
> 
> @Anastasia, @keryell, @agozillon and @aaron.ballman need to agree if this sufficient to justify the re-use of OpenCL kernel attribute.
> Let me know if you need any additional information to make a decision.

Ok, if from ~20 occurrences in the source code you will be able to reuse only just 2 it doesn't seem like it's worth to share `__kernel` attribute.

================
Comment at: clang/lib/Sema/SemaSYCL.cpp:23
+
+  bool VisitCallExpr(CallExpr *e) {
+    if (FunctionDecl *Callee = e->getDirectCallee()) {
----------------
bader wrote:
> Anastasia wrote:
> > This is probably not something we can change at this point but I wish we could avoid complexities like this. :(
> > 
> > I think this is also preventing traditional linking of translation units. That is somewhat unfortunate.
> > 
> > It is good direction however to keep this logic in a separate dedicated compilation unit.
> > 
> > I would suggest to document it a bit more including any current limitations/assumption that you can mark under FIXME i.e. does your code handle lambdas yet, what if lambdas are used in function parameters, etc...
> > I think this is also preventing traditional linking of translation units.
> 
> Could you elaborate more on this topic, please?
> What do you mean by "traditional linking of translation units" and what exactly "is preventing" it?
> Do you compare with the linking of regular C++ code (i.e. which do not split into host and device code)?
> If so, SYCL is different from this model and more similar to CUDA/OpenMP models, which also skip "linking" of irrelevant part (e.g. host code is not linked by the device compiler).
> Mariya added Justin (@jlebar) and Alexey (@ABataev), who work on single-source programming models to make them aware and provide feedback if any.
Yes indeed, I mean linking of modules in C/C++ even though it doesn't necessarily mean linking of object files. So you don't plan to support `SYCL_EXTERNAL` in clang?

In CUDA the functions executed on device are annotated manually using `__device__` hence separate translation units can specify external device function... although I don't know if CUDA implementation in clang support this.

I guess OpenMP is allowed to fall back to run on host?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60455/new/

https://reviews.llvm.org/D60455