[PATCH] D124866: [CUDA][HIP] support noinline as keyword

Mon May 9 12:34:51 PDT 2022

yaxunl marked 4 inline comments as done.
yaxunl added a comment.

In D124866#3501203 <https://reviews.llvm.org/D124866#3501203>, @aaron.ballman wrote:

>> `__forceinline__` does not have the issue as `__noinline__` has since it is not a GCC attribute. The current CUDA/HIP implementation of `__forceinline__` in header files is sufficient. I do not see the benefit of implementing `__forceinline__` as a keyword.
>
> Primarily to reduce user confusion. It's kind of weird for `__noinline__` to be a keyword and `__forceinline__` to not be a keyword when they're both defined the same way by the CUDA spec. This means you can #undef one of them but not the other, that sort of thing.

If we are to add `__forceinline__` as a keyword, I feel it better be a separate patch to be cleaner.

================
Comment at: clang/lib/Parse/ParseDecl.cpp:902
+  while (Tok.is(tok::kw___noinline__)) {
+    IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+    SourceLocation AttrNameLoc = ConsumeToken();
----------------
tra wrote:
> aaron.ballman wrote:
> > yaxunl wrote:
> > > aaron.ballman wrote:
> > > > I think we should we be issuing a pedantic "this is a clang extension" warning here, WDYT?
> > > will do
> > I'm questioning whether my advice here was good or not -- now that I see the CUDA spec already calls these function qualifiers... it's debatable whether this is a Clang extension or just the way in which Clang implements the CUDA function qualifiers. @tra -- do you have opinions?
> > 
> > I'm sort of leaning towards dropping the extension warning, but the only reason I can think of for keeping it is if Clang is the only CUDA compiler that doesn't require you to include a header before using the function qualifiers. If that's the case, there is a portability concern.
> I'm not sure if such a warning would be useful. 
> 
> > the only reason I can think of for keeping it is if Clang is the only CUDA compiler that doesn't require you to include a header before using the function qualifiers. If that's the case, there is a portability concern.
> 
> I don't think it's an issue.
> 
> We already have similar divergence between nvcc/clang. E.g. built-in variables like `threadIdx`. Clang implements them in a header, but NVCC provides them by compiler itself. 
> With both compilers the variables are available by the time we get to compile user code. Virtually all CUDA compilations are done with tons of CUDA headers pre-included by compiler. Those that do not do that are already on their own and have to provide many other 'standard' CUDA things like target attributes. I don't think we need to worry about that.
> 
I can remove the diagnostics since it seems unnecessary.

I tend to treat it as an extension since nvcc is the de facto standard implementation, which does not implement it as a keyword. Compared to that, this is like an extension.

================
Comment at: clang/test/CodeGenCUDA/noinline.cu:1
+// optimization is needed, otherwise by default all functions have noinline.
+
----------------
erichkeane wrote:
> aaron.ballman wrote:
> > I've asked @erichkeane to weigh in on whether there's a better approach here than specifying an optimization level.
> You don't need to do this, it looks like all you're trying to do is keep 'clang' out of `O0` mode.  However, what you do  NOT want is the optimizations to run.  The common way to do that is to combine `O1`/`O2`/etc like: `-O2 -disable-llvm-passes`
> 
> This will keep clang in `O2` mode, but will keep the optimizer from running anything, which might mess with the test later on.
will use -O2 -disable-llvm-passes

================
Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }
----------------
aaron.ballman wrote:
> yaxunl wrote:
> > aaron.ballman wrote:
> > > I think there should also be a test like:
> > > ```
> > > [[gnu::__noinline__]] void fun4() {}
> > > ```
> > > to verify that the double square bracket syntax also correctly handles this being a keyword now (I expect the test to pass).
> > will do
> Ah, I just noticed we also have no tests for the behavior of the keyword in the presence of the macro being defined. e.g.,
> ```
> #define __noinline__ __attribute__((__noinline__))
> __noinline__ void fun5() {}
> ```
will do

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

[PATCH] D124866: [CUDA][HIP] support noinline as keyword