[PATCH] D133956: Cuda Check for ignored errors after calling a CUDA kernel

Thu Sep 15 16:51:32 PDT 2022

barcisz added inline comments.

================
Comment at: clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu:53
+  }
+  auto err = cudaGetLastError();
+
----------------
tra wrote:
> barcisz wrote:
> > tra wrote:
> > > Just curious -- is it sufficient to just call `cudaGetLastError();` ? Or does the checker require using its result, too? I.e. in practice this particular check will not really do anything useful. The tests below look somewhat inconsistent. 
> > Technically it does not require the user to actually use the value of `cudaGetLastError()`, but
> > 
> > 
> >   # If they are calling it then they most likely did not place this call there randomly and are using it to check for the error returned by the kernel
> >   # the check being introduced in D133804 can be used to check if the return value has been used, so checking it here as well would have been a duplication
> > 
> > 
> > 1. If they are calling it then they most likely did not place this call there randomly and are using it to check for the error returned by the kernel
> 
> If that's the case, then why kernel launches on lines 45 and 51 are reported as possibly unchecked? Both are followed by the `cudaGetLastError()` call and are, technically checked, if we're not analyzing the usage of the result of the call. 
> 
> What am I missing?
The idea is that the call should happen directly after the kernel without any branching (because branching can often make things much harder to understand in case of things like for loop make the error not actually have `cudaGetLastError()` called after every kernel call

================
Comment at: clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu:75
+  // CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
+  // Calling an error-checking function after a kernel is not considered safe.
+  errorCheck();
----------------
tra wrote:
> barcisz wrote:
> > tra wrote:
> > > WDYM by "is not considered safe" here? How is that different from calling `cudaGetLastError()` and checking its value?
> > As in the check does not do inter-procedural analysis short of adding the handler to AcceptedHandlers, so the check will flag such occurences
> Hmm.. Using a helper function to check for cuda errors is a fairly common pattern. 
> Is there a way to annotate such a helper function as `checks cudaGetLastError`?
There would be an easy way to do that, but it's much more common for projects to have those helper functions project-wide (or at least sub-project wide) which means they can be just specified explicitly for the project in the options for the check (the official documentation for the check will be uploaded tomorrow)

================
Comment at: clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu:80-82
+  b<<<1, 2>>>();; /* The semicolons are here because the
+    detection of the macro is done with a lexer */ ;
+  CUDA_CHECK_KERNEL();
----------------
tra wrote:
> barcisz wrote:
> > tra wrote:
> > > What would happen with a single `;` as would be seen in the normal user code?
> > Nothing, it would work just fine; it's rather that all other kernel calls in this test use a single `;` so I want to check this case here
> I still do not understand how it all fits together. What does a kernel call, the extra `;`, the macro, and the checker code have to do with each other?
> 
> Is the idea that the checker should see though the empty statement between the kernel call and the checker macro?
> If that's the case I'd make it a bit more prominent. E.g. something like this:
> 
> ```
>     b<<<1, 2>>>();
>     ; /* Make sure that we see through empty expressions in-between the call and the checker. */ ;
>   CUDA_CHECK_KERNEL();
> ```
> 
> 
The reason we're checking for multiple `;`s here is that due to macros not being present in the AST they have to be located on the lexer stage, which makes it necessary to search for them based on tokens. The tokens used after the kernel call here (semicolons and a comment) are the only allowed token between the kernel call and the macro, since any other one would indicate another statement being present

================
Comment at: clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu:112
+  do {
+    auto err2 = cudaGetLastError();
+  } while(0);
----------------
tra wrote:
> barcisz wrote:
> > tra wrote:
> > > Why does this case produce no warning, while a very similar case above does? In both cases result of `cudaGetLastError()` is assigned to an unused variable within the loop body.
> > > 
> > > ```
> > >   b<<<1, 2>>>();
> > >   // CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
> > >   for(;;)
> > >     auto err2 = cudaGetLastError(); // Brackets omitted purposefully, since they create an additional AST node
> > > 
> > > ```
> > Because often a macro will wrap its error handling code in a do {...} while(0) loop and that's why we check this case and simmilar ones with CFG analysis
> The `do/while(0)` wrapping part I understand. I'm puzzled why the checker appears to work differently with different loop kinds. 
> Why a `cudaGetLastError()` call inside `do {} while()` is detected and considered as a cuda result check, but the same call within `for() {}` is not?
This is because `for(<something>;<something>;<something>)` works differently in the CFG analysis. The precise definition for where the cudaGetLastError() is allowed is that it should be the first statement/expression tree/function call after the kernel call and should be in a straight line from it. For example, `for(;;) {cudaGetLastError()}` would not be similifiable to a single control flow block, and `for(;false;) {cudaGetLastError()}` has an expression tree evaluated before the call to `cudaGetLastError()`. Technically , this definition currently only supports wrapping the statement with `cudaGetLastError()`, but is made more general in case

  - The user uses gotos to achieve a simmilar pattern
  - Such CFG layout can be achieved with other c++ mechanisms
  - Means to achieve such CFG layout with different mechanisms appear in future standards of c++

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133956/new/

https://reviews.llvm.org/D133956