[PATCH] D133956: Cuda Check for ignored errors after calling a CUDA kernel

Thu Sep 15 16:07:39 PDT 2022

tra added a comment.

> The intention with this check is to work a lot in tandem with the one in D133804 <https://reviews.llvm.org/D133804>, which therefore prevents most such cases. 
> Thus, the check is optimized for lowering false positives during static checking and for a practice lowering the number of false negatives within the CUDA code.

SGTM. This patch + D133804 <https://reviews.llvm.org/D133804> should have everything covered.

================
Comment at: clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu:53
+  }
+  auto err = cudaGetLastError();
+
----------------
barcisz wrote:
> tra wrote:
> > Just curious -- is it sufficient to just call `cudaGetLastError();` ? Or does the checker require using its result, too? I.e. in practice this particular check will not really do anything useful. The tests below look somewhat inconsistent. 
> Technically it does not require the user to actually use the value of `cudaGetLastError()`, but
> 
> 
>   # If they are calling it then they most likely did not place this call there randomly and are using it to check for the error returned by the kernel
>   # the check being introduced in D133804 can be used to check if the return value has been used, so checking it here as well would have been a duplication
> 
> 
> 1. If they are calling it then they most likely did not place this call there randomly and are using it to check for the error returned by the kernel

If that's the case, then why kernel launches on lines 45 and 51 are reported as possibly unchecked? Both are followed by the `cudaGetLastError()` call and are, technically checked, if we're not analyzing the usage of the result of the call. 

What am I missing?

================
Comment at: clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu:75
+  // CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
+  // Calling an error-checking function after a kernel is not considered safe.
+  errorCheck();
----------------
barcisz wrote:
> tra wrote:
> > WDYM by "is not considered safe" here? How is that different from calling `cudaGetLastError()` and checking its value?
> As in the check does not do inter-procedural analysis short of adding the handler to AcceptedHandlers, so the check will flag such occurences
Hmm.. Using a helper function to check for cuda errors is a fairly common pattern. 
Is there a way to annotate such a helper function as `checks cudaGetLastError`?

================
Comment at: clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu:80-82
+  b<<<1, 2>>>();; /* The semicolons are here because the
+    detection of the macro is done with a lexer */ ;
+  CUDA_CHECK_KERNEL();
----------------
barcisz wrote:
> tra wrote:
> > What would happen with a single `;` as would be seen in the normal user code?
> Nothing, it would work just fine; it's rather that all other kernel calls in this test use a single `;` so I want to check this case here
I still do not understand how it all fits together. What does a kernel call, the extra `;`, the macro, and the checker code have to do with each other?

Is the idea that the checker should see though the empty statement between the kernel call and the checker macro?
If that's the case I'd make it a bit more prominent. E.g. something like this:

```
    b<<<1, 2>>>();
    ; /* Make sure that we see through empty expressions in-between the call and the checker. */ ;
  CUDA_CHECK_KERNEL();
```

================
Comment at: clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu:112
+  do {
+    auto err2 = cudaGetLastError();
+  } while(0);
----------------
barcisz wrote:
> tra wrote:
> > Why does this case produce no warning, while a very similar case above does? In both cases result of `cudaGetLastError()` is assigned to an unused variable within the loop body.
> > 
> > ```
> >   b<<<1, 2>>>();
> >   // CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
> >   for(;;)
> >     auto err2 = cudaGetLastError(); // Brackets omitted purposefully, since they create an additional AST node
> > 
> > ```
> Because often a macro will wrap its error handling code in a do {...} while(0) loop and that's why we check this case and simmilar ones with CFG analysis
The `do/while(0)` wrapping part I understand. I'm puzzled why the checker appears to work differently with different loop kinds. 
Why a `cudaGetLastError()` call inside `do {} while()` is detected and considered as a cuda result check, but the same call within `for() {}` is not?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133956/new/

https://reviews.llvm.org/D133956