[PATCH] D71227: [cuda][hip] Fix function overload resolution in the global initiailizer.

Tue Jan 21 08:24:01 PST 2020

hliao marked 2 inline comments as done.
hliao added a comment.

Sorry for the late reply. Really appreciate your feedback. Thanks!

================
Comment at: clang/include/clang/Sema/Sema.h:11198-11206
+  SmallVector<const Decl *, 8> CUDANonLocalVariableStack;
+
+  void pushCUDANonLocalVariable(const Decl *D);
+  void popCUDANonLocalVariable(const Decl *D);
+
+  const Decl *getCUDACurrentNonLocalVariable() const {
+    return CUDANonLocalVariableStack.empty() ? nullptr
----------------
rsmith wrote:
> Does this really need to be CUDA-specific?
> 
> This is (at least) the third time we've needed this. We currently have a `ManglingContextDecl` on `ExpressionEvaluationContextRecord` that tracks the non-local variable whose initializer we're parsing. In addition to using this as a lambda context declaration, we also (hackily) use it as the context declaration for `DiagRuntimeBehavior`. It would seem sensible to use that mechanism here too (and rename it to remove any suggestion that this is specific to lambdas or mangling).
> 
> I think we only currently push `ExpressionEvaluationContext`s for variable initializers in C++. That's presumably fine for CUDA's purposes.
I tried that before adding the new non-local variable stack. Using `ManglingContextDecl` on `ExpressionEvaluationContextRecord` could serve some cases, but it cannot fit the case where the constructor needs resolving as well. When resolving the constructor, `ManglingContextDecl` scope is already closed and cannot be used to check the target of the global variables. Says the following code

```
struct EC {
  int ec;
__device__ EC() {}
};
__device__ EC d_ec;
```

I also tried enlarging the scope of `ManglingContextDecl` but that triggers even more issues for the generic C++ compilation. I'd appreciate any better solution as I agree that adding CUDA specific facilities should be minimized.

================
Comment at: clang/lib/Parse/ParseDecl.cpp:2336

+  Actions.pushCUDANonLocalVariable(ThisDecl);
+
----------------
rsmith wrote:
> tra wrote:
> > @rsmith -- is this sufficient to catch all attempts to call an initializer for a global?
> > I wonder if there are other sneaky ways to call an initializer. 
> No, this is not sufficient; it's missing (at least) the template instantiation case. (The `ExpressionEvaluationContextRecord` mechanism does handle that case properly.)
> 
> You should also consider what should happen in default arguments (which are sometimes parsed before we form a `FunctionDecl` for the function for which they are parameters) and default member initializers (which are parsed after we know whether the enclosing class has a user-declared default constructor, so you could in principle consider the CUDA function kind of the declared constructors, I suppose -- but the constructor bodies are not yet available, so you can't tell which constructors would actually use the initializers). Both of those cases are also tracked by the `ExpressionEvaluationContextRecord` mechanism, though you may need to track additional information to process default arguments in the same mode as the function for which they are supplied.
Could you elaborate more? I added new test cases requiring template instantiation. The current code handle them correctly. Do you refer to template variables?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71227/new/

https://reviews.llvm.org/D71227