[PATCH] D68578: [HIP] Fix device stub name

Thu Nov 14 15:22:46 PST 2019

tra added inline comments.

================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1099-1108
+    auto *FD = const_cast<NamedDecl *>((ND));
+    if (auto *TD = cast<FunctionDecl>(FD)->getPrimaryTemplate())
+      FD = TD->getTemplatedDecl();
+    auto OldDeclName = FD->getDeclName();
+    auto NewNameStr = std::string("__device_stub__") + OldDeclName.getAsString();
+    auto *NewId = &Context.Idents.get(NewNameStr);
+    auto NewDeclName = DeclarationName(NewId);
----------------
yaxunl wrote:
> tra wrote:
> > yaxunl wrote:
> > > tra wrote:
> > > > On one hand I like this patch variant much better than the one that changed the mangling itself.
> > > > On the other hand this code appears to reply on implementation details. I.e. we're setting new name on `FD` which may or may not be the same as `ND`, but we're always passing `ND` to `getMangledNameImpl()`. 
> > > > 
> > > > Perhaps we could implement name-tweaking as another `MultiVersionKind` which we already plumb into getMangledNameImpl() and which allows changing the name for target attributes & features.
> > > > 
> > > > 
> > > The mangled name of an instantiated template function does not depends on its own name, but on the template. If we do not want to depend on this implementation detail, it seems I have to clone the template and instantiate from the clone.
> > > 
> > > MultiVersion does not help us here since it only appends .postfix to mangled name. The obstacle we are facing is how to change the unmangled name.
> > > The mangled name of an instantiated template function does not depends on its own name, but on the template. If we do not want to depend on this implementation detail, it seems I have to clone the template and instantiate from the clone.
> > 
> > That would be putting more effort into working around the fact that `getMangledNameImpl()` doesa not provide a good API to change the name the way you need to. *That's* what needs to be addressed, IMO.
> > 
> > >MultiVersion does not help us here since it only appends .postfix to mangled name. The obstacle we are facing is how to change the unmangled name.
> > 
> > *Some* existing implementations append to the mangled name, but we can do other manipulations there, too. The string with the mangled name originates in `getMangledNameImpl` and we could do more than just append to it. We do not have to use the `MultiVersion` for that, either. E.g. we prepend `__regcall3__` to the names of functions with `CC_X86RegCall` calling convention. We could do something similar for the kernel stub, I wonder if we could just generate a unique name and be done with that?
> > 
> > Hmm. Unique name probably would not do if, let's say, a kernel is defined in one TU, but we need to call it from another TU. So, whichever way we change the name of the stub, it will need to be the same everywhere. 
> > You may want to add a test verifying that launching of declaration-only kernels uses the right name.
> > 
> > Consistency of name mangling means that we do need to include regular C++-mangled information. Which means we need to do the name tweaking deeper down. How about using calling conventions? It's been suggested in the past that a lot of shenanigans around kernel launches could/should be done as a different calling convention. One of the things affected by the calling convention is mangling and we can add prefix there: https://github.com/llvm/llvm-project/blob/master/clang/lib/AST/Mangle.cpp#L164
> > 
> > We could tag host-side kernel with 'kernel call' calling convention on the host side and then plumb prefixing to be done similar to `__regcall3__`.
> > 
> > If that works that may be a useful improvement overall. For instance, we may no longer need to stash a `it's a kernel` flag among attributes and it would probably be useful for other things (e.g enforcing address space requirements for kernel pointer arguments).
> > 
> will add a test for decl only kernel. At least for the current implementation I see it works. A decl of stub function with expected name is emitted and can be called.
> 
> About calling conv. I've tried implementing `__global__` as a calling conv before. The issue is that it is part of type system and clang enforces type checking for that. e.g. you cannot assign it to an ordinary function pointer unless that function pointer is also declared with the same calling convention. This will cause lots of type mismatching issues. In CUDA language, `__global__` is not part of type system since it is just an attribute.
> 
> We could introduce a calling conv for stub, but probably we can only use use it when we mangle the stub function.
> 
> 
OK. I'm fresh out of ideas.

We should add some sort of assert to make sure that the mangled name does have the prefix we intended to add. Also a TODO to figure out a better way to add a name prefix before mangling. 

If anyone else has other suggestions, please chime in.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68578/new/

https://reviews.llvm.org/D68578