[PATCH] D64015: [WIP][CUDA] Use shared MangleContext for CUDA and CXX CG

Philip Salzmann via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jul 1 08:37:07 PDT 2019


psalz created this revision.
psalz added reviewers: hliao, tra, aheejin.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

NOTE: This is a work in progress and mainly intended to highlight the issue - i.e., I'm not certain the provided solution is appropriate.

Given this CUDA program

  template<typename Lambda>
  __global__ void run_this(Lambda lambda) {
      lambda();
  }
  
  template<typename T>
  struct remove_reference {
      using type = T;
  };
  
  template<typename T>
  struct remove_reference<T&> {
      using type = T;
  };
  
  template<typename T>
  constexpr typename remove_reference<T>::type&& move(T&& t) {
      return static_cast<typename remove_reference<T>::type&&>(t);
  }
  
  int main() {
      auto foo = move([](){});
      run_this<<<1, 1, 1>>>([]() __device__ { printf("Hello World\n"); }); 
      return 0;
  }

the assertion at the top of `CGNVCUDARuntime::emitDeviceStub` will fail. For release builds the effect is simply a `cudaErrorInvalidDeviceFunction` error at run time. The reason for this is that the mangled names of the device stub and the actual device side function differ: The stub is called `_Z8run_thisIZ4mainE3$_1EvT_`, while the device function is `_Z8run_thisIZ4mainE3$_0EvT_`. The difference comes down to the anonymous struct ID that is maintained and assigned by the `ManglerContext`. It appears that for the latter `getAnonymousStructId` is never called for the moved no-op lambda, resulting in an ID of 0 for the kernel.

My proposed solution would be to simply share the `ManglerContext` used by the `CGNVCUDARuntime` and `CGCXXABI` code generators. For this I've added a new `ASTContext::getSharedMangleContext` function that memoizes created manglers for the given target ABI. From looking at `ManglerContext` to me at least it doesn't look like that could cause any issues, but then again, I really don't know much about Clang's internals.

Of course an alternative solution could be to make sure that `getAnonymousStructId` is always called for both lambdas (and in the correct order), but again I don't really know why that is not happening in the first place.


Repository:
  rC Clang

https://reviews.llvm.org/D64015

Files:
  include/clang/AST/ASTContext.h
  lib/AST/ASTContext.cpp
  lib/CodeGen/CGCUDANV.cpp
  lib/CodeGen/CGCXXABI.h

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D64015.207330.patch
Type: text/x-patch
Size: 4668 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20190701/9c7e6f60/attachment.bin>


More information about the cfe-commits mailing list