[PATCH] D64015: [WIP][CUDA] Use shared MangleContext for CUDA and CXX CG
Philip Salzmann via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Jul 1 08:37:07 PDT 2019
psalz created this revision.
psalz added reviewers: hliao, tra, aheejin.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
NOTE: This is a work in progress and mainly intended to highlight the issue - i.e., I'm not certain the provided solution is appropriate.
Given this CUDA program
template<typename Lambda>
__global__ void run_this(Lambda lambda) {
lambda();
}
template<typename T>
struct remove_reference {
using type = T;
};
template<typename T>
struct remove_reference<T&> {
using type = T;
};
template<typename T>
constexpr typename remove_reference<T>::type&& move(T&& t) {
return static_cast<typename remove_reference<T>::type&&>(t);
}
int main() {
auto foo = move([](){});
run_this<<<1, 1, 1>>>([]() __device__ { printf("Hello World\n"); });
return 0;
}
the assertion at the top of `CGNVCUDARuntime::emitDeviceStub` will fail. For release builds the effect is simply a `cudaErrorInvalidDeviceFunction` error at run time. The reason for this is that the mangled names of the device stub and the actual device side function differ: The stub is called `_Z8run_thisIZ4mainE3$_1EvT_`, while the device function is `_Z8run_thisIZ4mainE3$_0EvT_`. The difference comes down to the anonymous struct ID that is maintained and assigned by the `ManglerContext`. It appears that for the latter `getAnonymousStructId` is never called for the moved no-op lambda, resulting in an ID of 0 for the kernel.
My proposed solution would be to simply share the `ManglerContext` used by the `CGNVCUDARuntime` and `CGCXXABI` code generators. For this I've added a new `ASTContext::getSharedMangleContext` function that memoizes created manglers for the given target ABI. From looking at `ManglerContext` to me at least it doesn't look like that could cause any issues, but then again, I really don't know much about Clang's internals.
Of course an alternative solution could be to make sure that `getAnonymousStructId` is always called for both lambdas (and in the correct order), but again I don't really know why that is not happening in the first place.
Repository:
rC Clang
https://reviews.llvm.org/D64015
Files:
include/clang/AST/ASTContext.h
lib/AST/ASTContext.cpp
lib/CodeGen/CGCUDANV.cpp
lib/CodeGen/CGCXXABI.h
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D64015.207330.patch
Type: text/x-patch
Size: 4668 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20190701/9c7e6f60/attachment.bin>
More information about the cfe-commits
mailing list