[PATCH] D113421: [clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files
Jon Chesterfield via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Nov 8 12:09:35 PST 2021
JonChesterfield added a comment.
An example of a function, as opposed to an intrinsic, is __kmpc_get_hardware_num_threads_in_block, just above the code you've modified. It corresponds to a function in the device runtime, e.g.
  int __kmpc_get_hardware_num_threads_in_block() {
    return get_workgroup_dim(__builtin_amdgcn_workgroup_id_x(),
                             __builtin_amdgcn_grid_size_x(),
                             __builtin_amdgcn_workgroup_size_x());
  }
and
  int __kmpc_get_hardware_num_threads_in_block() {
    return __nvvm_read_ptx_sreg_ntid_x();
  }
================
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:3968
+
+ if (Triple.isNVPTX()) {
+ llvm::Function *F = llvm::Intrinsic::getDeclaration(
----------------
This does work. The benefit of instead adding functions to the device runtime (which contain these intrinsic calls) is that the generated IR is uniform across targets, modulo the unfortunate addrspace casts, so we can do nice things like pattern match on the name of the device runtime function.
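To illustrate the name-matching point, here is a rough, self-contained sketch. It does not use the LLVM API: `Call`, `countNumThreadsQueries`, `viaRuntime` and `viaNvptxIntrinsic` are hypothetical stand-ins invented for this example, and the call lists are made up. Only the runtime function name and the NVPTX intrinsic name come from real code.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a call instruction: only the callee name matters here.
// (Hypothetical type, not part of LLVM.)
struct Call {
  std::string Callee;
};

// Device runtime entry point named in the comment above.
static const char *const NumThreadsFn =
    "__kmpc_get_hardware_num_threads_in_block";

// Count block-size queries by matching on the runtime function name alone.
// No per-target knowledge is needed for this to work.
int countNumThreadsQueries(const std::vector<Call> &Calls) {
  int N = 0;
  for (const Call &C : Calls)
    if (C.Callee == NumThreadsFn)
      ++N;
  return N;
}

// Made-up call list, as if codegen went through the device runtime.
std::vector<Call> viaRuntime() {
  return {Call{NumThreadsFn}, Call{"some_other_function"}};
}

// Made-up call list, as if codegen emitted the NVPTX intrinsic directly.
std::vector<Call> viaNvptxIntrinsic() {
  return {Call{"llvm.nvvm.read.ptx.sreg.ntid.x"}, Call{"some_other_function"}};
}
```

In this toy model the single name check finds the query in `viaRuntime()` but misses it in `viaNvptxIntrinsic()`; matching the intrinsic form would need a separate pattern per target.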
================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:245
case llvm::Triple::nvptx:
case llvm::Triple::nvptx64:
assert(getLangOpts().OpenMPIsDevice &&
----------------
Looks like we could fold these cases by rewording the assert message.
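A minimal sketch of what folding could look like, under loud assumptions: `Arch`, `RuntimeKind` and `selectOpenMPRuntime` are hypothetical stand-ins rather than the real CodeGenModule code, the `amdgcn` case is assumed to be the other case being folded, and the assert wording is invented.

```cpp
#include <cassert>

// Hypothetical stand-ins for llvm::Triple::ArchType and the runtime choice.
enum class Arch { nvptx, nvptx64, amdgcn, x86_64 };
enum class RuntimeKind { Host, GPU };

RuntimeKind selectOpenMPRuntime(Arch A, bool OpenMPIsDevice) {
  switch (A) {
  case Arch::nvptx:
  case Arch::nvptx64:
  case Arch::amdgcn: // assumed: the case being folded with the NVPTX ones
    // One target-neutral message lets all GPU cases share a single body.
    assert(OpenMPIsDevice &&
           "OpenMP GPU codegen is only prepared to deal with device code.");
    return RuntimeKind::GPU;
  default:
    return RuntimeKind::Host;
  }
}
```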
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D113421/new/
https://reviews.llvm.org/D113421