[PATCH] D113421: [clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files

Jon Chesterfield via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 8 12:09:35 PST 2021


JonChesterfield added a comment.

An example of a function, as opposed to an intrinsic, is __kmpc_get_hardware_num_threads_in_block, just above the code you've modified. It corresponds to a function in the device runtime, e.g.

  int __kmpc_get_hardware_num_threads_in_block() {
    return get_workgroup_dim(__builtin_amdgcn_workgroup_id_x(),
                             __builtin_amdgcn_grid_size_x(),
                             __builtin_amdgcn_workgroup_size_x());
  }

and

  int __kmpc_get_hardware_num_threads_in_block() {
    return __nvvm_read_ptx_sreg_ntid_x();
  }



================
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:3968
+
+  if (Triple.isNVPTX()) {
+    llvm::Function *F = llvm::Intrinsic::getDeclaration(
----------------
This does work. The benefit of wrapping these intrinsic calls in device runtime functions is that the generated IR is uniform across targets, modulo the unfortunate addrspace casts, so we can do nice things like pattern match on the name of the device runtime function.
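To make the name-matching point concrete, here is a rough, self-contained sketch. It is not LLVM code: `Call`, `countViaRuntimeName` and `countViaIntrinsicNames` are hypothetical stand-ins that model a call instruction by its callee name only. The runtime function name is the one mentioned above. The NVPTX intrinsic name is the IR spelling of `__nvvm_read_ptx_sreg_ntid_x`. The amdgcn side is left out because it expands to more than a single call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a call instruction: only the callee name is modelled.
struct Call {
  std::string Callee;
};

// Device runtime entry point named in this review.
static const char *const RuntimeFn =
    "__kmpc_get_hardware_num_threads_in_block";

// When clang emits the runtime function, one name identifies the query on
// every target.
std::size_t countViaRuntimeName(const std::vector<Call> &Calls) {
  std::size_t N = 0;
  for (const Call &C : Calls) {
    assert(!C.Callee.empty() && "toy model expects a named callee");
    if (C.Callee == RuntimeFn)
      ++N;
  }
  return N;
}

// When clang emits intrinsics directly, the matcher needs a per-target list
// of names (and, for targets that expand to several instructions, more than
// a name comparison, which this sketch does not attempt).
std::size_t countViaIntrinsicNames(const std::vector<Call> &Calls,
                                   const std::vector<std::string> &Names) {
  std::size_t N = 0;
  for (const Call &C : Calls)
    for (const std::string &Name : Names)
      if (C.Callee == Name)
        ++N;
  return N;
}
```

In the real compiler the matching would be done on llvm::CallInst callees rather than strings in a vector; the sketch only illustrates why a single runtime function name is easier to match than target-specific intrinsics.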


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:245
   case llvm::Triple::nvptx:
   case llvm::Triple::nvptx64:
     assert(getLangOpts().OpenMPIsDevice &&
----------------
Looks like we could fold these cases by renaming the assert.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113421/new/

https://reviews.llvm.org/D113421


