[Openmp-commits] [PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

Tue Oct 26 12:54:20 PDT 2021

JonChesterfield added a comment.

I think this is good enough for now. It drops the not-yet-used debug variable and manually writes out the lowering for runtime values of memory ordering. The latter can be simplified once clang learns to emit the switch instead of an error. Omp lock is a problem I don't have a good solution for.
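For illustration, a manual lowering of a runtime memory ordering might look roughly like the sketch below. It is not taken from the patch: the helper name atomicAddRuntimeOrder is hypothetical, and it assumes the underlying builtin wants a compile-time constant ordering in the context where it is used, so each case passes a literal. Unknown values fall back to sequentially consistent, which is the strongest ordering.

```cpp
#include <cstdint>

// Hypothetical helper (not the DeviceRTL source): dispatch a runtime memory
// ordering to builtin calls that each receive a constant ordering argument.
inline uint32_t atomicAddRuntimeOrder(uint32_t *Addr, uint32_t Val, int Order) {
  switch (Order) {
  case __ATOMIC_RELAXED:
    return __atomic_fetch_add(Addr, Val, __ATOMIC_RELAXED);
  case __ATOMIC_ACQUIRE:
    return __atomic_fetch_add(Addr, Val, __ATOMIC_ACQUIRE);
  case __ATOMIC_RELEASE:
    return __atomic_fetch_add(Addr, Val, __ATOMIC_RELEASE);
  case __ATOMIC_ACQ_REL:
    return __atomic_fetch_add(Addr, Val, __ATOMIC_ACQ_REL);
  case __ATOMIC_SEQ_CST:
  default:
    // Assumption: anything unrecognised gets the strongest ordering.
    return __atomic_fetch_add(Addr, Val, __ATOMIC_SEQ_CST);
  }
}
```

If the compiler emitted this switch itself for a non-constant ordering, a wrapper of this shape would reduce to a single builtin call.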

================
Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:255
                          options::OPT_fno_openmp_target_new_runtime, false))
-    BitcodeSuffix = "new-amdgcn-" + GPUArch;
+    BitcodeSuffix = "new-amdgpu-" + GPUArch;
   else
----------------
'amdgcn' appears to be a subset of 'amdgpu', so this seems a reasonable point to rename it.

================
Comment at: openmp/libomptarget/DeviceRTL/CMakeLists.txt:180

   # Link to a bitcode library.
+  add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/linked_${bclib_name}
----------------
Rearranged the naming here: the llvm-link output file is now prefixed with linked_, and the optimised library is left without a prefix. Updated the depends / output clauses to match.

================
Comment at: openmp/libomptarget/DeviceRTL/src/Configuration.cpp:23

-extern uint32_t __omp_rtl_debug_kind;
+// extern uint32_t __omp_rtl_debug_kind;

----------------
JonChesterfield wrote:
> jdoerfert wrote:
> > JonChesterfield wrote:
> > > Otherwise the missing symbol prevents linking; it's not clear why it works on nvptx64
> > linking what? Clang emits the symbol, maybe just not for amdgpu.
> Where? The only reference I can find to it is here, and it's marked extern.
I think the cuda toolchain treats unresolved references as 'just use zero', in which case deleting this is a no-op on nvptx. Maybe it's intended to be patched by cuda rtl.cpp in the future? If so, it can be reintroduced then.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112227/new/

https://reviews.llvm.org/D112227