[Openmp-commits] [openmp] d54712a - [libomptarget][amdgpu] Mark alloc, free weak to facilitate local experimentation
Jon Chesterfield via Openmp-commits
openmp-commits at lists.llvm.org
Fri May 21 08:09:35 PDT 2021
Author: Jon Chesterfield
Date: 2021-05-21T16:09:22+01:00
New Revision: d54712ab4deb5b83ef58db73ce18ed466201f4e1
URL: https://github.com/llvm/llvm-project/commit/d54712ab4deb5b83ef58db73ce18ed466201f4e1
DIFF: https://github.com/llvm/llvm-project/commit/d54712ab4deb5b83ef58db73ce18ed466201f4e1.diff
LOG: [libomptarget][amdgpu] Mark alloc, free weak to facilitate local experimentation
There are a lot of different ways we might implement the devicertl local alloc
and free functions: via the host, local buffers (stack or arena), specialising
per kernel, etc. It is not yet clear what the right design is. This change makes the
alloc and free functions weak, so one can override them from local tests while
comparing options.
This is not strictly necessary, as a comparable patch can be applied locally each
time, but it would be convenient for out-of-tree development. The plan is to drop
the weak attribute at the same time as introducing a working allocator to trunk.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D102499
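As a rough illustration of the kind of local override the weak attribute permits, the sketch below gives strong definitions of the two functions backed by a simple bump arena. It is not part of this commit. The arena size, the 16-byte alignment and the names Arena, ArenaSize and ArenaOffset are assumptions made for the example, and plain extern "C" stands in for the devicertl EXTERN macro so that the code builds as ordinary host C++. It is also not thread safe, which a real device allocator would need to be.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size arena; size and names are illustrative only.
namespace {
constexpr size_t ArenaSize = 4096;
alignas(16) unsigned char Arena[ArenaSize];
size_t ArenaOffset = 0;
} // namespace

// Strong definition. When linked against a library that defines the same
// symbol with __attribute__((weak)), the linker picks this one instead.
extern "C" void *__kmpc_impl_malloc(size_t Size) {
  // Round the request up to a multiple of 16 bytes.
  size_t Aligned = (Size + 15) & ~static_cast<size_t>(15);
  // Reject requests that overflowed during rounding or do not fit.
  if (Aligned < Size || Aligned > ArenaSize - ArenaOffset)
    return nullptr;
  void *Ptr = Arena + ArenaOffset;
  ArenaOffset += Aligned;
  return Ptr;
}

// A bump allocator does not reclaim individual allocations, so free is a
// no-op here, matching the signature of the stub it replaces.
extern "C" void __kmpc_impl_free(void *) {}
```

Because the trunk stubs are weak, a test that links in definitions like these needs no change to target_impl.hip. Once the weak attribute is dropped, as planned above, the same override would produce a duplicate symbol error at link time.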
Added:
Modified:
openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
Removed:
################################################################################
diff --git a/openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip b/openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
index a9663a80b83ce..8398b4d6f0535 100644
--- a/openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
+++ b/openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
@@ -52,8 +52,7 @@ EXTERN __kmpc_impl_lanemask_t __kmpc_impl_activemask() {
return __builtin_amdgcn_read_exec();
}
-static void pteam_mem_barrier(uint32_t num_threads, uint32_t * barrier_state)
-{
+static void pteam_mem_barrier(uint32_t num_threads, uint32_t *barrier_state) {
__atomic_thread_fence(__ATOMIC_ACQUIRE);
uint32_t num_waves = (num_threads + WARPSIZE - 1) / WARPSIZE;
@@ -178,8 +177,12 @@ unsigned long long __kmpc_atomic_add(unsigned long long *Address,
}
// Stub implementations
-EXTERN void *__kmpc_impl_malloc(size_t) { return nullptr; }
-EXTERN void __kmpc_impl_free(void *) {}
+// Weak to allow overriding by local versions while comparing different
+// potential implementations
+__attribute__((weak)) EXTERN void *__kmpc_impl_malloc(size_t) {
+ return nullptr;
+}
+__attribute__((weak)) EXTERN void __kmpc_impl_free(void *) {}
EXTERN void __kmpc_impl_unpack(uint64_t val, uint32_t &lo, uint32_t &hi) {
lo = (uint32_t)(val & UINT64_C(0x00000000FFFFFFFF));