[Openmp-commits] [openmp] 221ada6 - [libomptarget] Implement locks for amdgcn
Jon Chesterfield via Openmp-commits
openmp-commits at lists.llvm.org
Thu Mar 5 12:25:55 PST 2020
Author: Jon Chesterfield
Date: 2020-03-05T20:25:31Z
New Revision: 221ada654b28a524d01dc70ec16d38e0f2484f78
URL: https://github.com/llvm/llvm-project/commit/221ada654b28a524d01dc70ec16d38e0f2484f78
DIFF: https://github.com/llvm/llvm-project/commit/221ada654b28a524d01dc70ec16d38e0f2484f78.diff
LOG: [libomptarget] Implement locks for amdgcn
Summary:
The nvptx implementation deadlocks on amdgcn. atomic_cas with multiple
active lanes can deadlock: if one lane succeeds, the other lanes keep
waiting for the lock and the lane that holds it cannot make progress to
release it. The set_lock implementation therefore runs on a single lane.
It also waits using a sleep intrinsic instead of the system clock, which is
probably a minor performance improvement. The unset/test implementations may
be revised later, based on code size, performance or similar concerns.
This implements the lock at a per-wavefront scope. That's not strictly as
specified, since OpenMP describes locks in terms of threads. I think the
nvptx implementation provides true per-thread locking on Volta and the same
per-warp locking on other architectures.
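For illustration only, the compare-and-swap plus sleep pattern described
above can be sketched in host-side C++. This is not the code in this commit.
It assumes std::atomic stands in for atomic_cas, std::this_thread::sleep_for
stands in for the sleep intrinsic, and each std::thread plays the role of the
single lane that takes the lock on behalf of its wavefront. All names
(sketch_lock_t, sketch_set_lock and so on) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical names throughout: host C++ stand-ins, not the device runtime.
constexpr std::uint32_t kUnset = 0;
constexpr std::uint32_t kSet = 1;

using sketch_lock_t = std::atomic<std::uint32_t>;

// Try once to take the lock. Returns true if this caller acquired it.
bool sketch_test_lock(sketch_lock_t &lock) {
  std::uint32_t expected = kUnset;
  return lock.compare_exchange_strong(expected, kSet,
                                      std::memory_order_acquire,
                                      std::memory_order_relaxed);
}

// Spin until the lock is taken, sleeping briefly between attempts.
// On the device only one lane per wavefront would run this loop.
void sketch_set_lock(sketch_lock_t &lock) {
  while (!sketch_test_lock(lock)) {
    std::this_thread::sleep_for(std::chrono::microseconds(1));
  }
}

// Release the lock. The caller is expected to hold it.
void sketch_unset_lock(sketch_lock_t &lock) {
  std::uint32_t previous = lock.exchange(kUnset, std::memory_order_release);
  assert(previous == kSet);
  (void)previous;
}

// Each worker increments a shared counter under the lock.
long run_counter_demo(int workers, int iterations) {
  sketch_lock_t lock{kUnset};
  long counter = 0;
  std::vector<std::thread> pool;
  for (int w = 0; w < workers; ++w) {
    pool.emplace_back([&] {
      for (int i = 0; i < iterations; ++i) {
        sketch_set_lock(lock);
        ++counter;
        sketch_unset_lock(lock);
      }
    });
  }
  for (auto &t : pool) {
    t.join();
  }
  return counter;
}
```

Calling run_counter_demo(4, 200) runs four workers that each take the lock
200 times. Host threads make progress independently, so this sketch cannot
reproduce the lane-level deadlock; it only shows the acquire, back off and
release sequence.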
Reviewers: jdoerfert, ABataev, grokos
Reviewed By: jdoerfert
Subscribers: jvesely, mgorny, jfb, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D75546
Added:
openmp/libomptarget/deviceRTLs/amdgcn/src/amdgcn_locks.hip
Modified:
openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt
Removed:
################################################################################
diff --git a/openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt b/openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt
index ec08bd912663..1a24bfd6f887 100644
--- a/openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt
+++ b/openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt
@@ -56,6 +56,7 @@ get_filename_component(devicertl_base_directory
set(cuda_sources
${CMAKE_CURRENT_SOURCE_DIR}/src/amdgcn_smid.hip
+ ${CMAKE_CURRENT_SOURCE_DIR}/src/amdgcn_locks.hip
${CMAKE_CURRENT_SOURCE_DIR}/src/target_impl.hip
${devicertl_base_directory}/common/src/cancel.cu
${devicertl_base_directory}/common/src/critical.cu
diff --git a/openmp/libomptarget/deviceRTLs/amdgcn/src/amdgcn_locks.hip b/openmp/libomptarget/deviceRTLs/amdgcn/src/amdgcn_locks.hip
new file mode 100644
index 000000000000..4163a14f50bf
--- /dev/null
+++ b/openmp/libomptarget/deviceRTLs/amdgcn/src/amdgcn_locks.hip
@@ -0,0 +1,28 @@
+//===-- amdgcn_locks.hip - AMDGCN OpenMP GPU lock implementation -- HIP -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// A 'thread' maps onto a lane of the wavefront. This means a per-thread lock
+// cannot be implemented - if one thread gets the lock, it can't continue on to
+// the next instruction in order to do anything as the other threads are waiting
+// to take the lock.
+// These functions will be implemented to provide the documented semantics for
+// a SIMD => wavefront mapping once that is implemented.
+//
+//===----------------------------------------------------------------------===//
+
+#include "common/debug.h"
+
+static DEVICE void warn() {
+ PRINT0(LD_ALL, "Locks are not supported in this thread mapping model");
+}
+
+DEVICE void __kmpc_impl_init_lock(omp_lock_t *) { warn(); }
+DEVICE void __kmpc_impl_destroy_lock(omp_lock_t *) { warn(); }
+DEVICE void __kmpc_impl_set_lock(omp_lock_t *) { warn(); }
+DEVICE void __kmpc_impl_unset_lock(omp_lock_t *) { warn(); }
+DEVICE int __kmpc_impl_test_lock(omp_lock_t *lock) { warn(); }