[Openmp-commits] [openmp] [OpenMP] Basic BumpAllocator for (AMD)GPUs (PR #69806)

Shilei Tian via Openmp-commits openmp-commits at lists.llvm.org
Fri Oct 20 20:04:29 PDT 2023


================
@@ -0,0 +1,80 @@
+//===------ State.cpp - OpenMP State & ICV interface ------------- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+//===----------------------------------------------------------------------===//
+
+#include "Allocator.h"
+#include "Configuration.h"
+#include "Environment.h"
+#include "Mapping.h"
+#include "Synchronization.h"
+#include "Types.h"
+#include "Utils.h"
+
+using namespace ompx;
+
+#pragma omp begin declare target device_type(nohost)
+
+[[gnu::used, gnu::retain, gnu::weak,
+  gnu::visibility(
+      "protected")]] DeviceMemoryPoolTy __omp_rtl_device_memory_pool;
+[[gnu::used, gnu::retain, gnu::weak,
+  gnu::visibility("protected")]] DeviceMemoryPoolTrackingTy
+    __omp_rtl_device_memory_pool_tracker;
+
+/// Stateless bump allocator that uses the __omp_rtl_device_memory_pool
+/// directly.
+struct BumpAllocatorTy final {
+
+  void *alloc(uint64_t Size) {
+    Size = utils::roundUp(Size, uint64_t(allocator::ALIGNMENT));
+
+    if (config::isDebugMode(DeviceDebugKind::AllocationTracker)) {
+      atomic::add(&__omp_rtl_device_memory_pool_tracker.NumAllocations, 1,
+                  atomic::seq_cst);
+      atomic::add(&__omp_rtl_device_memory_pool_tracker.AllocationTotal, Size,
+                  atomic::seq_cst);
+      atomic::min(&__omp_rtl_device_memory_pool_tracker.AllocationMin, Size,
+                  atomic::seq_cst);
+      atomic::max(&__omp_rtl_device_memory_pool_tracker.AllocationMax, Size,
+                  atomic::seq_cst);
+    }
+
+    uint64_t *Data =
+        reinterpret_cast<uint64_t *>(&__omp_rtl_device_memory_pool.Ptr);
+    uint64_t End =
+        reinterpret_cast<uint64_t>(Data) + __omp_rtl_device_memory_pool.Size;
+
+    uint64_t OldData = atomic::add(Data, Size, atomic::seq_cst);
+    if (OldData + Size > End)
+      __builtin_trap();
----------------
shiltian wrote:

In my experience, a `nullptr` dereference is not always easy to surface on a GPU, so returning `nullptr` on exhaustion may not produce a visible failure. Given what this allocator currently does, it is better to just trap when there is not enough memory.
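
To make the trade-off concrete, here is a minimal host-side sketch of the two exhaustion policies. It is not the code from the patch: `PoolSketch`, `tryAlloc`, `allocOrTrap`, and the 16-byte `Alignment` are hypothetical names and values chosen for illustration, and `std::atomic` stands in for the device runtime's `atomic::add`. `__builtin_trap()` is the GCC/Clang builtin that the patch also uses.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the device memory pool (not DeviceMemoryPoolTy).
struct PoolSketch {
  PoolSketch(uint64_t Begin, uint64_t End) : Next(Begin), End(End) {}
  std::atomic<uint64_t> Next; // address of the next free byte
  uint64_t End;               // one past the last usable byte
};

// Assumed alignment; the patch uses allocator::ALIGNMENT instead.
constexpr uint64_t Alignment = 16;

// Round V up to the next multiple of Align (Align must be a power of two).
inline uint64_t roundUp(uint64_t V, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (V + Align - 1) & ~(Align - 1);
}

// Policy 1: report exhaustion by returning nullptr. The caller has to check
// the result, otherwise the failure only shows up if a later dereference
// actually faults.
inline void *tryAlloc(PoolSketch &Pool, uint64_t Size) {
  Size = roundUp(Size, Alignment);
  // Bump the pointer atomically; fetch_add returns the previous value.
  uint64_t Old = Pool.Next.fetch_add(Size, std::memory_order_seq_cst);
  if (Old + Size > Pool.End)
    return nullptr;
  return reinterpret_cast<void *>(Old);
}

// Policy 2: stop at the point of failure instead of handing back nullptr.
inline void *allocOrTrap(PoolSketch &Pool, uint64_t Size) {
  void *Ptr = tryAlloc(Pool, Size);
  if (!Ptr)
    __builtin_trap(); // GCC/Clang builtin: abnormal termination
  return Ptr;
}
```

In this sketch a failed request still advances `Next`, so once the pool overflows every later request fails as well. With the trapping policy that does not matter, because execution stops at the first failed allocation.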

https://github.com/llvm/llvm-project/pull/69806

