[Openmp-commits] [openmp] [OpenMP] Add Environment Variable to disable Reuse of Blocks for High Loop Trip Counts (PR #89239)

Tim Gymnich via Openmp-commits openmp-commits at lists.llvm.org
Thu Apr 18 07:26:49 PDT 2024


https://github.com/tgymnich created https://github.com/llvm/llvm-project/pull/89239

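Sometimes it can be beneficial to spawn more thread blocks instead of reusing existing blocks for multiple loop iterations. This patch adds the environment variable `LIBOMPTARGET_REUSE_BLOCKS_FOR_HIGH_TRIP_COUNT` (default: `true`) so that this reuse can be disabled.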
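
To make the effect of the switch concrete, here is a minimal standalone sketch of the block-count choice that the patch changes. The function `pickNumBlocks` and its plain-integer parameters are hypothetical and only mirror the variable names in `GenericKernelTy::getNumBlocks`; this is not the plugin's actual interface.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model of the final step of GenericKernelTy::getNumBlocks.
// TripCountNumBlocks: blocks needed to cover the loop trip count.
// DefaultNumBlocks:   the device's default grid size.
// BlockLimit:         the hard upper bound on blocks for the device.
// ReuseBlocks:        value of LIBOMPTARGET_REUSE_BLOCKS_FOR_HIGH_TRIP_COUNT.
uint32_t pickNumBlocks(uint32_t TripCountNumBlocks, uint32_t DefaultNumBlocks,
                       uint32_t BlockLimit, bool ReuseBlocks) {
  uint32_t PreferredNumBlocks = TripCountNumBlocks;
  // With reuse enabled (the default), cap at the default grid size so that
  // each block handles several loop iterations.
  if (ReuseBlocks)
    PreferredNumBlocks = std::min(TripCountNumBlocks, DefaultNumBlocks);
  // The device limit applies in both cases.
  return std::min(PreferredNumBlocks, BlockLimit);
}
```

With illustrative values `TripCountNumBlocks = 100000`, `DefaultNumBlocks = 1024`, and `BlockLimit = 65535`, the sketch yields 1024 blocks when reuse is enabled and 65535 when it is disabled. For small trip counts both settings give the same result.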

**Alternatives considered:**

Make `DefaultNumBlocks` settable via an environment variable.

From 46f10905f82a39d570d6e1879b85aa12f90d37c1 Mon Sep 17 00:00:00 2001
From: Tim Gymnich <tgymnich at icloud.com>
Date: Wed, 10 Apr 2024 18:39:02 +0000
Subject: [PATCH] GenericDevice.getReuseBlocksForHighTripCount

---
 .../plugins-nextgen/common/include/PluginInterface.h     | 9 +++++++++
 .../plugins-nextgen/common/src/PluginInterface.cpp       | 6 +++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h b/openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h
index 79e8464bfda5c1..97fa9616e02233 100644
--- a/openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h
+++ b/openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h
@@ -829,6 +829,12 @@ struct GenericDeviceTy : public DeviceAllocatorTy {
     return OMPX_MinThreadsForLowTripCount;
   }
 
+  /// Whether or not to reuse blocks for high trip count loops.
+  /// @see OMPX__ReuseBlocksForHighTripCount
+  virtual bool getReuseBlocksForHighTripCount() {
+    return OMPX__ReuseBlocksForHighTripCount;
+  }
+
   /// Get the total amount of hardware parallelism supported by the target
   /// device. This is the total amount of warps or wavefronts that can be
   /// resident on the device simultaneously.
@@ -904,6 +910,9 @@ struct GenericDeviceTy : public DeviceAllocatorTy {
   UInt32Envar OMPX_MinThreadsForLowTripCount =
       UInt32Envar("LIBOMPTARGET_MIN_THREADS_FOR_LOW_TRIP_COUNT", 32);
 
+  BoolEnvar OMPX__ReuseBlocksForHighTripCount = 
+      BoolEnvar("LIBOMPTARGET_REUSE_BLOCKS_FOR_HIGH_TRIP_COUNT", true);
+
 protected:
   /// Environment variables defined by the LLVM OpenMP implementation
   /// regarding the initial number of streams and events.
diff --git a/openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp b/openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
index b5f3c45c835fdb..41542ea1123c29 100644
--- a/openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
+++ b/openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp
@@ -705,8 +705,12 @@ uint64_t GenericKernelTy::getNumBlocks(GenericDeviceTy &GenericDevice,
       TripCountNumBlocks = LoopTripCount;
     }
   }
+
+  uint32_t PreferredNumBlocks = TripCountNumBlocks;
   // If the loops are long running we rather reuse blocks than spawn too many.
-  uint32_t PreferredNumBlocks = std::min(TripCountNumBlocks, DefaultNumBlocks);
+  if (GenericDevice.getReuseBlocksForHighTripCount()) {
+    PreferredNumBlocks = std::min(TripCountNumBlocks, DefaultNumBlocks);
+  }
   return std::min(PreferredNumBlocks, GenericDevice.getBlockLimit());
 }
 
