[PATCH] D120079: [OpenMP] Add an option to limit shared memory usage in OpenMPOpt
Joseph Huber via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 17 12:50:50 PST 2022
jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, tianshilei1992.
Herald added subscribers: ormris, guansong, hiraditya, yaxunl.
jhuber6 requested review of this revision.
Herald added subscribers: llvm-commits, sstefan1.
Herald added a project: LLVM.
One of the optimizations performed in OpenMPOpt pushes globalized
variables to static shared memory. This is preferable to keeping the
runtime call in all cases; however, if too many variables are pushed to
shared memory, the kernel will crash. Since this is an optimization and
not something the user specified explicitly, there should be an option
to limit it in those cases. This patch introduces the
`-openmp-opt-shared-limit=` option to limit the number of bytes that
HeapToShared will place in shared memory.
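
To illustrate the intended behaviour, here is a minimal standalone sketch of the budget check, written without any LLVM dependencies. The names `SharedBudget`, `tryMove`, and `planMoves` are hypothetical and do not appear in the patch; only the comparison of the allocation size plus the bytes already moved against the limit is taken from it. The sketch rearranges that comparison to avoid unsigned wrap-around, which is an assumption of this example rather than something the patch does.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for the per-allocation decision in HeapToShared.
struct SharedBudget {
  // Mirrors the option's default: effectively unlimited.
  uint64_t Limit = std::numeric_limits<unsigned>::max();
  // Bytes already placed in shared memory.
  uint64_t BytesMoved = 0;

  // Returns true and charges the budget if the allocation still fits;
  // returns false (allocation stays globalized) otherwise.
  bool tryMove(uint64_t AllocSize) {
    assert(BytesMoved <= Limit && "budget invariant violated");
    // Equivalent to `AllocSize + BytesMoved > Limit`, but cannot overflow.
    if (AllocSize > Limit - BytesMoved)
      return false;
    BytesMoved += AllocSize;
    return true;
  }
};

// Applies the budget to a sequence of allocation sizes, in order.
// Element i of the result says whether allocation i was moved.
std::vector<bool> planMoves(const std::vector<uint64_t> &Sizes,
                            uint64_t Limit) {
  SharedBudget Budget;
  Budget.Limit = Limit;
  std::vector<bool> Moved;
  for (uint64_t Size : Sizes)
    Moved.push_back(Budget.tryMove(Size));
  return Moved;
}
```

With a limit of 4 and allocations of 16 and 4 bytes, the sketch keeps the 16-byte allocation globalized and moves the 4-byte one, which matches what the `CHECK-LIMIT` lines in the test expect when `opt` is run with `-openmp-opt-shared-limit=4`.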
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D120079
Files:
llvm/lib/Transforms/IPO/OpenMPOpt.cpp
llvm/test/Transforms/OpenMP/replace_globalization.ll
Index: llvm/test/Transforms/OpenMP/replace_globalization.ll
===================================================================
--- llvm/test/Transforms/OpenMP/replace_globalization.ll
+++ llvm/test/Transforms/OpenMP/replace_globalization.ll
@@ -1,6 +1,7 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --function-signature --check-attributes --check-globals --include-generated-funcs
; RUN: opt -S -passes='openmp-opt' < %s | FileCheck %s
; RUN: opt -passes=openmp-opt -pass-remarks=openmp-opt -disable-output < %s 2>&1 | FileCheck %s -check-prefix=CHECK-REMARKS
+; RUN: opt -passes=openmp-opt -pass-remarks=openmp-opt -pass-remarks-missed=openmp-opt -disable-output -openmp-opt-shared-limit=4 < %s 2>&1 | FileCheck %s -check-prefix=CHECK-LIMIT
target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
target triple = "nvptx64"
@@ -8,6 +9,8 @@
; CHECK-REMARKS: remark: replace_globalization.c:5:7: Replaced globalized variable with 16 bytes of shared memory
; CHECK-REMARKS: remark: replace_globalization.c:5:14: Replaced globalized variable with 4 bytes of shared memory
; CHECK-REMARKS-NOT: 6 bytes
+; CHECK-LIMIT: remark: replace_globalization.c:5:14: Replaced globalized variable with 4 bytes of shared memory
+; CHECK-LIMIT: remark: replace_globalization.c:5:7: Found thread data sharing on the GPU. Expect degraded performance due to data globalization
; UTC_ARGS: --enable
%struct.ident_t = type { i32, i32, i32, i32, i8* }
Index: llvm/lib/Transforms/IPO/OpenMPOpt.cpp
===================================================================
--- llvm/lib/Transforms/IPO/OpenMPOpt.cpp
+++ llvm/lib/Transforms/IPO/OpenMPOpt.cpp
@@ -129,6 +129,11 @@
cl::desc("Maximal number of attributor iterations."),
cl::init(256));
+static cl::opt<unsigned>
+ SharedMemoryLimit("openmp-opt-shared-limit", cl::Hidden,
+ cl::desc("Maximum amount of shared memory to use."),
+ cl::init(std::numeric_limits<unsigned>::max()));
+
STATISTIC(NumOpenMPRuntimeCallsDeduplicated,
"Number of OpenMP runtime calls deduplicated");
STATISTIC(NumOpenMPParallelRegionsDeleted,
@@ -3000,6 +3005,10 @@
auto *AllocSize = cast<ConstantInt>(CB->getArgOperand(0));
+ if (AllocSize->getZExtValue() + NumBytesMovedToSharedMemory >
+ SharedMemoryLimit)
+ continue;
+
LLVM_DEBUG(dbgs() << TAG << "Replace globalization call " << *CB
<< " with " << AllocSize->getZExtValue()
<< " bytes of shared memory\n");
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D120079.409754.patch
Type: text/x-patch
Size: 2632 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220217/b338f678/attachment.bin>