[Mlir-commits] [mlir] [mlir][OpenMP] don't add compiler-generated barrier in single threaded code (PR #174105)
Tom Eccles
llvmlistbot at llvm.org
Mon Jan 5 06:06:36 PST 2026
https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/174105
>From 1247eebb52857f69aab90f718125fbda0a89b3db Mon Sep 17 00:00:00 2001
From: Tom Eccles <tom.eccles at arm.com>
Date: Wed, 31 Dec 2025 16:02:41 +0000
Subject: [PATCH 1/2] [mlir][OpenMP] don't add compiler-generated barrier in
single threaded code
We add barriers to the firstprivate copy region when they are required
to avoid a race condition with the lastprivate clause.
The problem is that these barriers are added by the compiler not implied
by user code so it is the compiler's problem to avoid deadlock.
I came across a testcase whilst working on taskloop support that looks
a bit like this
```
!$omp parallel
!$omp single
!$omp taskloop firstprivate(a) lastprivate(a)
...
!$omp end single
!$omp end parallel
```
This is so that there are multiple threads for the generated tasks to be
distributed over, but we don't generate the tasks afresh in every
thread.
The problem comes when the taskloop requires a barrier to prevent the
datarace between firstprivate and lastprivate. This barrier will then be
generated inside of SINGLE and so only one thread will encounter the
barrier: leading to a deadlock.
This patch works around the problem by detecting this situation
statically and not generating the barrier. There are cases where we
cannot detect this statically (e.g. if the TASKLOOP is inside a function
call inside of SINGLE). The program will still deadlock in this case
after my patch. I'm unsure what the solution would be and want to fix
the common case in LLVM 22 before engaging in a longer discussion as to
whether there is a better way to handle the more general case.
Testing using wsloop because I want to land (or not) independently of
taskloop. Note that for wsloop it would be up to the programmer to
remember to use the nowait clause, but nowait cannot be used to control
generation of this barrier because it refers to the barrier after the
construct not after firstprivate copyin (before the construct
execution).
---
.../OpenMP/OpenMPToLLVMIRTranslation.cpp | 21 ++++++++-
.../LLVMIR/openmp-private-barrier-single.mlir | 43 +++++++++++++++++++
2 files changed, 63 insertions(+), 1 deletion(-)
create mode 100644 mlir/test/Target/LLVMIR/openmp-private-barrier-single.mlir
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 03d67a52853f6..8025d9ea64483 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -1678,6 +1678,25 @@ allocatePrivateVars(llvm::IRBuilderBase &builder,
return afterAllocas;
}
+// This can't always be determined statically, but when we can, it is good to
+// avoid generating compiler-added barriers which will deadlock the program.
+static bool opIsInSingleThread(mlir::Operation *op) {
+ while (mlir::Operation *parent = op->getParentOp()) {
+ if (mlir::isa<omp::SingleOp, omp::CriticalOp>(parent))
+ return true;
+
+ // e.g.
+ // omp.single {
+ // omp.parallel {
+ // op
+ // }
+ // }
+ if (mlir::isa<omp::ParallelOp>(parent))
+ return false;
+ }
+ return false;
+}
+
static LogicalResult copyFirstPrivateVars(
mlir::Operation *op, llvm::IRBuilderBase &builder,
LLVM::ModuleTranslation &moduleTranslation,
@@ -1731,7 +1750,7 @@ static LogicalResult copyFirstPrivateVars(
moduleTranslation.forgetMapping(copyRegion);
}
- if (insertBarrier) {
+ if (insertBarrier && !opIsInSingleThread(op)) {
llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder();
llvm::OpenMPIRBuilder::InsertPointOrErrorTy res =
ompBuilder->createBarrier(builder.saveIP(), llvm::omp::OMPD_barrier);
diff --git a/mlir/test/Target/LLVMIR/openmp-private-barrier-single.mlir b/mlir/test/Target/LLVMIR/openmp-private-barrier-single.mlir
new file mode 100644
index 0000000000000..1b7b1eeac6227
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-private-barrier-single.mlir
@@ -0,0 +1,43 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// tests that no privatization barrier is added for operations inside of SINGLE
+
+omp.private {type = private} @_QFwsloop_privateEi_private_ref_i32 : i32
+
+llvm.func @foo_free(!llvm.ptr)
+
+omp.private {type = firstprivate} @_QFwsloop_privateEc_firstprivate_ref_c8 : !llvm.array<1 x i8> copy {
+^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
+ %0 = llvm.load %arg0 : !llvm.ptr -> !llvm.array<1 x i8>
+ llvm.store %0, %arg1 : !llvm.array<1 x i8>, !llvm.ptr
+ omp.yield(%arg1 : !llvm.ptr)
+} dealloc {
+^bb0(%arg0: !llvm.ptr):
+ llvm.call @foo_free(%arg0) : (!llvm.ptr) -> ()
+ omp.yield
+}
+
+llvm.func @wsloop_private_(%arg0: !llvm.ptr {fir.bindc_name = "y"}) attributes {fir.internal_name = "_QPwsloop_private", frame_pointer = #llvm.framePointerKind<all>, target_cpu = "x86-64"} {
+ %0 = llvm.mlir.constant(1 : i64) : i64
+ %1 = llvm.alloca %0 x f32 {bindc_name = "x"} : (i64) -> !llvm.ptr
+ %3 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+ %5 = llvm.alloca %0 x !llvm.array<1 x i8> {bindc_name = "c"} : (i64) -> !llvm.ptr
+ %6 = llvm.mlir.constant(1 : i32) : i32
+ %7 = llvm.mlir.constant(10 : i32) : i32
+ %8 = llvm.mlir.constant(0 : i32) : i32
+ omp.parallel {
+ omp.single {
+ omp.wsloop private(@_QFwsloop_privateEc_firstprivate_ref_c8 %5 -> %arg1, @_QFwsloop_privateEi_private_ref_i32 %3 -> %arg2 : !llvm.ptr, !llvm.ptr) private_barrier {
+ // CHECK: omp.private.copy:
+ // CHECK-NOT: __kmpc_barrier
+ // CHECK: br label
+ omp.loop_nest (%arg4) : i32 = (%8) to (%7) inclusive step (%6) {
+ omp.yield
+ }
+ }
+ omp.terminator
+ }
+ omp.terminator
+ }
+ llvm.return
+}
>From 20840038dac8af7c88e92d82b14d390290b9c4bf Mon Sep 17 00:00:00 2001
From: Tom Eccles <tom.eccles at arm.com>
Date: Mon, 5 Jan 2026 14:06:08 +0000
Subject: [PATCH 2/2] Add comment
---
mlir/test/Target/LLVMIR/openmp-private-barrier-single.mlir | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/mlir/test/Target/LLVMIR/openmp-private-barrier-single.mlir b/mlir/test/Target/LLVMIR/openmp-private-barrier-single.mlir
index 1b7b1eeac6227..9dd66a34a17f8 100644
--- a/mlir/test/Target/LLVMIR/openmp-private-barrier-single.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-private-barrier-single.mlir
@@ -1,6 +1,10 @@
// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
-// tests that no privatization barrier is added for operations inside of SINGLE
+// Tests that no privatization barrier is added for operations inside of SINGLE.
+// The specific combination here (wsloop inside of single) may not be valid
+// OpenMP. This combination was chosen so that this patch did not have to wait
+// for taskloop support. If it ever becomes a problem having a wsloop inside
+// of single, this can be updated to use taskloop.
omp.private {type = private} @_QFwsloop_privateEi_private_ref_i32 : i32
More information about the Mlir-commits
mailing list