[flang-commits] [flang] [flang][OpenMP] Move reductions `loop` to `teams` when `loop` is mapped to `distribute` (PR #132920)
Kareem Ergawy via flang-commits
flang-commits at lists.llvm.org
Tue Mar 25 04:30:49 PDT 2025
https://github.com/ergawy created https://github.com/llvm/llvm-project/pull/132920
Follow-up to #132003; in particular, see
https://github.com/llvm/llvm-project/pull/132003#issuecomment-2739701936.
This PR extends reduction support for `loop` directives. Consider the following scenario:
```fortran
subroutine bar
  implicit none
  integer :: x, i
  !$omp teams loop reduction(+: x)
  DO i = 1, 5
    call foo()
  END DO
end subroutine
```
Note the following:
* Following the spec, earlier stages of the compiler attach the `reduction` clause to `loop`.
* `loop` cannot be mapped to `distribute parallel do` because the loop's body calls a foreign function (`foo`).
* Therefore, `loop` must be mapped to `distribute`.
* However, `distribute` does not accept `reduction` clauses.
* As a result, the `reduction`s have to be moved from `loop` to its parent `teams` directive, which is what this PR does.
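
The redirect-then-strip step described above can be sketched outside of MLIR. The following is a toy C++ model, not the flang implementation: `ToyOp`, `hoistReductionsToTeams`, and the string-based value names are all hypothetical stand-ins. It assumes, as the index-based loop in the patch does, that the parent `teams` already carries one reduction block argument per `loop` reduction, in the same order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an OpenMP op with reduction clauses.
// This is NOT the MLIR API; values are modeled as plain strings.
struct ToyOp {
  std::vector<std::string> reductionSyms; // e.g. "@add_reduction_i32"
  std::vector<std::string> reductionArgs; // one entry block argument per reduction
};

// Redirects every use of a `loop` reduction argument to the `teams` argument
// at the same index, then strips all reduction state from `loop`.
// Assumption: `teams` holds a matching argument for each `loop` reduction.
void hoistReductionsToTeams(const ToyOp &teams, ToyOp &loop,
                            std::vector<std::string> &loopBodyUses) {
  assert(teams.reductionArgs.size() >= loop.reductionArgs.size());

  // Step 1: replace all uses of the loop's reduction arguments.
  for (std::size_t i = 0; i < loop.reductionArgs.size(); ++i)
    for (std::string &use : loopBodyUses)
      if (use == loop.reductionArgs[i])
        use = teams.reductionArgs[i];

  // Step 2: drop the now-unused arguments and the reduction clause itself.
  loop.reductionArgs.clear();
  loop.reductionSyms.clear();
}
```

Given a `teams` op holding `%teams_x`, a `loop` op holding `%loop_x`, and body uses that mention `%loop_x`, a call to `hoistReductionsToTeams` leaves every such use pointing at `%teams_x` and the `loop` op without any reduction state, which is the shape a reduction-less `distribute` can take over.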
From 008ca28370d1d4b1e055ca03b00ac54491dce3a4 Mon Sep 17 00:00:00 2001
From: ergawy <kareem.ergawy at amd.com>
Date: Tue, 25 Mar 2025 05:56:16 -0500
Subject: [PATCH] [flang][OpenMP] Move reductions `loop` to `teams` when `loop`
is mapped to `distribute`
---
.../OpenMP/GenericLoopConversion.cpp | 30 ++++++++++++++-
flang/test/Lower/OpenMP/loop-directive.f90 | 37 +++++++++++++++++++
2 files changed, 66 insertions(+), 1 deletion(-)
diff --git a/flang/lib/Optimizer/OpenMP/GenericLoopConversion.cpp b/flang/lib/Optimizer/OpenMP/GenericLoopConversion.cpp
index 74ad6330b11a7..8858de075d193 100644
--- a/flang/lib/Optimizer/OpenMP/GenericLoopConversion.cpp
+++ b/flang/lib/Optimizer/OpenMP/GenericLoopConversion.cpp
@@ -59,8 +59,36 @@ class GenericLoopConversionPattern
case GenericLoopCombinedInfo::TeamsLoop:
if (teamsLoopCanBeParallelFor(loopOp))
rewriteToDistributeParallelDo(loopOp, rewriter);
- else
+ else {
+ auto teamsOp = llvm::cast<mlir::omp::TeamsOp>(loopOp->getParentOp());
+ auto teamsBlockArgIface =
+ llvm::cast<mlir::omp::BlockArgOpenMPOpInterface>(*teamsOp);
+ auto loopBlockArgIface =
+ llvm::cast<mlir::omp::BlockArgOpenMPOpInterface>(*loopOp);
+
+ for (unsigned i = 0; i < loopBlockArgIface.numReductionBlockArgs();
+ ++i) {
+ mlir::BlockArgument loopRedBlockArg =
+ loopBlockArgIface.getReductionBlockArgs()[i];
+ mlir::BlockArgument teamsRedBlockArg =
+ teamsBlockArgIface.getReductionBlockArgs()[i];
+ rewriter.replaceAllUsesWith(loopRedBlockArg, teamsRedBlockArg);
+ }
+
+ for (unsigned i = 0; i < loopBlockArgIface.numReductionBlockArgs();
+ ++i) {
+ loopOp.getRegion().eraseArgument(
+ loopBlockArgIface.getReductionBlockArgsStart());
+ }
+
+ loopOp.removeReductionModAttr();
+ loopOp.getReductionVarsMutable().clear();
+ loopOp.removeReductionByrefAttr();
+ loopOp.removeReductionSymsAttr();
+
rewriteToDistribute(loopOp, rewriter);
+ }
+
break;
}
diff --git a/flang/test/Lower/OpenMP/loop-directive.f90 b/flang/test/Lower/OpenMP/loop-directive.f90
index 954985e2d64f1..44337ddfbcb12 100644
--- a/flang/test/Lower/OpenMP/loop-directive.f90
+++ b/flang/test/Lower/OpenMP/loop-directive.f90
@@ -358,3 +358,40 @@ subroutine multi_block_teams
end select
!$omp end target teams
end subroutine
+
+
+! Verifies that reductions are hoisted to the parent `teams` directive and removed
+! from the `loop` directive when `loop` is mapped to `distribute`.
+
+! CHECK-LABEL: func.func @_QPteams_loop_cannot_be_parallel_for_with_reductions
+subroutine teams_loop_cannot_be_parallel_for_with_reductions
+ implicit none
+ integer :: x, y, i, p
+
+ ! CHECK: %[[ADD_RED:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QF{{.*}}Ex"}
+ ! CHECK: %[[MUL_RED:.*]]:2 = hlfir.declare %{{.*}} {uniq_name = "_QF{{.*}}Ey"}
+ ! CHECK: omp.teams reduction(
+ ! CHECK-SAME: @add_reduction_i32 %[[ADD_RED]]#0 -> %[[ADD_RED_ARG:[^[:space:]]*]],
+ ! CHECK-SAME: @multiply_reduction_i32 %[[MUL_RED]]#0 -> %[[MUL_RED_ARG:.*]] : {{.*}}) {
+
+ ! CHECK: omp.distribute private(@{{.*}} %{{.*}} -> %{{.*}}, @{{.*}} %{{.*}} -> %{{.*}} : {{.*}}) {
+ ! CHECK: %[[ADD_RED_DECL:.*]]:2 = hlfir.declare %[[ADD_RED_ARG]] {uniq_name = "_QF{{.*}}Ex"}
+ ! CHECK: %[[MUL_RED_DECL:.*]]:2 = hlfir.declare %[[MUL_RED_ARG]] {uniq_name = "_QF{{.*}}Ey"}
+
+ ! CHECK: %[[ADD_RES:.*]] = arith.addi %{{.*}}, %{{.*}} : i32
+ ! CHECK: hlfir.assign %[[ADD_RES]] to %[[ADD_RED_DECL]]#0 : i32, !fir.ref<i32>
+
+ ! CHECK: %[[MUL_RES:.*]] = arith.muli %{{.*}}, %{{.*}} : i32
+ ! CHECK: hlfir.assign %[[MUL_RES]] to %[[MUL_RED_DECL]]#0 : i32, !fir.ref<i32>
+ ! CHECK: omp.yield
+ ! CHECK: }
+ ! CHECK: omp.terminator
+ ! CHECK: }
+ !$omp teams loop reduction(+: x) reduction(*: y) private(p)
+ DO i = 1, 5
+ call foo()
+ x = x + i
+ y = y * i
+ p = 42
+ END DO
+end subroutine