[Mlir-commits] [flang] [mlir] [mlir][acc] Add ACCRecipeMaterialization pass and reduction ops (PR #184252)
llvmlistbot at llvm.org
Mon Mar 2 14:47:59 PST 2026
llvmbot wrote:
@llvm/pr-subscribers-mlir
Author: Razvan Lupusoru (razvanlupusoru)
<details>
<summary>Changes</summary>
Pass
----
Add the `acc-recipe-materialization` pass, which materializes OpenACC private, firstprivate, and reduction recipes by inlining their init, copy, combiner, and destroy regions into the construct that references them. The pass runs on acc.parallel, acc.serial, acc.kernels, and acc.loop.
- Firstprivate: Inserts acc.firstprivate_map so the initial value is available on the device, then clones the recipe init and copy regions into the construct and replaces uses with the materialized alloca. If the recipe has a destroy region, it is cloned before the region terminator.
- Private: Clones the recipe init region into the construct (at region entry, or at the loop op for a private clause on acc.loop) and replaces uses of the recipe result with the materialized alloca. If the recipe has a destroy region, it is cloned before the region terminator.
- Reduction: Creates acc.reduction_init (init region inlined) and acc.reduction_combine_region (combiner region inlined). All uses of the reduction in the region are updated to the reduction init result.
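The firstprivate case above can be pictured with a small conceptual model in plain C++. This is only a sketch of the semantics, not the pass or any MLIR API: `FirstprivateRecipe`, `runFirstprivateRegion`, and the `mapped` parameter are hypothetical names, and the values mirror the `firstpriv` test in this patch (initial value 1336, incremented by 1 inside the region).

```cpp
#include <cassert>

// Hypothetical stand-in for an acc.firstprivate.recipe; not an MLIR API.
struct FirstprivateRecipe {
  // Models the init region: produce fresh private storage (the alloca).
  static int init() { return 0; }
  // Models the copy region: copy the original value into the private storage.
  static void copy(const int &original, int &priv) { priv = original; }
};

// Models the compute region after materialization. `mapped` plays the role of
// the acc.firstprivate_map result, i.e. the initial value visible on the device.
int runFirstprivateRegion(const int &mapped) {
  int priv = FirstprivateRecipe::init();   // inlined init region
  FirstprivateRecipe::copy(mapped, priv);  // inlined copy region
  priv = priv + 1;                         // region body now uses the private copy
  return priv;
}
```

Calling `runFirstprivateRegion(t)` with `t` equal to 1336 returns 1337 and leaves `t` unchanged, which is the property the use replacement is meant to preserve: writes inside the construct land in the private copy only.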
New operations
--------------
- acc.reduction_init: Allocates and initializes a private reduction variable from a recipe. Takes the original reduction variable and reduction_operator; has a single region that must yield one value (the private storage) via acc.yield. Used by the pass to materialize acc.reduction_recipe init regions inside the compute construct.
- acc.reduction_combine_region: Combines the private reduction value with the shared reduction variable. Takes the shared and private memrefs; has a single region (the recipe combiner) terminated by acc.yield with no operands. Used by the pass to materialize the reduction recipe combiner.
Both ops implement RegionBranchOpInterface. acc.yield is updated to allow terminating ReductionInitOp and ReductionCombineRegionOp regions.
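As a rough illustration of how the two reduction ops divide the work, here is a conceptual model in plain C++ for an `<add>` reduction on a double. It is a sketch under stated assumptions, not the dialect's implementation: `reductionInit`, `reductionCombine`, `runParallel`, and the sequential loop over `numWorkers` are all hypothetical and only stand in for the init region, the combiner region, and the parallel execution of the construct.

```cpp
#include <cassert>

// Models acc.reduction_init for <add>: yields private storage holding the
// identity value (0.0), as the init region in the tests does.
double reductionInit() { return 0.0; }

// Models acc.reduction_combine_region: folds the private value into the
// shared variable and yields nothing.
void reductionCombine(double &shared, const double &priv) {
  shared = shared + priv;
}

// Models a compute construct with a reduction clause. Each hypothetical
// worker runs the region body on its own private copy, then combines.
double runParallel(double shared, int numWorkers) {
  for (int w = 0; w < numWorkers; ++w) {
    double priv = reductionInit();   // materialized init
    priv = priv + 1.0;               // region body uses the private value
    reductionCombine(shared, priv);  // materialized combiner
  }
  return shared;
}
```

With a shared value of 2.0 and four workers, `runParallel(2.0, 4)` returns 6.0: every worker contributes its private partial result exactly once, which is why uses inside the region must refer to the reduction init result rather than the shared variable.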
Supporting changes
------------------
- OpenACCUtilsLoop: Factor cloneACCRegionInto out of the existing loop-conversion helper so the pass can clone recipe regions with optional result replacement; loop conversion now calls the shared helper.
- Flang: Add ReductionInitOpFortranObjectViewModel (FortranObjectViewOpInterface) for acc.reduction_init and register it in OpenACC extensions.
Tests
-----
- MLIR: acc-recipe-materialization-{firstprivate,private,reduction,kernel-private,parallel}.mlir (memref dialect).
- Flang: acc-recipe-materialization-{firstprivate,firstprivate-derived,private,reduction,kernel-private,parallel}.fir; firstprivate test has a second RUN with -acc-optimize-firstprivate-map.
---
Patch is 71.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/184252.diff
23 Files Affected:
- (modified) flang/include/flang/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.h (+10)
- (modified) flang/lib/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.cpp (+31)
- (modified) flang/lib/Optimizer/OpenACC/Support/RegisterOpenACCExtensions.cpp (+2)
- (added) flang/test/Transforms/OpenACC/acc-recipe-materialization-firstprivate-derived.fir (+60)
- (added) flang/test/Transforms/OpenACC/acc-recipe-materialization-firstprivate.fir (+56)
- (added) flang/test/Transforms/OpenACC/acc-recipe-materialization-kernel-private.fir (+45)
- (added) flang/test/Transforms/OpenACC/acc-recipe-materialization-parallel.fir (+50)
- (added) flang/test/Transforms/OpenACC/acc-recipe-materialization-private.fir (+47)
- (added) flang/test/Transforms/OpenACC/acc-recipe-materialization-reduction.fir (+50)
- (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCCGOps.td (+66)
- (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td (+2-1)
- (modified) mlir/include/mlir/Dialect/OpenACC/OpenACCUtilsLoop.h (+14)
- (modified) mlir/include/mlir/Dialect/OpenACC/Transforms/Passes.td (+12)
- (modified) mlir/lib/Dialect/OpenACC/IR/OpenACCCG.cpp (+62)
- (added) mlir/lib/Dialect/OpenACC/Transforms/ACCRecipeMaterialization.cpp (+459)
- (modified) mlir/lib/Dialect/OpenACC/Transforms/CMakeLists.txt (+1)
- (modified) mlir/lib/Dialect/OpenACC/Utils/OpenACCUtilsLoop.cpp (+54-36)
- (added) mlir/test/Dialect/OpenACC/acc-recipe-materialization-firstprivate.mlir (+44)
- (added) mlir/test/Dialect/OpenACC/acc-recipe-materialization-kernel-private.mlir (+34)
- (added) mlir/test/Dialect/OpenACC/acc-recipe-materialization-parallel.mlir (+46)
- (added) mlir/test/Dialect/OpenACC/acc-recipe-materialization-private.mlir (+45)
- (added) mlir/test/Dialect/OpenACC/acc-recipe-materialization-reduction.mlir (+47)
- (modified) mlir/unittests/Dialect/OpenACC/OpenACCUtilsLoopTest.cpp (+86)
``````````diff
diff --git a/flang/include/flang/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.h b/flang/include/flang/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.h
index 7a68ee6234ece..4ffa0877ff190 100644
--- a/flang/include/flang/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.h
+++ b/flang/include/flang/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.h
@@ -14,6 +14,7 @@
#define FLANG_OPTIMIZER_OPENACC_FIROPENACC_OPS_INTERFACES_H_
#include "flang/Optimizer/Dialect/FIROperationMoveOpInterface.h"
+#include "flang/Optimizer/Dialect/FortranVariableInterface.h"
#include "mlir/Dialect/OpenACC/OpenACC.h"
namespace fir {
@@ -121,6 +122,15 @@ struct OperationMoveModel : public fir::OperationMoveOpInterface::ExternalModel<
bool canMoveOutOf(mlir::Operation *op, mlir::Operation *candidate) const;
};
+struct ReductionInitOpFortranObjectViewModel
+ : public fir::FortranObjectViewOpInterface::ExternalModel<
+ ReductionInitOpFortranObjectViewModel, mlir::acc::ReductionInitOp> {
+ mlir::Value getViewSource(mlir::Operation *op,
+ mlir::OpResult resultView) const;
+ std::optional<std::int64_t> getViewOffset(mlir::Operation *op,
+ mlir::OpResult resultView) const;
+};
+
} // namespace fir::acc
#endif // FLANG_OPTIMIZER_OPENACC_FIROPENACC_OPS_INTERFACES_H_
diff --git a/flang/lib/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.cpp b/flang/lib/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.cpp
index fc654e47cf0f1..8baf5b0d29105 100644
--- a/flang/lib/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.cpp
+++ b/flang/lib/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.cpp
@@ -17,10 +17,41 @@
#include "flang/Optimizer/HLFIR/HLFIROps.h"
#include "flang/Optimizer/Support/InternalNames.h"
#include "mlir/IR/SymbolTable.h"
+#include "mlir/Interfaces/ControlFlowInterfaces.h"
#include "llvm/ADT/SmallSet.h"
namespace fir::acc {
+mlir::Value ReductionInitOpFortranObjectViewModel::getViewSource(
+ mlir::Operation *op, mlir::OpResult resultView) const {
+ assert(resultView.getOwner() == op && "result value must be the op's result");
+ assert(op->getNumResults() == 1 &&
+ "definition of acc.reduction_init changed");
+ auto iface = mlir::cast<mlir::RegionBranchOpInterface>(op);
+ llvm::SmallVector<mlir::Value, 1> resultValues;
+ iface.getPredecessorValues(mlir::RegionSuccessor::parent(), /*index=*/0,
+ resultValues);
+ assert(!resultValues.empty() &&
+ "acc.reduction_init's result must have at least one possible value");
+ mlir::Value passThroughValue;
+ for (mlir::Value v : resultValues) {
+ if (!passThroughValue) {
+ passThroughValue = v;
+ continue;
+ }
+ assert(passThroughValue == v &&
+ "acc.reduction_init must return the same allocation");
+ }
+ return passThroughValue;
+}
+
+std::optional<std::int64_t>
+ReductionInitOpFortranObjectViewModel::getViewOffset(mlir::Operation *op,
+ mlir::OpResult resultView) const {
+ assert(resultView.getOwner() == op && "result value must be the op's result");
+ return 0;
+}
+
template <>
mlir::Value PartialEntityAccessModel<fir::ArrayCoorOp>::getBaseEntity(
mlir::Operation *op) const {
diff --git a/flang/lib/Optimizer/OpenACC/Support/RegisterOpenACCExtensions.cpp b/flang/lib/Optimizer/OpenACC/Support/RegisterOpenACCExtensions.cpp
index c0be247a47729..d7fadbc84ff1a 100644
--- a/flang/lib/Optimizer/OpenACC/Support/RegisterOpenACCExtensions.cpp
+++ b/flang/lib/Optimizer/OpenACC/Support/RegisterOpenACCExtensions.cpp
@@ -98,6 +98,8 @@ void registerOpenACCExtensions(mlir::DialectRegistry &registry) {
mlir::acc::OpenACCDialect *dialect) {
mlir::acc::LoopOp::attachInterface<OperationMoveModel<mlir::acc::LoopOp>>(
*ctx);
+ mlir::acc::ReductionInitOp::attachInterface<
+ fir::acc::ReductionInitOpFortranObjectViewModel>(*ctx);
});
registerAttrsExtensions(registry);
diff --git a/flang/test/Transforms/OpenACC/acc-recipe-materialization-firstprivate-derived.fir b/flang/test/Transforms/OpenACC/acc-recipe-materialization-firstprivate-derived.fir
new file mode 100644
index 0000000000000..e8aea0b1ddc81
--- /dev/null
+++ b/flang/test/Transforms/OpenACC/acc-recipe-materialization-firstprivate-derived.fir
@@ -0,0 +1,60 @@
+// RUN: fir-opt %s -acc-recipe-materialization | FileCheck %s
+
+module {
+ acc.private.recipe @privatization_ref_i32 : !fir.ref<i32> init {
+ ^bb0(%arg0: !fir.ref<i32>):
+ %0 = fir.alloca i32
+ acc.yield %0 : !fir.ref<i32>
+ }
+ acc.firstprivate.recipe @firstprivatization_ref_rec__QFtestTpoint : !fir.ref<!fir.type<_QFtestTpoint{x:f32}>> init {
+ ^bb0(%arg0: !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>):
+ %0 = fir.alloca !fir.type<_QFtestTpoint{x:f32}>
+ %1 = fir.declare %0 {uniq_name = "acc.private.init"} : (!fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) -> !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>
+ acc.yield %1 : !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>
+ } copy {
+ ^bb0(%arg0: !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>, %arg1: !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>):
+ %0 = fir.field_index x, !fir.type<_QFtestTpoint{x:f32}>
+ %1 = fir.coordinate_of %arg0, x : (!fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) -> !fir.ref<f32>
+ %2 = fir.field_index x, !fir.type<_QFtestTpoint{x:f32}>
+ %3 = fir.coordinate_of %arg1, x : (!fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) -> !fir.ref<f32>
+ %4 = fir.load %1 : !fir.ref<f32>
+ fir.store %4 to %3 : !fir.ref<f32>
+ acc.terminator
+ }
+ func.func @_QPtest(%arg0: !fir.ref<!fir.type<_QFtestTpoint{x:f32}>> {fir.bindc_name = "p"}) {
+ %c1_i32 = arith.constant 1 : i32
+ %0 = fir.dummy_scope : !fir.dscope
+ %1 = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtestEi"}
+ %2 = fir.declare %1 {uniq_name = "_QFtestEi"} : (!fir.ref<i32>) -> !fir.ref<i32>
+ %3 = fir.alloca i32 {bindc_name = "n", uniq_name = "_QFtestEn"}
+ %4 = fir.declare %3 {uniq_name = "_QFtestEn"} : (!fir.ref<i32>) -> !fir.ref<i32>
+ %5 = fir.declare %arg0 dummy_scope %0 {uniq_name = "_QFtestEp"} : (!fir.ref<!fir.type<_QFtestTpoint{x:f32}>>, !fir.dscope) -> !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>
+ %6 = acc.firstprivate varPtr(%5 : !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) recipe(@firstprivatization_ref_rec__QFtestTpoint) -> !fir.ref<!fir.type<_QFtestTpoint{x:f32}>> {name = "p"}
+ acc.parallel firstprivate(%6 : !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) {
+ %7 = fir.load %4 : !fir.ref<i32>
+ %8 = acc.private varPtr(%2 : !fir.ref<i32>) recipe(@privatization_ref_i32) -> !fir.ref<i32> {implicit = true, name = "i"}
+ acc.loop private(%8 : !fir.ref<i32>) control(%arg1 : i32) = (%c1_i32 : i32) to (%7 : i32) step (%c1_i32 : i32) {
+ %9 = fir.declare %8 {uniq_name = "_QFtestEi"} : (!fir.ref<i32>) -> !fir.ref<i32>
+ fir.store %arg1 to %9 : !fir.ref<i32>
+ %10 = fir.load %9 : !fir.ref<i32>
+ %11 = fir.convert %10 : (i32) -> f32
+ %12 = fir.field_index x, !fir.type<_QFtestTpoint{x:f32}>
+ %13 = fir.coordinate_of %5, x : (!fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) -> !fir.ref<f32>
+ fir.store %11 to %13 : !fir.ref<f32>
+ acc.yield
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
+ acc.yield
+ }
+ return
+ }
+}
+
+// CHECK: %[[VAL_7:.*]] = acc.firstprivate_map varPtr(%{{.*}} : !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) -> !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>
+// CHECK: acc.parallel {
+// CHECK: %[[VAL_8:.*]] = fir.alloca !fir.type<_QFtestTpoint{x:f32}>
+// CHECK: %[[VAL_9:.*]] = fir.declare %[[VAL_8]] {acc.var_name = #acc.var_name<"p">, uniq_name = "acc.private.init"} : (!fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) -> !fir.ref<!fir.type<_QFtestTpoint{x:f32}>>
+// CHECK: %[[VAL_10:.*]] = fir.field_index x, !fir.type<_QFtestTpoint{x:f32}>
+// CHECK: %[[VAL_11:.*]] = fir.coordinate_of %[[VAL_7]], x : (!fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) -> !fir.ref<f32>
+// CHECK: %[[VAL_12:.*]] = fir.field_index x, !fir.type<_QFtestTpoint{x:f32}>
+// CHECK: %[[VAL_13:.*]] = fir.coordinate_of %[[VAL_9]], x : (!fir.ref<!fir.type<_QFtestTpoint{x:f32}>>) -> !fir.ref<f32>
+// CHECK: %[[VAL_14:.*]] = fir.load %[[VAL_11]] : !fir.ref<f32>
diff --git a/flang/test/Transforms/OpenACC/acc-recipe-materialization-firstprivate.fir b/flang/test/Transforms/OpenACC/acc-recipe-materialization-firstprivate.fir
new file mode 100644
index 0000000000000..a2f1dbc6e6565
--- /dev/null
+++ b/flang/test/Transforms/OpenACC/acc-recipe-materialization-firstprivate.fir
@@ -0,0 +1,56 @@
+// RUN: fir-opt %s -acc-recipe-materialization | FileCheck %s
+// RUN: fir-opt %s -acc-recipe-materialization -acc-optimize-firstprivate-map | FileCheck %s --check-prefix=CHECK-MAP
+
+module {
+ acc.private.recipe @privatization_ref_i32 : !fir.ref<i32> init {
+ ^bb0(%arg0: !fir.ref<i32>):
+ %0 = fir.alloca i32
+ acc.yield %0 : !fir.ref<i32>
+ }
+ acc.firstprivate.recipe @firstprivatization_ref_i32 : !fir.ref<i32> init {
+ ^bb0(%arg0: !fir.ref<i32>):
+ %0 = fir.alloca i32
+ acc.yield %0 : !fir.ref<i32>
+ } copy {
+ ^bb0(%arg0: !fir.ref<i32>, %arg1: !fir.ref<i32>):
+ %0 = fir.load %arg0 : !fir.ref<i32>
+ fir.store %0 to %arg1 : !fir.ref<i32>
+ acc.terminator
+ }
+ func.func @firstpriv() {
+ %c1336_i32 = arith.constant 1336 : i32
+ %0 = fir.dummy_scope : !fir.dscope
+ %1 = fir.alloca i32 {bindc_name = "t", uniq_name = "_QFfirstprivEt"}
+ %2 = fir.declare %1 {uniq_name = "_QFfirstprivEt"} : (!fir.ref<i32>) -> !fir.ref<i32>
+ fir.store %c1336_i32 to %2 : !fir.ref<i32>
+ %3 = acc.firstprivate varPtr(%2 : !fir.ref<i32>) recipe(@firstprivatization_ref_i32) -> !fir.ref<i32> {implicit = true, name = "t"}
+ acc.parallel firstprivate(%3 : !fir.ref<i32>) {
+ %c1_i32 = arith.constant 1 : i32
+ %4 = fir.load %3 : !fir.ref<i32>
+ %5 = arith.addi %4, %c1_i32 : i32
+ fir.store %5 to %3 : !fir.ref<i32>
+ acc.yield
+ }
+ return
+ }
+}
+
+// Verify that the firstprivate was materialized into a copy outside the kernel
+// and an alloca (as per the recipe) inside the region.
+// Then ensure that all uses are of the private alloca.
+// CHECK-LABEL: func.func @firstpriv
+// CHECK: acc.parallel
+// CHECK: %[[ALLOCA:.*]] = fir.alloca i32 {{.*}}acc.var_name = #acc.var_name<"t">
+// CHECK: %[[FIRSTPRIVLOAD:.*]] = fir.load %{{.*}} : !fir.ref<i32>
+// CHECK: fir.store %[[FIRSTPRIVLOAD]] to %[[ALLOCA]] : !fir.ref<i32>
+// CHECK: %[[ALLOCALOAD:.*]] = fir.load %[[ALLOCA]] : !fir.ref<i32>
+// CHECK: %[[ADDI:.*]] = arith.addi %[[ALLOCALOAD]], %c1{{.*}} : i32
+// CHECK: fir.store %[[ADDI]] to %[[ALLOCA]] : !fir.ref<i32>
+
+// CHECK-MAP-LABEL: func.func @firstpriv
+// CHECK-MAP: fir.load {{.*}} : !fir.ref<i32>
+// CHECK-MAP: acc.parallel {
+// CHECK-MAP-NOT: acc.firstprivate_map
+// CHECK-MAP: fir.alloca i32 {{.*}}acc.var_name = #acc.var_name<"t">
+// CHECK-MAP: fir.store {{.*}} to {{.*}} : !fir.ref<i32>
+// CHECK-MAP: arith.addi {{.*}} %c1
diff --git a/flang/test/Transforms/OpenACC/acc-recipe-materialization-kernel-private.fir b/flang/test/Transforms/OpenACC/acc-recipe-materialization-kernel-private.fir
new file mode 100644
index 0000000000000..646cd6891781e
--- /dev/null
+++ b/flang/test/Transforms/OpenACC/acc-recipe-materialization-kernel-private.fir
@@ -0,0 +1,45 @@
+// RUN: fir-opt %s -acc-recipe-materialization | FileCheck %s
+
+// acc.kernels with private: recipe materialized to alloca inside region
+// CHECK-NOT: acc.private
+// CHECK: fir.alloca i32 {{.*}}acc.var_name = #acc.var_name<"s">
+
+acc.private.recipe @privatization_ref_i32 : !fir.ref<i32> init {
+^bb0(%arg0: !fir.ref<i32>):
+ %0 = fir.alloca i32
+ acc.yield %0 : !fir.ref<i32>
+}
+func.func @kpriv_(%arg0: !fir.ref<i32> {fir.bindc_name = "start"}, %arg1: !fir.ref<!fir.array<32xi32>> {fir.bindc_name = "a"}) attributes {fir.internal_name = "_QPkpriv"} {
+ %c1 = arith.constant 1 : index
+ %c32 = arith.constant 32 : index
+ %0 = fir.dummy_scope : !fir.dscope
+ %1 = fir.shape %c32 : (index) -> !fir.shape<1>
+ %2 = fir.declare %arg1(%1) dummy_scope %0 arg 2 {uniq_name = "_QFkprivEa"} : (!fir.ref<!fir.array<32xi32>>, !fir.shape<1>, !fir.dscope) -> !fir.ref<!fir.array<32xi32>>
+ %3 = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFkprivEi"}
+ %4 = fir.declare %3 {uniq_name = "_QFkprivEi"} : (!fir.ref<i32>) -> !fir.ref<i32>
+ %5 = fir.alloca i32 {bindc_name = "s", uniq_name = "_QFkprivEs"}
+ %6 = fir.declare %5 {uniq_name = "_QFkprivEs"} : (!fir.ref<i32>) -> !fir.ref<i32>
+ %7 = fir.declare %arg0 dummy_scope %0 arg 1 {fortran_attrs = #fir.var_attrs<intent_in>, uniq_name = "_QFkprivEstart"} : (!fir.ref<i32>, !fir.dscope) -> !fir.ref<i32>
+ %8 = fir.load %7 : !fir.ref<i32>
+ fir.store %8 to %6 : !fir.ref<i32>
+ %9 = acc.copyin varPtr(%2 : !fir.ref<!fir.array<32xi32>>) -> !fir.ref<!fir.array<32xi32>> {dataClause = #acc<data_clause acc_copy>, implicit = true, name = "a"}
+ %10 = acc.private varPtr(%6 : !fir.ref<i32>) recipe(@privatization_ref_i32) -> !fir.ref<i32> {implicit = true, name = "s"}
+ acc.kernels dataOperands(%9 : !fir.ref<!fir.array<32xi32>>) private(%10 : !fir.ref<i32>) {
+ %11 = fir.shape %c32 : (index) -> !fir.shape<1>
+ acc.loop control(%arg2 : index) = (%c1 : index) to (%c32 : index) step (%c1 : index) {
+ %12 = fir.alloca i32 {bindc_name = "i"}
+ %13 = fir.declare %12 {uniq_name = "_QFkprivEi"} : (!fir.ref<i32>) -> !fir.ref<i32>
+ %14 = fir.convert %arg2 : (index) -> i32
+ fir.store %14 to %13 : !fir.ref<i32>
+ %15 = fir.load %10 : !fir.ref<i32>
+ %16 = fir.load %13 : !fir.ref<i32>
+ %17 = fir.convert %16 : (i32) -> i64
+ %18 = fir.array_coor %9(%11) %17 : (!fir.ref<!fir.array<32xi32>>, !fir.shape<1>, i64) -> !fir.ref<i32>
+ fir.store %15 to %18 : !fir.ref<i32>
+ acc.yield
+ } attributes {inclusiveUpperbound = array<i1: true>, independent = [#acc.device_type<none>]}
+ acc.terminator
+ }
+ acc.copyout accPtr(%9 : !fir.ref<!fir.array<32xi32>>) to varPtr(%2 : !fir.ref<!fir.array<32xi32>>) {dataClause = #acc<data_clause acc_copy>, implicit = true, name = "a"}
+ return
+}
diff --git a/flang/test/Transforms/OpenACC/acc-recipe-materialization-parallel.fir b/flang/test/Transforms/OpenACC/acc-recipe-materialization-parallel.fir
new file mode 100644
index 0000000000000..97828fd66d1c6
--- /dev/null
+++ b/flang/test/Transforms/OpenACC/acc-recipe-materialization-parallel.fir
@@ -0,0 +1,50 @@
+// RUN: fir-opt %s -acc-recipe-materialization | FileCheck %s
+
+// Test that the reduction recipes are correctly inlined when attached to a
+// parallel construct without loop. Verify init and combine materialize in the region.
+// CHECK-LABEL: func.func @par_reduction_clause_
+// CHECK: acc.parallel {
+// CHECK: [[PRIVATE:%.*]] = acc.reduction_init {{.*}} <add>
+// CHECK-NEXT: [[ZERO:%.*]] = arith.constant 0.000000e+00 : f64
+// CHECK-NEXT: [[ALLOCA:%.*]] = fir.alloca f64
+// CHECK-NEXT: {{.*}} = fir.declare [[ALLOCA]] {{.*}}acc.reduction.init
+// CHECK-NEXT: fir.store [[ZERO]] to {{.*}}
+// CHECK-NEXT: acc.yield {{.*}}
+// CHECK: } {{.*}}acc.var_name = #acc.var_name<"tmp">
+// CHECK: fir.load [[PRIVATE]]
+// CHECK: fir.store {{.*}} to [[PRIVATE]]
+// CHECK: acc.reduction_combine_region [[PRIVATE]] into [[REDUCVAR:%.*]] :
+// CHECK: [[LOADVAR:%.*]] = fir.load [[REDUCVAR]]
+// CHECK-NEXT: [[LOADPRIV:%.*]] = fir.load [[PRIVATE]]
+// CHECK-NEXT: [[COMBINE:%.*]] = arith.addf [[LOADVAR]], [[LOADPRIV]]
+// CHECK-NEXT: fir.store [[COMBINE]] to [[REDUCVAR]]
+// CHECK: acc.yield
+
+acc.reduction.recipe @reduction_add_ref_f64 : !fir.ref<f64> reduction_operator <add> init {
+^bb0(%arg0: !fir.ref<f64>):
+ %cst = arith.constant 0.000000e+00 : f64
+ %0 = fir.alloca f64
+ %1 = fir.declare %0 {uniq_name = "acc.reduction.init"} : (!fir.ref<f64>) -> !fir.ref<f64>
+ fir.store %cst to %1 : !fir.ref<f64>
+ acc.yield %1 : !fir.ref<f64>
+} combiner {
+^bb0(%arg0: !fir.ref<f64>, %arg1: !fir.ref<f64>):
+ %0 = fir.load %arg0 : !fir.ref<f64>
+ %1 = fir.load %arg1 : !fir.ref<f64>
+ %2 = arith.addf %0, %1 fastmath<contract> : f64
+ fir.store %2 to %arg0 : !fir.ref<f64>
+ acc.yield %arg0 : !fir.ref<f64>
+}
+func.func @par_reduction_clause_(%arg0: !fir.ref<f64> {fir.bindc_name = "tmp"}) attributes {fir.internal_name = "_QPpar_reduction_clause"} {
+ %cst = arith.constant 1.000000e+00 : f64
+ %0 = fir.dummy_scope : !fir.dscope
+ %1 = fir.declare %arg0 dummy_scope %0 {uniq_name = "_QFpar_reduction_clauseEtmp"} : (!fir.ref<f64>, !fir.dscope) -> !fir.ref<f64>
+ %2 = acc.reduction varPtr(%1 : !fir.ref<f64>) recipe(@reduction_add_ref_f64) -> !fir.ref<f64> {name = "tmp"}
+ acc.parallel reduction(%2 : !fir.ref<f64>) {
+ %3 = fir.load %2 : !fir.ref<f64>
+ %4 = arith.addf %3, %cst fastmath<contract> : f64
+ fir.store %4 to %2 : !fir.ref<f64>
+ acc.yield
+ }
+ return
+}
diff --git a/flang/test/Transforms/OpenACC/acc-recipe-materialization-private.fir b/flang/test/Transforms/OpenACC/acc-recipe-materialization-private.fir
new file mode 100644
index 0000000000000..998762afabd4b
--- /dev/null
+++ b/flang/test/Transforms/OpenACC/acc-recipe-materialization-private.fir
@@ -0,0 +1,47 @@
+// RUN: fir-opt %s -acc-recipe-materialization | FileCheck %s
+
+acc.private.recipe @privatization_ref_i64 : !fir.ref<i64> init {
+^bb0(%arg0: !fir.ref<i64>):
+ %0 = fir.alloca i64
+ acc.yield %0 : !fir.ref<i64>
+}
+
+// CHECK-LABEL: func.func @private_i64
+// CHECK: acc.loop control([[IV:%.+]] : i64)
+// CHECK: [[ALLOC:%.+]] = fir.alloca i64
+// CHECK: [[DECL:%.+]] = fir.declare [[ALLOC]] {uniq_name = "_private_arg0"}
+// CHECK: fir.store [[IV]] to [[DECL]]
+
+func.func @private_i64(%arg0 : !fir.ref<i64>) {
+ %c16_i32 = arith.constant 16 : i32
+ %c1_i32 = arith.constant 1 : i32
+ %priv = acc.private varPtr(%arg0 : !fir.ref<i64>) recipe(@privatization_ref_i64) -> !fir.ref<i64> {implicit = true, name = ""}
+ acc.loop private(%priv : !fir.ref<i64>) control(%siv : i64) = (%c1_i32 : i32) to (%c16_i32 : i32) step (%c1_i32 : i32) {
+ %priv_decl = fir.declare %priv {uniq_name = "_private_arg0"} : (!fir.ref<i64>) -> !fir.ref<i64>
+ fir.store %siv to %priv_decl : !fir.ref<i64>
+ acc.yield
+ } attributes {independent = [#acc.device_type<none>]}
+ return
+}
+
+// CHECK-LABEL: func.func @par_private_i64
+// CHECK: acc.parallel {
+// CHECK: [[ALLOC:%.+]] = fir.alloca i64
+// CHECK: [[DECL:%.+]] = fir.declare [[ALLOC]] {uniq_name = "_private_arg0"}
+// CHECK: acc.loop control([[IV:%.+]] : i64)
+// CHECK: fir.store [[IV]] to [[DECL]]
+
+func.func @par_private_i64(%arg0 : !fir.ref<i64>) {
+ %c16_i32 = arith.constant 16 : i32
+ %c1_i32 = arith.constant 1 : i32
+ %priv = acc.private varPtr(%arg0 : !fir.ref<i64>) recipe(@privatization_ref_i64) -> !fir.ref<i64> {implicit = true, name = ""}
+ acc.parallel private(%priv : !fir.ref<i64>) {
+ %priv_decl = fir.declare %priv {uniq_name = "_private_arg0"} : (!fir.ref<i64>) -> !fir.ref<i64>
+ acc.loop control(%siv : i64) = (%c1_i32 : i32) to (%c16_i32 : i32) step (%c1_i32 : i32) {
+ fir.store %siv to %priv_decl : !fir.ref<i64>
+ acc.yield
+ } attributes {independent = [#acc.device_type<none>]}
+ acc.yield
+ }
+ return
+}
diff --git a/flang/test/Transforms/OpenACC/acc-recipe-materialization-reduction.fir b/flang/test/Transforms/OpenACC/acc-recipe-materialization-reduction.fir
new file mode 100644
index 0000000000000..a209fab123eff
--- /dev/null
+++ b/flang/test/Transforms/OpenACC/acc-recipe-materialization-reduction.fir
@@ -0,0 +1,50 @@
+// RUN: fir-opt %s -acc-recipe-materialization | FileCheck %s
+
+// Verify that the reduction init and combine recipes attached to compute
+// ops materialize within the region
+// CHECK-LABEL: func.func @par_reduction_clause_
+// CHECK: acc.parallel {
+// CHECK: [[PRIVATE:%.*]] = acc.reduction_init {{.*}} <add>
+// CHECK-NEXT: [[ZERO:%.*]] = arith.constant 0.000000e+00 : f64
+// CHECK-NEXT: [[ALLOCA:%.*]] = fir.alloca f64
+// CHECK-NEXT: {{.*}} = fir.declare [[ALLOCA]] {{.*}}acc.reduction.init
+// CHECK-NEXT: fir.store [[ZERO]] to {{.*}}
+// CHECK-NEXT: acc.yield {{.*}}
+// CHECK: } {{.*}}acc.var_name = #acc.var_name<"tmp">
...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/184252