[flang-commits] [flang] [flang][OpenMP] Rewrite `omp.loop` to semantically equivalent ops (PR #115443)

Tue Nov 26 07:48:35 PST 2024

================
@@ -0,0 +1,57 @@
+// RUN: fir-opt --omp-generic-loop-conversion %s | FileCheck %s
+
+omp.private {type = private} @_QFtarget_teams_loopEi_private_ref_i32 : !fir.ref<i32> alloc {
+^bb0(%arg0: !fir.ref<i32>):
+  omp.yield(%arg0 : !fir.ref<i32>)
+}
+
+func.func @_QPtarget_teams_loop() {
+  %i = fir.alloca i32
+  %i_map = omp.map.info var_ptr(%i : !fir.ref<i32>, i32) map_clauses(implicit, exit_release_or_enter_alloc) capture(ByCopy) -> !fir.ref<i32> {name = "i"}
+  omp.target map_entries(%i_map -> %arg0 : !fir.ref<i32>) {
+    omp.teams {
+      %c0 = arith.constant 0 : i32
+      %c10 = arith.constant 10 : i32
+      %c1 = arith.constant 1 : i32
+      omp.loop private(@_QFtarget_teams_loopEi_private_ref_i32 %arg0 -> %arg2 : !fir.ref<i32>) {
+        omp.loop_nest (%arg3) : i32 = (%c0) to (%c10) inclusive step (%c1) {
+          fir.store %arg3 to %arg2 : !fir.ref<i32>
+          omp.yield
+        }
+      }
+      omp.terminator
+    }
+    omp.terminator
+  }
+  return
+}
+
+// CHECK-LABEL: func.func @_QPtarget_teams_loop
+// CHECK:         omp.target map_entries(
+// CHECK-SAME:      %{{.*}} -> %[[I_ARG:[^[:space:]]+]] : {{.*}}) {
+// 
+// CHECK:           omp.teams {
+// 
+// TODO we probably need to move the `loop_nest` bounds ops from the `teams`
+// region to the `parallel` region to avoid making these values `shared`. We can
+// find the backward slices of these bounds that are within the `teams` region
+// and move these slices to the `parallel` op.
----------------
skatrak wrote:

I wouldn't say we don't have to worry about it, but rather that some updates are needed in other places. We'd have to enforce host evaluation of loop bounds for `target teams loop` (or equivalent construct nests) in the `omp.target` verifier and update Flang lowering accordingly. After that, it seems to me that this pass should work unmodified for the tests in this PR.

However, what's expected to happen in this case (or equivalent Fortran)?
```c++
#pragma omp target teams
{
  #pragma omp loop
  for (...) { ... }

  #pragma omp loop
  for (...) { ... }
}
```

In that case, loop bounds wouldn't be host-evaluated if this pass introduces two `distribute parallel for` nests, resulting in something similar to what this pass produces right now. If sharing these values causes issues, perhaps we would need to introduce a `firstprivate` clause into the created `omp.parallel` ops.
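
To make the question concrete, here is a rough source-level sketch of what the rewritten form might look like, assuming each `loop` becomes a `distribute parallel for` nest and the bound is passed as `firstprivate`. The names `two_loops`, `N`, and `ub`, as well as the loop bodies, are made up for illustration; the pass itself works on `omp.loop` ops in MLIR, not on C++ source.

```c++
constexpr int N = 16;  // hypothetical trip count, for illustration only

// Hypothetical helper: two loops sharing one `target teams` region.
// `a` and `b` must each point to at least N ints.
void two_loops(int *a, int *b) {
  #pragma omp target teams map(from: a[0:N], b[0:N])
  {
    // The bound is computed inside the teams region, so it cannot be
    // host-evaluated. Without the clause below it would be shared in the
    // nested parallel regions.
    int ub = N;

    // Assumed rewrite of the first `#pragma omp loop`.
    #pragma omp distribute parallel for firstprivate(ub)
    for (int i = 0; i < ub; ++i)
      a[i] = i;

    // Assumed rewrite of the second `#pragma omp loop`.
    #pragma omp distribute parallel for firstprivate(ub)
    for (int i = 0; i < ub; ++i)
      b[i] = 2 * i;
  }
}
```

With `firstprivate(ub)`, each thread gets its own initialized copy of the bound rather than a reference to a shared value. Whether that is actually needed depends on whether the shared values cause problems in practice.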

https://github.com/llvm/llvm-project/pull/115443