[flang-commits] [clang] [flang] [flang][OpenMP] Upstream `do concurrent` loop-nest detection. (PR #127595)

Sergio Afonso via flang-commits flang-commits at lists.llvm.org
Thu Feb 20 04:47:26 PST 2025


================
@@ -53,6 +53,79 @@ that:
 * It has been tested in a very limited way so far.
 * It has been tested mostly on simple synthetic inputs.
 
+### Loop nest detection
+
+On the `FIR` dialect level, the following loop:
+```fortran
+  do concurrent(i=1:n, j=1:m, k=1:o)
+    a(i,j,k) = i + j + k
+  end do
+```
+is modelled as a nest of `fir.do_loop` ops such that an outer loop's region
+contains **only** the following:
+  1. The operations needed to assign/update the outer loop's induction variable.
+  1. The inner loop itself.
+
+So the MLIR structure for the above example looks similar to the following:
+```
+  fir.do_loop %i_idx = %34 to %36 step %c1 unordered {
+    %i_idx_2 = fir.convert %i_idx : (index) -> i32
+    fir.store %i_idx_2 to %i_iv#1 : !fir.ref<i32>
+
+    fir.do_loop %j_idx = %37 to %39 step %c1_3 unordered {
+      %j_idx_2 = fir.convert %j_idx : (index) -> i32
+      fir.store %j_idx_2 to %j_iv#1 : !fir.ref<i32>
+
+      fir.do_loop %k_idx = %40 to %42 step %c1_5 unordered {
+        %k_idx_2 = fir.convert %k_idx : (index) -> i32
+        fir.store %k_idx_2 to %k_iv#1 : !fir.ref<i32>
+
+        ... loop nest body goes here ...
+      }
+    }
+  }
+```
+This applies to multi-range loops in general; they are represented in the IR as
+a nest of `fir.do_loop` ops with the above nesting structure.
+
+Therefore, the pass detects such "perfectly" nested loop ops to identify multi-range
+loops and map them as "collapsed" loops in OpenMP.
+
+#### Further info regarding loop nest detection
+
+Loop nest detection is currently limited to the scenario described in the previous
+section. This can be extended in the future to cover more cases. For example, in
+the following loop nest, even though both loops are perfectly nested, only the
+outer loop is currently parallelized:
+```fortran
+do concurrent(i=1:n)
+  do concurrent(j=1:m)
+    a(i,j) = i * j
+  end do
+end do
+```
+
+Similarly, for the following loop nest, even though the intervening statement `x = 41`
+does not have any memory effects that would affect parallelization, this nest is
+not parallelized as well (only the outer loop is).
----------------
skatrak wrote:

```suggestion
not parallelized either (only the outer loop is).
```
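
For reference, the "perfect nesting" rule described in the quoted documentation can be sketched with a toy model. This is only an illustration under stated assumptions: `Op`, `IV_UPDATE_OPS`, `loop`, and `perfect_nest_depth` are hypothetical names, not part of the flang or MLIR APIs, and the model ignores region terminators and operand checks that a real pass would have to handle.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Toy stand-in for an MLIR operation (hypothetical, not the MLIR API)."""
    name: str
    body: List["Op"] = field(default_factory=list)


# Assumption: these are the only ops used to update a loop's induction variable,
# mirroring the fir.convert + fir.store pairs in the quoted MLIR example.
IV_UPDATE_OPS = {"fir.convert", "fir.store"}


def loop(*body: Op) -> Op:
    """Build a toy fir.do_loop whose region holds the given ops."""
    return Op("fir.do_loop", list(body))


def perfect_nest_depth(outer: Op) -> int:
    """Count how many loops, starting at `outer`, form a perfect nest.

    A loop extends the nest only if its body is IV-update ops followed by
    exactly one inner fir.do_loop as the last op.
    """
    depth = 1
    current = outer
    while (
        current.body
        and current.body[-1].name == "fir.do_loop"
        and all(op.name in IV_UPDATE_OPS for op in current.body[:-1])
    ):
        current = current.body[-1]
        depth += 1
    return depth


def iv_update() -> List[Op]:
    """The convert + store pair that updates an induction variable."""
    return [Op("fir.convert"), Op("fir.store")]


# do concurrent(i=1:n, j=1:m, k=1:o): three perfectly nested loops.
k_loop = loop(*iv_update(), Op("arith.addi"), Op("hlfir.assign"))
j_loop = loop(*iv_update(), k_loop)
i_loop = loop(*iv_update(), j_loop)
print(perfect_nest_depth(i_loop))
```

In this model, an intervening statement such as `x = 41` between the outer and inner loops appears as an extra op in the outer body, so the nest depth stays at 1 and only the outer loop would be a candidate for parallelization.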

https://github.com/llvm/llvm-project/pull/127595

