[Openmp-commits] [clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293)

Roger Ferrer Ibáñez via Openmp-commits openmp-commits at lists.llvm.org
Fri Aug 8 09:05:25 PDT 2025


================
@@ -508,6 +512,43 @@ OMPInterchangeDirective::CreateEmpty(const ASTContext &C, unsigned NumClauses,
       SourceLocation(), SourceLocation(), NumLoops);
 }
 
+OMPFuseDirective *OMPFuseDirective::Create(
+    const ASTContext &C, SourceLocation StartLoc, SourceLocation EndLoc,
+    ArrayRef<OMPClause *> Clauses, unsigned NumLoops, unsigned NumLoopNests,
+    Stmt *AssociatedStmt, Stmt *TransformedStmt, Stmt *PreInits) {
+
+  OMPFuseDirective *Dir = createDirective<OMPFuseDirective>(
+      C, Clauses, AssociatedStmt, TransformedStmtOffset + 1, StartLoc, EndLoc,
+      NumLoops);
+  Dir->setTransformedStmt(TransformedStmt);
+  Dir->setPreInits(PreInits);
+  // The number of top level canonical nests could
+  // not match the total number of generated loops
+  // Example:
+  // Before fusion:
+  //   for (int i = 0; i < N; ++i)
+  //     for (int j = 0; j < M; ++j)
+  //       A[i][j] = i + j;
+  //
+  //   for (int k = 0; k < P; ++k)
+  //     B[k] = k * 2;
+  // Here, NumLoopNests = 2, but NumLoops = 3.
----------------
rofirrim wrote:

This is an interesting observation. 

Looking into it further, the root of the problem seems to be that `OMPLoopTransformationDirective` inherits from `OMPLoopBasedDirective`, which has a `NumAssociatedLoops` member. The transformation directives usually feed `NumLoops` into it: `tile` and `stripe` forward `NumLoops`, `unroll` always passes 1 (which is sensible), and `reverse` passes `NumLoops`, which seems unexpected (I'd expect it to pass 1).

I tested internally hardcoding `NumLoops` to 0 for `OMPFuseDirective` (along with asserts for this invariant) and it seems to work fine, since the analysis does not effectively use that value.
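
For illustration, the shape of that experiment could look roughly like the sketch below. It is not the actual patch: `LoopBasedDirectiveSketch` and `FuseDirectiveSketch` are hypothetical stand-ins for the real Clang classes, and they model only the loop-count bookkeeping under the assumption that the base class stores a single associated-loop count.

```cpp
#include <cassert>

// Hypothetical stand-in for OMPLoopBasedDirective: it models only the
// associated-loop count that the base class stores.
class LoopBasedDirectiveSketch {
  unsigned NumAssociatedLoops;

public:
  explicit LoopBasedDirectiveSketch(unsigned NumLoops)
      : NumAssociatedLoops(NumLoops) {}
  unsigned getLoopsNumber() const { return NumAssociatedLoops; }
};

// Hypothetical stand-in for OMPFuseDirective: the base-class count is
// hardcoded to 0, and the number of top-level loop nests is kept separately.
class FuseDirectiveSketch : public LoopBasedDirectiveSketch {
  unsigned NumLoopNests;

public:
  explicit FuseDirectiveSketch(unsigned NumLoopNests)
      : LoopBasedDirectiveSketch(/*NumLoops=*/0), NumLoopNests(NumLoopNests) {
    // Invariant from the experiment: fuse carries no associated-loop count.
    assert(getLoopsNumber() == 0 && "fuse must not carry NumLoops");
  }
  unsigned getNumLoopNests() const { return NumLoopNests; }
};
```

With this layout, any code that still reads the base-class count for a fuse directive sees 0, which makes accidental reliance on it easier to spot.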

I don't think we should see `omp fuse` as a loop-based directive (in the sense that it is applied to a single loop or nest of loops), but `omp fuse` is the only loop transformation like this (for now).
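
To make the difference between the two counts concrete, here is a small self-contained sketch that reproduces the numbers from the code comment in the hunk above (`NumLoopNests = 2`, `NumLoops = 3`). The `LoopSketch` tree and both counting helpers are hypothetical and unrelated to the Clang AST; the recursive member needs C++17 or a standard library that accepts `std::vector` of an incomplete type.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model of a loop: it only records the loops nested in it.
struct LoopSketch {
  std::vector<LoopSketch> Inner;
};

// Counts this loop plus every loop nested inside it.
unsigned countLoops(const LoopSketch &L) {
  unsigned N = 1;
  for (const LoopSketch &Child : L.Inner)
    N += countLoops(Child);
  return N;
}

struct LoopCounts {
  unsigned NumLoopNests; // top-level nests in the loop sequence
  unsigned NumLoops;     // all loops, including nested ones
};

// Walks a sequence of top-level loop nests, as `omp fuse` would see it.
LoopCounts countSequence(const std::vector<LoopSketch> &Seq) {
  LoopCounts C{0, 0};
  for (const LoopSketch &Nest : Seq) {
    ++C.NumLoopNests;
    C.NumLoops += countLoops(Nest);
  }
  return C;
}
```

For the example in the hunk, the sequence holds an `i` loop containing a `j` loop, followed by a standalone `k` loop, so the helper counts two nests and three loops.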

We could restructure the hierarchy, but at first glance that seems like overkill. Thoughts?



https://github.com/llvm/llvm-project/pull/139293

