[Openmp-commits] [clang] [llvm] [mlir] [openmp] [LoopTiling][Clang][MLIR] Canonical Intra-tile Loops (PR #191114)
Michael Kruse via Openmp-commits
openmp-commits at lists.llvm.org
Wed Apr 22 08:41:12 PDT 2026
https://github.com/Meinersbur commented:
This PR applies the PredicatedNeeded independently of whether the intra-tile loop actually needs to be a canonical loop. Being canonical is required to make e.g. the following work:
```
#pragma omp for collapse(2)
#pragma omp tile sizes(2)
for (int i = 0; i < n; ++i)
```
but it is not required for e.g.
```
#pragma omp for collapse(1)
#pragma omp tile sizes(2)
for (int i = 0; i < n; ++i)
```
The `min(.floor.iv + DimTileSize, NumIterations)` solution would generally be preferable if the loop does not need to be canonical.
Could you have a look whether
1. LLVM optimizes both to the same code anyway. Look at the LoopBoundSplit pass, but it should not even be necessary: ScalarEvolution should be able to derive the BackedgeTakenCount as if we had computed it explicitly with the min-expression. Possibly some nsw flags help as well.
2. you can derive whether the loop needs to be canonical. This could be a crude heuristic, such as whether there is any other directive applied to the `getTransformed()`, regardless of how deep it goes.
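
For reference, the two intra-tile loop shapes being compared can be sketched in plain C roughly as follows. This is only an illustration, not the code Clang emits: the function names `tile_min` and `tile_predicated` are hypothetical, `floor_iv` stands in for `.floor.iv`, `tile` for `DimTileSize`, `n` for `NumIterations`, and the predicated form is assumed to guard the body with an `i < n` check.

```c
/* Hypothetical helpers for illustration; neither is taken from the PR.
   Both assume tile > 0 and that floor_iv + tile does not overflow int. */

/* Min-bounded form: the intra-tile upper bound is clamped to n, so the
   trip count of the inner loop depends on floor_iv. */
static long tile_min(int n, int tile) {
  long sum = 0;
  for (int floor_iv = 0; floor_iv < n; floor_iv += tile) {
    /* min(floor_iv + tile, n) */
    int end = floor_iv + tile < n ? floor_iv + tile : n;
    for (int i = floor_iv; i < end; ++i)
      sum += i;
  }
  return sum;
}

/* Predicated form: the intra-tile loop always runs `tile` iterations,
   which keeps the floor/tile nest rectangular; iterations that fall
   past n are skipped by the guard. */
static long tile_predicated(int n, int tile) {
  long sum = 0;
  for (int floor_iv = 0; floor_iv < n; floor_iv += tile) {
    for (int t = 0; t < tile; ++t) {
      int i = floor_iv + t;
      if (i < n) /* assumed shape of the predicate */
        sum += i;
    }
  }
  return sum;
}
```

Both functions visit the same values of `i`, so compiling each at `-O2` and comparing the resulting IR would be one way to approach the first question.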
https://github.com/llvm/llvm-project/pull/191114