[clang] [flang] [llvm] [openmp] [Clang][OpenMP][LoopTransformations] Add support for "#pragma omp fuse" loop transformation directive and "looprange" clause (PR #139293)
Roger Ferrer Ibáñez via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 15 09:16:04 PDT 2025
rofirrim wrote:
I'm a bit uncertain about what we want to do with `NumGeneratedLoopNests` and `NumGeneratedLoops`.
I understand that, outside of dependent contexts, these are a sort of synthesised attribute (in the base case computed by analysing the loop nests / canonical loop sequences) that an enclosing loop transformation can use to check that it is still valid.
I wonder whether an alternative approach would be a list of integers, one per top-level loop, each giving the depth of the canonical loop nest rooted there. For lack of a better name, let's call this the GeneratedLoopSequence (`gls` in the examples; read the examples bottom-up).
```cpp
// after unroll gls = [], because it is not partial and there may not be a loop anymore
#pragma omp unroll
// after fuse gls = [ 1 ]
#pragma omp fuse
// from syntax gls = [ 1, 1 ]
{
for (...) { }
for (...) { }
}
```
```cpp
// after fuse gls = [ 6, 1 ]
#pragma omp fuse looprange(2, 2)
// from syntax gls = [ 6, 1, 1 ]
{
// after tile gls = [ 6 ]
#pragma omp tile sizes(x, y, z)
// from syntax gls = [ 3 ]
for (...) {
for (...) {
for (...) {
}
}
}
// from syntax gls = [ 1 ]
for (...) { }
// from syntax gls = [ 1 ]
for (...) { }
}
```
```cpp
// after split gls = [ 1, 1]
#pragma omp split counts(a, b)
// from syntax, gls = [ 1 ]
for (...) { }
```
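The three examples above can be read as transfer functions on the list. Below is a minimal sketch of that reading, not the PR's implementation: the alias and the function names (`fuseRange`, `fuseAll`, `tile`, `unrollFull`, `split`) are hypothetical, and it assumes that a fused loop and each piece of a split contribute depth 1, and that `tile` with `k` sizes yields a nest of `2 * k` loops, as in the examples.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical alias: one entry per top-level loop, holding its nest depth.
using GeneratedLoopSequence = std::vector<unsigned>;

// fuse looprange(first, count): replace `count` entries starting at the
// 1-based position `first` with one fused loop (assumed depth 1).
GeneratedLoopSequence fuseRange(GeneratedLoopSequence gls, std::size_t first,
                                std::size_t count) {
  assert(first >= 1 && count >= 1 && first - 1 + count <= gls.size());
  const auto pos = static_cast<std::ptrdiff_t>(first - 1);
  const auto len = static_cast<std::ptrdiff_t>(count);
  gls.erase(gls.begin() + pos, gls.begin() + pos + len);
  gls.insert(gls.begin() + pos, 1u);
  return gls;
}

// fuse without looprange: fuse the whole sequence into a single loop.
GeneratedLoopSequence fuseAll(const GeneratedLoopSequence &gls) {
  return fuseRange(gls, 1, gls.size());
}

// tile sizes(s1, ..., sk): needs a single nest of depth >= k and is assumed
// to produce one nest of 2 * k loops (floor loops plus tile loops).
GeneratedLoopSequence tile(const GeneratedLoopSequence &gls, unsigned numSizes) {
  assert(gls.size() == 1 && gls.front() >= numSizes);
  return {2 * numSizes};
}

// Non-partial unroll: there may be no loop left, so the result is empty.
GeneratedLoopSequence unrollFull(const GeneratedLoopSequence &) { return {}; }

// split counts(c1, ..., cn): one loop becomes n loops (assumed depth 1 each).
GeneratedLoopSequence split(const GeneratedLoopSequence &gls,
                            std::size_t numCounts) {
  assert(gls.size() == 1);
  return GeneratedLoopSequence(numCounts, 1u);
}
```

Under these assumptions, `fuseRange({6, 1, 1}, 2, 2)` reproduces the `[ 6, 1 ]` of the second example, and `split({1}, 2)` the `[ 1, 1 ]` of the third.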
(For dependent contexts I was thinking of making the GeneratedLoopSequence a `std::optional`, so that it is explicitly absent and can be told apart from `[]`.)
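A minimal sketch of that distinction, assuming the sequence is stored as a `std::optional` over a vector of depths. The names `LoopDepths`, `SequenceState` and `classify` are hypothetical and only illustrate the three states a consumer would have to tell apart.

```cpp
#include <cassert>
#include <optional>
#include <vector>

using LoopDepths = std::vector<unsigned>;
// std::nullopt: dependent context, the sequence is not known yet.
// Engaged but empty: analysis ran and no loop remains (e.g. full unroll).
using GeneratedLoopSequence = std::optional<LoopDepths>;

enum class SequenceState { Unknown, NoLoops, HasLoops };

// Classify a sequence so that "absent" and "[]" are never confused.
SequenceState classify(const GeneratedLoopSequence &gls) {
  if (!gls)
    return SequenceState::Unknown;
  return gls->empty() ? SequenceState::NoLoops : SequenceState::HasLoops;
}
```

An enclosing transformation would presumably defer its checks on `Unknown` and diagnose on `NoLoops`.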
But I wonder whether this approach is enough. I was considering the `apply` clause, for when we get to implement it. Maybe a list of integers is not enough?
```cpp
// after apply(unroll) gls = []
// after split gls = [ 1, 1 ]
#pragma omp split counts(a, b) apply(unroll)
// from syntax, gls = [ 1 ]
for (...) { }
```
```cpp
// after apply(unroll(2)), non-partial unroll the second loop, gls = [1, ?not a loop anymore? ]
// after split gls = [ 1, 1 ]
#pragma omp split counts(a, b) apply(unroll(2))
// from syntax, gls = [ 1 ]
for (...) { }
```
```cpp
// after apply(split(2) counts(c, d)), gls = [1, [1, 1] ] (?)
// after split gls = [ 1, 1 ]
#pragma omp split counts(a, b) apply(split(2) counts(c, d))
// from syntax, gls = [ 1 ]
for (...) { }
```
```cpp
// after apply(split counts(c, d)), gls = [[1, 1], [1, 1]] (???)
// after split gls = [ 1, 1 ]
#pragma omp split counts(a, b) apply(split counts(c, d))
// from syntax, gls = [ 1 ]
for (...) { }
```
Maybe there is no need to recursively represent all the nested transformations?
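One way to explore that question is to model the nested results as a small tree and see what flattening loses. The sketch below is purely illustrative: `GlsNode` and `flatten` are hypothetical names, and treating a fully unrolled position as an empty sub-sequence is an assumption, not something decided above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical tree node: either a loop nest of a given depth (leaf) or the
// sequence produced by a nested transformation at that position.
struct GlsNode {
  bool isSequence = false;
  unsigned depth = 0;            // meaningful only for leaves
  std::vector<GlsNode> children; // meaningful only for sequences

  static GlsNode leaf(unsigned d) {
    GlsNode n;
    n.depth = d;
    return n;
  }
  static GlsNode sequence(std::vector<GlsNode> c) {
    GlsNode n;
    n.isSequence = true;
    n.children = std::move(c);
    return n;
  }
};

// Append the depths of all leaves under `node`, in order, to `out`.
void flattenInto(const GlsNode &node, std::vector<unsigned> &out) {
  if (!node.isSequence) {
    out.push_back(node.depth);
    return;
  }
  for (const GlsNode &child : node.children)
    flattenInto(child, out);
}

// Collapse a nested result into a plain list of depths.
std::vector<unsigned> flatten(const GlsNode &root) {
  std::vector<unsigned> out;
  flattenInto(root, out);
  return out;
}
```

With this reading, `[[1, 1], [1, 1]]` flattens to `[1, 1, 1, 1]`, so an enclosing transformation would only ever see a flat list; what gets lost is which original position each loop came from.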
Other examples, from OpenMP, seem OK:
```cpp
void span_apply(double A[128][128])
{
// this is not a loop transformation but this is fine because gls is a singleton
// and collapse is 2 ≤ 4
#pragma omp for collapse(2)
// from apply(grid: reverse, interchange) (this affects the first two loops) gls = [ 4 ]
// from tile gls = [ 4 ]
#pragma omp tile sizes(16,16) apply(grid: interchange,reverse)
// from syntax gls = [ 2 ]
for (int i = 0; i < 128; ++i)
for (int j = 0; j < 128; ++j)
A[i][j] = A[i][j] + 1;
}
```
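The check described in the comments of `span_apply` (the sequence is a singleton and the `collapse` value does not exceed its depth) could be sketched as follows. `CollapseCheck` and `checkCollapse` are hypothetical names, and deferring the check when the sequence is absent is an assumption about how dependent contexts would be handled.

```cpp
#include <cassert>
#include <optional>
#include <vector>

enum class CollapseCheck { Valid, Invalid, Deferred };

// Check an enclosing `collapse(n)` against the generated loop sequence.
CollapseCheck checkCollapse(const std::optional<std::vector<unsigned>> &gls,
                            unsigned collapse) {
  if (!gls)
    return CollapseCheck::Deferred; // dependent context: check later
  if (gls->size() != 1)
    return CollapseCheck::Invalid;  // needs exactly one loop nest
  return collapse <= gls->front() ? CollapseCheck::Valid
                                  : CollapseCheck::Invalid;
}
```

For `span_apply`, the sequence after `tile` is `[ 4 ]` and `collapse(2)` passes because 2 ≤ 4.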
```cpp
void nested_apply(double A[100])
{
// after apply(reverse), gls = [ 2 ]
// after apply(intratile: unroll partial(2)), gls = [ 2 ]
// after tile: gls = [ 2 ]
#pragma omp tile sizes(10) apply(intratile: unroll partial(2) apply(reverse))
// from syntax, gls = [ 1 ]
for (int i = 0; i < 100; ++i)
A[i] = A[i] + 1;
}
```
Thoughts?
https://github.com/llvm/llvm-project/pull/139293