[flang-commits] [flang] [flang] Simplify hlfir.sum total reductions. (PR #119482)
Tom Eccles via flang-commits
flang-commits at lists.llvm.org
Fri Dec 13 09:50:54 PST 2024
https://github.com/tblah approved this pull request.
This looks great to me. I don't want to hold this up on OpenMP concerns.
I can roughly imagine how this could be implemented for OpenMP (not for this patch). I think ultimately we will need the genBody callback to operate on scalars so that we can generate the right reduction clause implementation (or, alternatively, have an argument saying what the OpenMP intrinsic reduction kind is and then use the lowering code to generate the `omp.declare_reduction`).
I'm imagining something like
```
omp.declare_reduction @something : type init {
^bb0(%allocatedThreadPrivateVar: type):
  %init_val = // The arith.constant passed as the init value for the reduction variable
  omp.yield(%init_val)
} combiner {
^bb0(%lhs: type, %rhs: type):
  // generated by the genBody callback. e.g. for type == i32
  %res = arith.addi %lhs, %rhs : i32
  omp.yield(%res)
}
func.func [...] {
  %fortran_variable:2 = hlfir.declare [...]
  omp.wsloop reduction(@something %fortran_variable#0 -> %arg0 : !fir.ref<type>) {
    omp.loop_nest /*indices are %arg1...%argn*/ {
      %privatized_variable = hlfir.declare %arg0 [...]
      // generated by genLoopWithReductions:
      %rhs = hlfir.designate [...]
      // generated by genBody:
      %res = operation %lhs %rhs : type
      omp.yield
    }
  }
  // The result has been stored to %fortran_variable
}
```
This would lose some generality in the indexing. Maybe there should be two callbacks: one for indexing and one for the "combiner" op. IMO this can be done in a later patch.
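To make the two-callback idea a bit more concrete, here is a rough standalone C++ sketch. It is not flang code and does not use any MLIR or HLFIR APIs; `reduceWithCallbacks`, `genElement`, `genCombiner` and `sumExample` are hypothetical names chosen for illustration only. The point is just that the indexing callback produces the element for a given loop index, while the combiner only ever sees two scalars, which is the shape an `omp.declare_reduction` combiner region would need.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (an assumption, not part of flang): a total reduction
// split into an "indexing" callback and a scalar "combiner" callback.
template <typename T>
T reduceWithCallbacks(std::size_t extent, T init,
                      const std::function<T(std::size_t)> &genElement,
                      const std::function<T(T, T)> &genCombiner) {
  T acc = init;
  for (std::size_t i = 0; i < extent; ++i) {
    // Indexing callback: how to fetch the i-th element.
    T rhs = genElement(i);
    // Combiner callback: operates on scalars only.
    acc = genCombiner(acc, rhs);
  }
  return acc;
}

// Example: a SUM-like reduction over 32-bit integers.
inline int sumExample(const std::vector<int> &data) {
  return reduceWithCallbacks<int>(
      data.size(), /*init=*/0,
      [&](std::size_t i) { return data[i]; },
      [](int lhs, int rhs) { return lhs + rhs; });
}
```

Because the combiner here has no knowledge of the indexing, the same callback could in principle be reused both for the sequential loop body and for a separately generated reduction declaration.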
https://github.com/llvm/llvm-project/pull/119482