[flang-commits] [flang] [WIP][flang][OpenMP] Experimental pass to map `do concurrent` to OMP (PR #77285)
Kareem Ergawy via flang-commits
flang-commits at lists.llvm.org
Tue Jan 9 01:23:16 PST 2024
ergawy wrote:
Thanks @kiranchandramohan for looking at this!
> Nice start.
>
> In case you have missed it, there are some recommendations for `do concurrent` in https://github.com/llvm/llvm-project/blob/main/flang/docs/DoConcurrent.md.
>
> I guess the things to consider are:
>
> 1. Is any additional analysis necessary before this conversion?
I also have the feeling that some analyses will be needed, but I am not sure yet what exactly. Maybe a starting point would be to verify that input loops respect the constraints listed by the spec, for example **C1121-C1137**. I know some of these constraints are already enforced by the front-end before lowering, so we will have to go through them one by one and decide the best place to check each. There is also section **11.1.7.5 Additional semantics for DO CONCURRENT constructs** in the spec, which I think can be investigated for further analyses before we apply the mapping to OMP. Do you have any particular analyses in mind beyond this?
> 2. How to map locality constraints, i believe they are currently handled in lowering. This will need to be delayed.
Indeed this is currently handled in lowering. Below is an example without `local_init` and then the same loop with `local_init`:
##### Without `local_init` this is the loop directly after lowering:
```
// do concurrent (integer :: j=i:10)
// end do
fir.do_loop %arg0 = %8 to %9 step %c1 unordered {
%10 = fir.convert %arg0 : (index) -> i32
fir.store %10 to %1#1 : !fir.ref<i32>
}
```
##### With `local_init` this is the loop directly after lowering:
```
// do concurrent (integer :: j=i:10) local_init(i)
// end do
fir.do_loop %arg0 = %8 to %9 step %c1 unordered {
%10 = fir.convert %arg0 : (index) -> i32
fir.store %10 to %1#1 : !fir.ref<i32>
%11 = fir.alloca i32 {bindc_name = "i", pinned, uniq_name = "_QFEi"}
%12:2 = hlfir.declare %11 {uniq_name = "_QFEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
%13 = fir.load %6#0 : !fir.ref<i32>
hlfir.assign %13 to %12#0 : i32, !fir.ref<i32>
}
```
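For intuition only, here is a rough C sketch of what the `local_init` loop above might correspond to on the OpenMP side, assuming `local_init(i)` is modeled like `firstprivate(i)`. That analogy is an assumption for illustration, not something this PR implements, and `run_do_concurrent` is a hypothetical name. Note that OpenMP privatizes per thread while `local_init` is per iteration, so the body here only reads `i` to keep the two equivalent.

```c
/* Hypothetical helper: runs iterations lo..hi concurrently and stores
 * i + j for each iteration j into out[j - lo]. */
void run_do_concurrent(int lo, int hi, int i, int *out) {
    /* firstprivate(i): each thread gets its own copy of i, initialized
     * from the enclosing value. This is the assumed analogue of
     * local_init(i). Without OpenMP enabled the pragma is ignored and
     * the loop simply runs serially with the same result. */
    #pragma omp parallel for firstprivate(i)
    for (int j = lo; j <= hi; ++j) {
        /* The body only reads the private copy, so a per-thread copy and
         * a per-iteration copy behave the same here. */
        out[j - lo] = i + j;
    }
}
```

If the body wrote to `i`, a per-thread copy could leak a value from one iteration into the next on the same thread, which is one of the differences a real mapping would need to handle.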
##### So, one possibility I think would be:
- Extract the [omp.map_info op](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td#L1147) to some location shared between the `fir` and `omp` dialects.
- Use `map_info` to model locality constraints for `do concurrent` as well.
Do you see any blockers for such an approach?
> 3. Ensure this is only enabled with driver flags.
Will do! For now, I suggest that it just be a standalone pass until we flesh it out a bit more. Please let me know if you disagree.
____
In addition to what you mentioned, I think point number 4 would be how to model reductions. That's for the future, though, after the above points are addressed.
___
With that in mind, do you mind if we review and merge the current PR as it is and defer further development to later PRs? I think doing all this in one PR won't be ideal.
https://github.com/llvm/llvm-project/pull/77285