[flang-commits] [flang] [WIP][flang][OpenMP] Experimental pass to map `do concurrent` to OMP (PR #77285)

Kareem Ergawy via flang-commits flang-commits at lists.llvm.org
Tue Jan 9 01:23:16 PST 2024


ergawy wrote:

Thanks @kiranchandramohan for looking at this!

> Nice start.
> 
> In case you have missed it, there are some recommendations for `do concurrent` in https://github.com/llvm/llvm-project/blob/main/flang/docs/DoConcurrent.md.
> 
> I guess the things to consider are:
> 
>     1. Is any additional analysis necessary before this conversion?

I also have the feeling that some analyses will be needed, but I am not sure yet exactly which ones. A starting point might be to verify that input loops respect the constraints listed by the spec, for example **C1121-C1137**. I know some of these constraints are already enforced by the front-end before lowering, so we will have to go through them one by one and decide on the best place to check each. There is also section **11.1.7.5 Additional semantics for DO CONCURRENT constructs** in the spec, which I think can be investigated for further analyses before we apply the mapping to OMP. Do you have any particular analyses in mind beyond these?

>     2. How to map locality constraints, i believe they are currently handled in lowering. This will need to be delayed.

Indeed this is currently handled in lowering. Below is an example without `local_init` and then the same loop with `local_init`:

##### Without `local_init` this is the loop directly after lowering:
```
    // do concurrent (integer :: j=i:10)
    // end do
    fir.do_loop %arg0 = %8 to %9 step %c1 unordered {
      %10 = fir.convert %arg0 : (index) -> i32
      fir.store %10 to %1#1 : !fir.ref<i32>
    }
```

##### With `local_init` this is the loop directly after lowering:
```
    //  do concurrent (integer :: j=i:10) local_init(i)
    //  end do
    fir.do_loop %arg0 = %8 to %9 step %c1 unordered {
      %10 = fir.convert %arg0 : (index) -> i32
      fir.store %10 to %1#1 : !fir.ref<i32>
      %11 = fir.alloca i32 {bindc_name = "i", pinned, uniq_name = "_QFEi"}
      %12:2 = hlfir.declare %11 {uniq_name = "_QFEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
      %13 = fir.load %6#0 : !fir.ref<i32>
      hlfir.assign %13 to %12#0 : i32, !fir.ref<i32>
    }
```
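
To make the difference between the two loops a bit more concrete, here is a rough C++ sketch (not part of the PR) of what the `local_init` lowering above does per iteration: a fresh copy of `i` is allocated inside the loop body and initialized from the outer `i`. The function name `run_local_init`, its parameters, and the `i_local += j` statement are hypothetical and only there for illustration, since the original loop body is empty.

```cpp
#include <vector>

// Hypothetical helper that mimics the lowered loop with local_init(i).
// Each iteration gets its own copy of `i`, initialized from the outer value.
std::vector<int> run_local_init(const int outer_i, int lower, int upper) {
  std::vector<int> seen;
  for (int j = lower; j <= upper; ++j) {
    // Corresponds to fir.alloca + fir.load + hlfir.assign in the loop body.
    int i_local = outer_i;
    // Illustrative use only: writes touch the per-iteration copy,
    // so no iteration observes another iteration's update.
    i_local += j;
    seen.push_back(i_local);
  }
  return seen;
}
```

Because every iteration starts from the same outer value, the iterations stay independent of each other, and that independence is the property a mapping to OMP would need to preserve once the locality handling is moved out of lowering.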

##### So, one possibility I think would be:
- Extract the [omp.map_info op](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td#L1147) to some location shared between the `fir` and `omp` dialects.
- Use `map_info` to model locality constraints for `do concurrent` as well.

Do you see any blockers for such an approach?

>     3. Ensure this is only enabled with driver flags.

Will do! For now, I suggest this just be a standalone pass until we flesh it out a bit more. Please let me know if you disagree.
____

In addition to what you mentioned, I think point number 4 would be how to model reductions. That's for the future, though, after the above points are addressed.

___

With that in mind, do you mind if we review and merge the current PR as it is and defer further development to later PRs? I think doing all this in one PR won't be ideal.

https://github.com/llvm/llvm-project/pull/77285

