[Mlir-commits] [mlir] [OpenMP][MLIR] Add omp.distribute op to the OMP dialect (PR #67720)

Fri Nov 17 07:38:36 PST 2023

jsjodin wrote:

> The `omp.new_cli` operation is strictly not necessary. They could just be block arguments. We could go ahead with the nesting approach (with block arguments) and @jsjodin 's approach of providing CLI as operands of canonical loops and loop transformation ops. An example is given below
> 
> ```
> omp.distribute loops(%tloop)
> bb0 (%tloop : !omp.cli):   
>   omp.tile loops(%outer, %inner), construct(%tloop:!omp.cli) {
>   bb0 (%outer, %inner : !omp.cli, !omp.cli):
>     omp.canonical_loop %iv1 : i32 in [0, %tripcount), construct(%outer : !omp.cli){
>       omp.canonical_loop %iv2 : i32 in [0, %tc), construct(%inner : !omp.cli) {
>         %a = load %arrA[%iv1, %iv2] : memref<?x?xf32>
>         store %a, %arrB[%iv1, %iv2] : memref<?x?xf32>
>       }
>     }
>   }
> ```

I'm not quite sure I understand the example. Are the `construct(%cli, ...)` operands the "outputs" and the `loops(%cli, ...)` operands the loops being operated on, or do I have it backwards? Does the modified example below make sense?

```
omp.distribute loops(%tloop1) {
bb0 (%tloop1 : !omp.cli, %tloop2 : !omp.cli): // Is it legal to have %tloop2 without it being in omp.distribute? We only care about one of the loops.
  omp.tile loops(%loop), construct(%tloop1:!omp.cli, %tloop2:!omp.cli) { // Input 1 loop, output 2 loops
  bb0 (%loop : !omp.cli):
    omp.canonical_loop %iv : i32 in [0, %tc), construct(%inner : !omp.cli) {
      %a = load %arrA[%iv] : memref<?x?xf32>
      store %a, %arrB[%iv] : memref<?x?xf32>
    }
  }
}
```

@Meinersbur had objections to the initial nested proposal because there were no values to refer to specific loops after transformations were done, but that seems to be fixed here.

https://github.com/llvm/llvm-project/pull/67720