[Mlir-commits] [mlir] [MLIR][OpenMP] Add omp.simd operation (PR #79843)

Sergio Afonso llvmlistbot at llvm.org
Fri Feb 2 06:31:18 PST 2024


skatrak wrote:

> Wow! That's impressive, thanks for compiling this. I have two comments:
> 
>     * I'd recommend to split combined constructs and composite constructs into distinct tables.
> 
>     * OpenMP 6.0 will greatly increase the number of these constructs, so a general solution will be desired in the long-run.

Thanks for giving it a look. With regard to the first point, the 5 composite constructs that I was able to identify are in the first 5 rows of the table. The rest are combined constructs, which in some cases contain one of these composite constructs. This can be seen in the second column, where nesting one operation inside another represents 'combination', and 'composition' is represented by a single operation whose name includes multiple leaf constructs. Maybe the table can be split after the first 5 rows, but it seems misleading to me to label e.g. `teams distribute simd` as simply combined, since it represents TEAMS with a single DISTRIBUTE SIMD (composite) nested inside. That's why I put everything together in a single table.
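To make the distinction concrete, here is a minimal Python sketch (not MLIR, and not part of this PR) of how a directive name could be split into nested operations. The `COMPOSITE` list and the `nest` helper are hypothetical names for illustration only; the list assumes the five composite constructs are the ones in the OpenMP 5.2 specification, in Fortran spelling, which may not match the table row for row.

```python
# Hypothetical model, for illustration only.
# Assumption: these are the five composite constructs (Fortran spelling).
# Longer sequences come first so that the longest match wins.
COMPOSITE = [
    ("distribute", "parallel", "do", "simd"),
    ("distribute", "parallel", "do"),
    ("distribute", "simd"),
    ("taskloop", "simd"),
    ("do", "simd"),
]


def nest(directive: str) -> list[str]:
    """Split a directive name into operations, outermost first.

    Each entry is one operation. An entry naming several leaf constructs
    is a composite operation; the order of entries represents nesting,
    i.e. combination.
    """
    leaves = directive.split()
    ops = []
    i = 0
    while i < len(leaves):
        for comp in COMPOSITE:
            if tuple(leaves[i:i + len(comp)]) == comp:
                ops.append(" ".join(comp))
                i += len(comp)
                break
        else:
            # No composite starts here: this leaf is its own operation.
            ops.append(leaves[i])
            i += 1
    return ops
```

With this model, `nest("teams distribute simd")` gives `["teams", "distribute simd"]`: a combined construct whose inner operation is composite, which is why splitting the table in two felt awkward.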

Concerning the increase in the number of these constructs, what are the additions related to? Are they loop transformations, or are there going to be significant additions to parallelism generation/control and work distribution constructs? Combined constructs can already be represented in a scalable way through nesting of ops, so I was thinking that we'd only have to represent composite constructs from the last two categories (parallelism generation/control and work distribution), because loop transformations would be handled independently as well. I don't know whether there are caveats to that, but my thinking was that after introducing `omp.canonical_loop` we could have something like the following:

```mlir
%cli = omp.canonical_loop %i... {
  BODY
}

// loop transformations (e.g. %1 = omp.tile %cli...) go here, before execution, resulting in a single loop nest stored in %loop

// !$omp do simd (composite construct)
omp.wssimdloop %loop <do,simd-clauses>

// !$omp parallel do (combined construct)
omp.parallel <parallel-clauses> {
  omp.wsloop %loop <do-clauses>
}

// !$omp target teams distribute parallel do (combined construct with composite construct inside)
omp.target <target-clauses> {
  omp.teams <teams-clauses> {
    omp.distparwsloop %loop <distribute,parallel,do-clauses>
  }
}
```

In that case, we hopefully shouldn't have to add many new operations: only the set of 5 above and those related to new parallelism generation/control and work distribution constructs. How `omp.canonical_loop` is going to be defined and used is still under discussion, so it may end up looking very different from this, but the idea of using a set of single/composite operations to express how a loop is supposed to run, independently of its transformations, may make some sense.

Alternatively, maybe we could split even these composite constructs into their leaf ops. In the example below, the composite behavior is built in advance by chaining the loop returned by one op as the input of the next, but I'm not sure that's even an improvement:

```mlir
%cli = omp.canonical_loop %i... {
  BODY
}

// loop transformations (e.g. %1 = omp.tile %cli...) go here, before execution, resulting in a single loop nest stored in %loop

// !$omp do simd (composite construct)
%1 = omp.wsloop %loop <do-clauses>
%2 = omp.simdloop %1 <simd-clauses>
omp.execute %2

// !$omp parallel do (combined construct)
%1 = omp.wsloop %loop <do-clauses>
omp.parallel <parallel-clauses> {
  omp.execute %1
}

// !$omp target teams distribute parallel do (combined construct with composite construct inside)
%1 = omp.distribute %loop <distribute-clauses>
%2 = omp.parwsloop %1 <parallel,do-clauses> // Not sure if this could be expressed in a better way
omp.target <target-clauses> {
  omp.teams <teams-clauses> {
    omp.execute %2
  }
}
```

https://github.com/llvm/llvm-project/pull/79843

