[Mlir-commits] [mlir] [OpenMP][MLIR] Add omp.distribute op to the OMP dialect (PR #67720)

Fri Sep 29 10:26:07 PDT 2023

jsjodin wrote:

> Could you share your plan before we proceed? The proposal of this patch of a distribute operation that (possibly) nests a loop seems to be different from both the existing worksharing-loop design (that includes the loop) and the canonical-loop proposal under design where the distribute will accept a CLI value that represents a loop.

The plan is to have some representation of 'distribute' that does not rely on meta-values (CLI) for now, since there are still a lot of unanswered questions. The version of the omp.distribute op in this patch basically works as a wrapper op that modified how the contained loop(s) execute. It does rely on there being a contained loop or (several if we want to consider collapse).

To summarize the thoughts from below. Separating meta-ops (unroll, tile etc) that affect code generation from operations that execute and affect the semantics of the program seems to make sense. One possible way to separate them would  be for executable operations to use regions and nesting, and compile-time (meta) ops to use CLIs. In this case omp.distribute should use nesting. If it needs to produce a compile-time CLI value it could be added later on.

The concern about transformations interfering with the representation this is not very different from having a canonical loop with CLIs, since both solutions rely on transformations not knowing the semantics of the ops, which should prevent the code from being modified until lowering is done by the OpenMPIRBuilder. If there happen to be issues we can fall back on using the IsolatedFromAbove trait.

There are other aspects to the omp.canonical_loop where in my mind the operation somewhat problematic since it combines meta-values via yield and at the same time is a regular executable op. Maybe this is there because of SSA? There is also the discussion/issue about omp.structured_region and how to handle CLIs in presence of control flow, where it seems that a static view of the program seems to make sense, meaning all canonical loops should be propagated up to any containing canonical loop. Also, the restriction about not allowing operations being inserted between omp loop transformation ops is something that would be hard to enforce unless a container op was introduced that could enclose a sequence of meta-ops. I think something like this was proposed on discourse at some point. It would be cleaner imo if a omp.canonical_loop would create single CLI "handle" and to use container meta-ops to combine these values. As a side note, yield ops seem to generally be used to yield the induction variable to the next iteration in other loop constructs, so we might want to use a different op than omp.yield to propagate values..

https://github.com/llvm/llvm-project/pull/67720