[llvm-branch-commits] [flang] [flang] Introduce custom loop nest generation for loops in workshare construct (PR #101445)

Wed Aug 28 20:41:10 PDT 2024

ivanradanov wrote:

> ... However, they would work if they ran after the pass lowering `omp.workshare` to a set of `omp.single` for the code in between `omp.wsloop`s. That way we would not have to introduce a new loop wrapper and also we could create passes assuming the parent of region of an `omp.wsloop` is executed by all threads in the team. I don't think that should be an issue, since in principle it makes sense to me that the `omp.workshare` transformation would run immediately after PFT to MLIR lowering. What do you think about that alternative?

Ideally, the `omp.workshare` lowering will run after the HLFIR to FIR lowering, because missing the high-level optimizations that HLFIR provides can result in very bad performance (unneeded temporary arrays, unnecessary copies, non-fused array computation, etc.). The workshare lowering transforms the `omp.workshare.loop_wrapper`s into `omp.wsloop`s, so they are gone after that.
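
To make the transformation concrete, here is a rough sketch of the IR before and after the workshare lowering. It is schematic only: the value names (`%i`, `%lb`, `%ub`, `%st`) are placeholders, region bodies are elided, and the exact assembly format and terminators may differ from what the PR actually produces.

```mlir
// Schematic sketch, not verified IR. Operands, types, and some
// terminators are abbreviated; %i, %lb, %ub, %st are placeholder names.

// Before the workshare lowering (i.e. after HLFIR to FIR):
omp.parallel {
  omp.workshare {
    // ... code between loops, to be executed by a single thread ...
    omp.workshare.loop_wrapper {
      omp.loop_nest (%i) : index = (%lb) to (%ub) inclusive step (%st) {
        // ... one unit of work per iteration ...
        omp.yield
      }
    }
    omp.terminator
  }
  omp.terminator
}

// After the workshare lowering:
omp.parallel {
  omp.single {
    // ... code between loops, executed by a single thread ...
    omp.terminator
  }
  omp.wsloop {
    omp.loop_nest (%i) : index = (%lb) to (%ub) inclusive step (%st) {
      // ... one unit of work per iteration ...
      omp.yield
    }
  }
  omp.terminator
}
```

The point of the wrapper is that, in the "before" form, nothing inside `omp.workshare` claims to be an `omp.wsloop`, so passes can keep assuming that the parent region of an `omp.wsloop` is executed by all threads in the team.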

Another factor is that there may not be PFT->loop lowerings for many constructs that need to be divided into units of work, so we may need to first generate HLFIR and alter the lowerings from HLFIR to FIR to get the `omp.wsloop` (or `omp.workshare.loop_wrapper`). This means that there will be portions of the pipeline (from PFT->HLFIR until HLFIR->FIR) where an `omp.wsloop` nested in an `omp.workshare` will be the wrong representation.

Are there any concerns with adding `omp.workshare.loop_wrapper`? I do not see that big of an overhead (maintenance or compile time) resulting from its addition, while it makes things clearer and more robust in my opinion.

https://github.com/llvm/llvm-project/pull/101445