[flang-commits] [flang] [flang][OpenMP] Fix LASTPRIVATE for iteration variables (PR #69773)

Leandro Lupori via flang-commits flang-commits at lists.llvm.org
Mon Oct 23 11:17:31 PDT 2023


luporl wrote:

Thanks for the review @kiranchandramohan!

> Is the update of the original loop variable unconditionally done now? Should it be only if `lastprivate(loop_variable)` is present? Or is the point that the `loop_variable` is always private and hence the value of the original variable is undefined if it is not `lastprivate` and hence it is OK to update always?

The update of the original loop variable is still performed only if `lastprivate(loop_variable)` is present. But the private copy is now being updated, in the last iteration, whenever a `lastprivate` clause is present, even if it doesn't include the loop variable. Do you think it's better to omit the store in this case? I've noticed that this store and the load after it are usually optimized, when llvm bitcode is generated, and only the store of `v` to the original loop variable remains.

> > I've just one point to raise (@kiranchandramohan can comment better). Through this approach, we have introduced a mandatory set of 1 arith.addi, 3 arith.cmpi, and 1 arith.select in every iteration of the loop. Again, not very sure about this, but can this have a bearing on performance, since loops are extensively present across benchmarks we use for validation?
> 
> Please refer to https://discourse.llvm.org/t/rfc-privatisation-in-openmp-dialect/3526/4. The `kmpc_for_static_init` calls provide a value called `%.omp.is_last` that can be used to check if this is the last iteration. And updates can be placed by checking this value on the last iteration in the exiting path of the loop. One way to achieve this is by having a new operation in the dialect (in-place of the fir.if and related comparisons) say, `omp.last_private_loop_update`, that takes in the loop_variable, step-size as arguments. This can be extracted out and put in the exit path of the loop. Other alternatives include just modelling the lastprivate variables as part of the omp.wsloop construct and appropriately lowering them with the correct conditions and updates only during lowering to LLVM IR.

If I understand correctly, this is a suggested optimization for a future patch, correct?
This would be especially useful when support for `lastprivate` with `collapse` is added. Otherwise, we would need to test the loop variables of each nested loop.

https://github.com/llvm/llvm-project/pull/69773


More information about the flang-commits mailing list