[llvm] [mlir] [Offload] Add oneInterationPerThread param to loop device RTL (PR #151959)

Thu Aug 7 04:48:27 PDT 2025

https://github.com/skatrak commented:

Thank you, Dominik, for this work. This generally looks fine to me; I just have some small comments.

According to your RFC, the no-loop mode is mainly intended for `target teams distribute parallel do` kernels with no `reduction` clause. However, in this patch you're updating the DeviceRTL functions for standalone `distribute` and `do` loops, in addition to the expected composite `distribute parallel do`. Do you expect no-loop mode to provide benefits for `target teams distribute` and `target parallel do` as well? Do you know whether this implementation would work properly for split nested `target teams distribute + parallel do` loops (i.e., when calls to both `__kmpc_distribute_static_loop*` and `__kmpc_for_static_loop*` are present)?
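
To make the distinction behind these questions concrete, here is a rough host-side model of how iterations could map to threads with and without a one-iteration-per-thread mode. It is only a sketch: `assignIterations` and its parameters are hypothetical names, the thread-strided schedule in the looping branch is an assumption, and none of this is taken from the DeviceRTL sources.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side model, NOT the DeviceRTL implementation.
// Returns, for each of `numThreads` threads, the iterations it would execute.
std::vector<std::vector<int64_t>> assignIterations(int64_t tripCount,
                                                   int64_t numThreads,
                                                   bool oneIterationPerThread) {
  assert(numThreads > 0 && tripCount >= 0);
  std::vector<std::vector<int64_t>> perThread(numThreads);
  for (int64_t tid = 0; tid < numThreads; ++tid) {
    if (oneIterationPerThread) {
      // No-loop mode (assumed semantics): the thread ID is the iteration
      // number, so the launch must provide at least `tripCount` threads.
      if (tid < tripCount)
        perThread[tid].push_back(tid);
    } else {
      // Looping mode (assumed thread-strided schedule): each thread keeps
      // stepping by the thread count until the trip count is exhausted.
      for (int64_t iv = tid; iv < tripCount; iv += numThreads)
        perThread[tid].push_back(iv);
    }
  }
  return perThread;
}
```

In this model the two modes agree only when the number of threads is at least the trip count. With fewer threads, the one-iteration-per-thread branch silently skips the remaining iterations, which is why it matters how the thread count is determined when the `distribute` and `parallel do` levels are lowered as separate calls.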

https://github.com/llvm/llvm-project/pull/151959