[llvm-branch-commits] [flang] [flang] Introduce custom loop nest generation for loops in workshare construct (PR #101445)
Sergio Afonso via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Mon Aug 26 04:48:33 PDT 2024
skatrak wrote:
> No, you are right, sorry for the back and forth. As you said, since a wsloop can only be nested in an omp.parallel, it is immediately obvious that it binds to the omp.parallel threads, so that makes sense.
>
> My only concern was that at some point some transformation (perhaps in the future, because I don't think anything transforms `wsloop`s currently) could make the assumption that either all or none of the threads of the team an `omp.parallel` launches will execute the parent block of a `wsloop` that binds to that team. For example, an optimization/transformation could add an operation immediately before the wsloop that is supposed to be executed by all threads (or none) in the omp.parallel. That operation would then be erroneously wrapped in an omp.single in LowerWorkshare.
>
> I thought this was a fair assumption for an optimization/transformation to make because if, for example, only one of the threads executes a wsloop, it would not produce the intended result. So the intention was to guard against a potential error like that. Let me know if I am wrong here, since I am sure people here have more experience than me on this.
Thank you for bringing attention to this potential issue, which I hadn't considered; I think it's a productive discussion. I'm not aware of any existing transformations like that either, but it looks like they would break if they ran while the `omp.workshare` operation was still present. However, they would work if they ran after the pass that lowers `omp.workshare` to a set of `omp.single` operations for the code in between `omp.wsloop`s. That way we would not have to introduce a new loop wrapper, and we could also create passes that assume the parent region of an `omp.wsloop` is executed by all threads in the team. I don't think that should be an issue, since in principle it makes sense to me that the `omp.workshare` transformation would run immediately after PFT to MLIR lowering. What do you think about that alternative?
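To make the ordering argument concrete, here is a toy Python model. It is not MLIR and not the actual LowerWorkshare pass: ops are plain strings, and `lower_workshare` and `insert_all_threads_op` are hypothetical helpers invented for this sketch. It only illustrates why a transformation that inserts an all-threads operation before a `wsloop` is safe after the workshare lowering but not before it.

```python
# Toy model only: the body of a workshare region is a flat list of op names.
# Nothing here corresponds to a real MLIR or flang API.

def lower_workshare(body):
    """Wrap each run of non-wsloop ops in a ("single", ops) group.

    wsloop ops are left untouched so that all threads still reach them.
    """
    lowered, run = [], []
    for op in body:
        if op == "wsloop":
            if run:
                lowered.append(("single", tuple(run)))
                run = []
            lowered.append("wsloop")
        else:
            run.append(op)
    if run:
        lowered.append(("single", tuple(run)))
    return lowered


def insert_all_threads_op(ops, new_op="all_threads_op"):
    """Hypothetical transformation: put an op meant for every thread
    immediately before each wsloop."""
    out = []
    for op in ops:
        if op == "wsloop":
            out.append(new_op)
        out.append(op)
    return out


body = ["a", "wsloop", "b"]

# Transformation runs while the workshare region still exists:
# the new op ends up inside a single group, so only one thread runs it.
before_lowering = lower_workshare(insert_all_threads_op(body))

# Transformation runs after lowering:
# the new op stays outside any single group, next to the wsloop.
after_lowering = insert_all_threads_op(lower_workshare(body))

print(before_lowering)
print(after_lowering)
```

In this sketch the first ordering swallows the inserted op into the preceding `single` group, which is the error described above, while the second ordering keeps it at the same level as the `wsloop`.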
https://github.com/llvm/llvm-project/pull/101445