[Openmp-commits] [PATCH] D105697: [libomptarget][nfc] Drop dead code in parallel_51
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Jul 9 09:21:08 PDT 2021
jdoerfert added a comment.
In D105697#2867330 <https://reviews.llvm.org/D105697#2867330>, @JonChesterfield wrote:
> The only code that executes between the increment and following decrement are two calls to barrier_simple_spmd, which do not read the parallel level. There can be no user code executing between the two parts deleted here, and from checking IR before and after this change opt already deletes it anyway. Dropping this dead code makes the source clearer (and compilation fractionally faster) at zero cost.
Could you please watch the webinar (https://www.openmp.org/events/webinar-a-compilers-view-of-the-openmp-api/) or read the tregion paper? Both explain what is happening here. The two barriers activate the workers and then wait for them to finish. Workers can and do read the parallel level between those two barriers.
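To make the ordering concrete, here is a minimal host-side sketch using std::thread. It is not the deviceRTL code: `SimpleBarrier`, `RegionResult` and `simulate_parallel_region` are hypothetical names invented for illustration, and the barrier is only a stand-in for the device barrier. It assumes one main thread that bumps a shared parallel level, releases the workers with a first barrier, and waits for them at a second barrier before decrementing.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier; a stand-in for the device-side barrier.
class SimpleBarrier {
public:
  explicit SimpleBarrier(int count) : threshold_(count) {}

  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    const unsigned gen = generation_;
    if (++waiting_ == threshold_) {
      // Last arrival: reset and release everyone waiting on this generation.
      waiting_ = 0;
      ++generation_;
      cv_.notify_all();
    } else {
      cv_.wait(lock, [&] { return gen != generation_; });
    }
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  int threshold_;
  int waiting_ = 0;
  unsigned generation_ = 0;
};

// Hypothetical result type: what each worker observed, and the final level.
struct RegionResult {
  std::vector<int> levels_seen_by_workers;
  int level_after_region;
};

RegionResult simulate_parallel_region(int num_workers) {
  std::atomic<int> parallel_level{0};
  SimpleBarrier barrier(num_workers + 1); // workers plus the main thread
  std::vector<int> seen(num_workers, -1);
  std::vector<std::thread> workers;

  for (int i = 0; i < num_workers; ++i) {
    workers.emplace_back([&, i] {
      barrier.wait();                  // first barrier: worker is activated
      seen[i] = parallel_level.load(); // worker reads the parallel level
      barrier.wait();                  // second barrier: worker is finished
    });
  }

  parallel_level.fetch_add(1); // increment before activating the workers
  barrier.wait();              // first barrier: activate workers
  barrier.wait();              // second barrier: wait for workers to finish
  parallel_level.fetch_sub(1); // decrement once the workers are done

  for (auto &t : workers)
    t.join();
  return {seen, parallel_level.load()};
}
```

In this sketch every worker observes the incremented level, because its read is ordered between the two barriers. If the increment and decrement were removed, the workers would observe the outer level instead, so in this model the pair is not dead code.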
> I'm interested in reducing overhead in the current runtime because codegen for the simple spmd case looks close enough to cuda that I'm hopeful the gap can be narrowed, which would be a big deal for benchmarks until the new runtime comes online.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D105697/new/
https://reviews.llvm.org/D105697