[Openmp-commits] [PATCH] D64217: [OpenMP][NFCI] Cleanup the target state queue implementation
Alexey Bataev via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Thu Jul 11 09:37:42 PDT 2019
ABataev added a comment.
In D64217#1580895 <https://reviews.llvm.org/D64217#1580895>, @Hahnfeld wrote:
> Ideally, we can try to get rid of the target state queue by reducing the amount of information and using only shared memory. Without doubt, this would improve performance as Alexey mentioned and lighten the global memory usage (hundreds of MB to GB). Based on my experiments last year, I think that's doable iff we don't support nested parallelism (see thread on openmp-dev) and possibly "tweak" the spec such that we don't need to track some ICVs once the execution reaches an active parallel region (e.g., we don't need to care about //nthreads-var// if all nested regions are serialized, but currently the user might query the set value via `omp_get_max_threads`).
I would try to use lazy initialization for the cases where we need to track some ICVs. I have not tried it yet, but I have been thinking about it for some time.
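
For illustration only, here is a minimal host-side sketch of what lazy initialization of ICV state could look like. It is not the libomptarget device runtime code: the names `ICVState`, `TeamICVs`, `getMaxThreads` and `setNumThreads` are hypothetical, and only //nthreads-var// is modeled. The assumption is that all threads read one shared default record until a thread actually modifies an ICV, and only then is a private record allocated for it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical ICV record; only nthreads-var is modeled in this sketch.
struct ICVState {
  int NThreadsVar;
};

// Hypothetical per-team container: one shared default record plus
// per-thread records that are allocated only on first modification.
class TeamICVs {
public:
  TeamICVs(std::size_t NumThreads, int DefaultNThreads)
      : Default{DefaultNThreads}, PerThread(NumThreads) {}

  // Read path (what omp_get_max_threads would consult): never allocates,
  // falls back to the shared default if the thread has no private record.
  int getMaxThreads(std::size_t Tid) const {
    assert(Tid < PerThread.size() && "thread id out of range");
    const std::unique_ptr<ICVState> &S = PerThread[Tid];
    return S ? S->NThreadsVar : Default.NThreadsVar;
  }

  // Write path (what omp_set_num_threads would trigger): materializes the
  // private record on the first modification by copying the default.
  void setNumThreads(std::size_t Tid, int N) {
    assert(Tid < PerThread.size() && "thread id out of range");
    std::unique_ptr<ICVState> &S = PerThread[Tid];
    if (!S)
      S = std::make_unique<ICVState>(Default);
    S->NThreadsVar = N;
  }

  // Number of threads that actually paid for a private record.
  std::size_t numMaterialized() const {
    std::size_t Count = 0;
    for (const std::unique_ptr<ICVState> &S : PerThread)
      if (S)
        ++Count;
    return Count;
  }

private:
  ICVState Default;
  std::vector<std::unique_ptr<ICVState>> PerThread;
};
```

With this layout, programs that never change an ICV pay only for the shared default record. On the device, the private records would presumably have to come from a preallocated pool or shared memory rather than from the heap, which this sketch does not attempt to model.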
Repository:
  rG LLVM Github Monorepo