[PATCH] D95976: [OpenMP] Simplify offloading parallel call codegen
Johannes Doerfert via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Apr 16 15:41:09 PDT 2021
jdoerfert accepted this revision.
jdoerfert added a comment.
With the nit to add the two reproducers, LGTM. (Please make sure to run FAROS or some of the benchmarks we have before committing.)
================
Comment at: openmp/libomptarget/deviceRTLs/common/src/parallel.cu:294
+ // TODO: Add UNLIKELY to optimize?
+ if (!if_expr) {
+ __kmpc_serialized_parallel(ident, global_tid);
----------------
ggeorgakoudis wrote:
> jdoerfert wrote:
> > This should allow us to remove the `SeqGen` in the Clang CodeGen *and* fix PR49777 *and* fix PR49779, a win-win-win situation.
> Please check
Check? Can we add the two reproducers as tests, please? One should be a clang test, the other maybe a runtime test, though a clang test might suffice.
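For readers following the thread, the idea behind the quoted hunk (checking `if_expr` inside the runtime entry point instead of having Clang emit a separate sequential path) can be sketched roughly as below. This is a host-side model under stated assumptions, not the deviceRTL code: `Microtask`, `run_serialized`, `run_parallel`, and `parallel_entry` are hypothetical names, and the "parallel" launch is just a loop over logical thread ids.

```cpp
#include <cassert>
#include <functional>

// Hypothetical host-side model of the dispatch; none of these names
// come from the patch or from the OpenMP device runtime.
using Microtask = std::function<void(int thread_id)>;

// Serialized path: only the encountering thread runs the region body.
inline int run_serialized(const Microtask &body) {
  body(0);
  return 1; // number of logical threads that executed the body
}

// Stand-in for a real parallel launch: calls the body once per logical
// thread id, sequentially, so the sketch stays runnable on a host.
inline int run_parallel(const Microtask &body, int num_threads) {
  assert(num_threads >= 1 && "a parallel region needs at least one thread");
  for (int tid = 0; tid < num_threads; ++tid)
    body(tid);
  return num_threads;
}

// Single entry point: the if-clause value is tested here, so the caller
// (the compiler-generated code, in the real setting) needs only one call.
inline int parallel_entry(bool if_expr, int num_threads,
                          const Microtask &body) {
  if (!if_expr)
    return run_serialized(body);
  return run_parallel(body, num_threads);
}
```

With the check in one place, a caller passes the evaluated if-clause and the outlined body, and the entry point decides whether the region is serialized.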
================
Comment at: openmp/libomptarget/utils/generate_microtask_cases.py:31
+if __name__ == "__main__":
+ main()
----------------
Great. The output is not pretty but that was not the objective ;)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D95976/new/
https://reviews.llvm.org/D95976