[all-commits] [llvm/llvm-project] 95a25e: [OpenMP][FIX] Do not use TBAA in type punning redu...
Johannes Doerfert via All-commits
all-commits at lists.llvm.org
Sun Aug 16 12:40:45 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: 95a25e4c3203f35e9f57f9fac620b4a21bffd6e1
https://github.com/llvm/llvm-project/commit/95a25e4c3203f35e9f57f9fac620b4a21bffd6e1
Author: Johannes Doerfert <johannes at jdoerfert.de>
Date: 2020-08-16 (Sun, 16 Aug 2020)
Changed paths:
M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
A clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
Log Message:
-----------
[OpenMP][FIX] Do not use TBAA in type punning reduction GPU code PR46156
When we implement OpenMP GPU reductions we rely heavily on type punning
during the shuffle and reduce operations. This is not always compatible
with the language rules on aliasing. Until now we emitted TBAA metadata,
which later allowed the optimizer to remove parts of the reduction code
because the accesses and the initialization were "known to not alias".
With this patch we avoid emitting TBAA in this step, hopefully for all
accesses that need it.
Verified on the reproducer of PR46156 and on QMCPACK.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D86037
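The aliasing problem the commit describes can be illustrated with a small, hypothetical C++ sketch (this is not the generated IR or the actual codegen, just the pattern): GPU reductions shuffle values as 32-bit words, so a wider value such as a `double` is reinterpreted as two 32-bit halves. A pointer cast like `*(uint32_t *)&d` violates strict aliasing, and TBAA-style "these types do not alias" metadata lets the optimizer delete such accesses; `memcpy`-based punning, shown here, keeps the reinterpretation well defined.

```cpp
#include <cstdint>
#include <cstring>

// Illustrative only: split a double into two 32-bit lanes (as a warp
// shuffle would exchange them) and reassemble it afterwards. Using
// std::memcpy instead of a pointer cast avoids the strict-aliasing
// trap that TBAA metadata would otherwise exploit.
inline double shuffle_roundtrip(double d) {
  uint32_t halves[2];
  std::memcpy(halves, &d, sizeof(d));     // reinterpret as two 32-bit words
  // ... each half would be exchanged via a warp shuffle here ...
  double out;
  std::memcpy(&out, halves, sizeof(out)); // reassemble after the shuffle
  return out;
}
```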
Commit: aa27cfc1e7d7456325e951a4ba3ced405027f7d0
https://github.com/llvm/llvm-project/commit/aa27cfc1e7d7456325e951a4ba3ced405027f7d0
Author: Johannes Doerfert <johannes at jdoerfert.de>
Date: 2020-08-16 (Sun, 16 Aug 2020)
Changed paths:
M openmp/libomptarget/plugins/cuda/src/rtl.cpp
Log Message:
-----------
[OpenMP][CUDA] Cache the maximal number of threads per block (per kernel)
Instead of calling `cuFuncGetAttribute` with
`CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK` for every kernel invocation,
we can do it for the first one and cache the result as part of the
`KernelInfo` struct. The only functional change is that we now expect
`cuFuncGetAttribute` to succeed and otherwise propagate the error.
Ignoring any error seems like a slippery slope...
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D86038
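The caching pattern described above can be sketched as follows. The names here are illustrative, not the plugin's actual API; `queryMaxThreadsPerBlock` stands in for a `cuFuncGetAttribute(CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK, ...)` call so the sketch is self-contained:

```cpp
// Hypothetical sketch of the caching pattern: query the attribute once,
// store it in the per-kernel info struct, and reuse it on every launch.
struct KernelInfo {
  int MaxThreadsPerBlock = -1; // cached on first use
};

int QueryCallCount = 0; // counts the (expensive) attribute queries

// Stand-in for cuFuncGetAttribute with
// CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK.
int queryMaxThreadsPerBlock() {
  ++QueryCallCount;
  return 1024;
}

int getMaxThreadsPerBlock(KernelInfo &KI) {
  if (KI.MaxThreadsPerBlock < 0)            // first launch: query and cache
    KI.MaxThreadsPerBlock = queryMaxThreadsPerBlock();
  return KI.MaxThreadsPerBlock;             // later launches: cached value
}
```

A real implementation would also check the driver-API return code and propagate a failure instead of caching a bogus value, matching the commit's note that errors are no longer ignored.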
Commit: 5272d29e2cb7c967c3016fa285f14edc7515d9bf
https://github.com/llvm/llvm-project/commit/5272d29e2cb7c967c3016fa285f14edc7515d9bf
Author: Johannes Doerfert <johannes at jdoerfert.de>
Date: 2020-08-16 (Sun, 16 Aug 2020)
Changed paths:
M openmp/libomptarget/plugins/cuda/src/rtl.cpp
Log Message:
-----------
[OpenMP][CUDA] Keep one kernel list per device, not globally.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D86039
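The restructuring in this commit, one kernel list per device instead of a single global list, can be sketched like this (again a hypothetical illustration, not the plugin's actual data structures):

```cpp
#include <list>
#include <string>
#include <vector>

// Illustrative only: each device gets its own kernel list, indexed by
// device id, rather than all devices sharing one global std::list.
struct KernelTy {
  std::string Name;
};

struct DeviceRTL {
  std::vector<std::list<KernelTy>> KernelsList; // one list per device

  void init(int NumDevices) { KernelsList.resize(NumDevices); }
  void addKernel(int DeviceId, KernelTy K) {
    KernelsList[DeviceId].push_back(std::move(K));
  }
  std::size_t numKernels(int DeviceId) const {
    return KernelsList[DeviceId].size();
  }
};
```

Keeping the lists per device avoids one device's kernel registration interfering with another's and removes the need to filter a shared list by device.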
Compare: https://github.com/llvm/llvm-project/compare/5f45f91de419...5272d29e2cb7