[all-commits] [llvm/llvm-project] a42361: [OpenMP] Expose the state in the header to allow n...

Johannes Doerfert via All-commits all-commits at lists.llvm.org
Thu Jul 21 10:41:16 PDT 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: a42361dc1c26acae656243232e81a236ba333a8c
      https://github.com/llvm/llvm-project/commit/a42361dc1c26acae656243232e81a236ba333a8c
  Author: Johannes Doerfert <johannes at jdoerfert.de>
  Date:   2022-07-21 (Thu, 21 Jul 2022)

  Changed paths:
    M openmp/libomptarget/DeviceRTL/include/State.h
    M openmp/libomptarget/DeviceRTL/src/State.cpp

  Log Message:
  -----------
  [OpenMP] Expose the state in the header to allow non-lto optimizations

We used to inline the `lookup` calls, so the runtime had "known" access
offsets when it was shipped. With the new static library build it no
longer does, because the lookup is an indirection the optimizer cannot
look through. Exposing the state in the header should help us optimize
the code better until we can do LTO for the runtime again.

Differential Revision: https://reviews.llvm.org/D130111
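
The difference between an opaque lookup and a header-visible one can be
sketched on the host as follows. This is only an illustration: the
names `TeamStateTy`, `lookupOpaque`, and `lookupInline` and the state
layout are assumptions and do not mirror the real `State.h`.

```cpp
#include <cassert>
#include <cstdint>

namespace state_sketch {

// Hypothetical, much simplified state; the real layout lives in State.h.
enum ValueKind { VK_NThreads, VK_Level };

struct TeamStateTy {
  uint32_t NThreads = 1;
  uint32_t Level = 0;
};

// Visible in the "header": every caller sees the object and its layout.
static TeamStateTy TeamState;

// Style A: only this declaration would be in the header and the body in
// a separate .cpp file. Without LTO the caller sees just a call.
uint32_t &lookupOpaque(ValueKind Kind);

// Style B: the body is in the header, so the compiler can inline it and
// fold the switch into a fixed offset when Kind is a constant.
inline uint32_t &lookupInline(ValueKind Kind) {
  switch (Kind) {
  case VK_NThreads:
    return TeamState.NThreads;
  case VK_Level:
    return TeamState.Level;
  }
  return TeamState.NThreads; // not reached for valid kinds
}

// Defined here only to keep the sketch in one file; imagine it lived in
// another translation unit.
uint32_t &lookupOpaque(ValueKind Kind) { return lookupInline(Kind); }

} // namespace state_sketch
```

Both styles return the same storage; what changes is how much the
compiler can see at the call site.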


  Commit: 7472b42b788e57b7b1ea255aa8670c96cc0aacd8
      https://github.com/llvm/llvm-project/commit/7472b42b788e57b7b1ea255aa8670c96cc0aacd8
  Author: Johannes Doerfert <johannes at jdoerfert.de>
  Date:   2022-07-21 (Thu, 21 Jul 2022)

  Changed paths:
    M openmp/libomptarget/DeviceRTL/include/State.h
    M openmp/libomptarget/DeviceRTL/include/Utils.h

  Log Message:
  -----------
  [OpenMP] Use Undef instead of null as pointer for inactive lanes

Our conditional writes in the runtime look like this:
```
  if (active)
    *ptr = value;
```
In the RAII helper we need to assign `ptr`, which comes from a lookup
call. If a thread that is not the main thread calls lookup with the
intention of writing through the pointer, we'll create a new thread
state. As such, we need to avoid calling lookup for inactive threads.
We used to use `nullptr` as their `ptr` value, but that can cause
pessimistic reasoning. We now use `undef` instead.

Differential Revision: https://reviews.llvm.org/D130114
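
The guarded-lookup pattern described above can be sketched on the host
like this. All names (`lookupForWrite`, `inactivePtr`, `ValueRAII` as
written here) are hypothetical. Standard C++ has no portable way to
spell an `undef` pointer, so the placeholder below returns `nullptr`;
it marks the spot where the commit substitutes an `undef` value.

```cpp
#include <cassert>
#include <cstdint>

namespace raii_sketch {

static uint32_t Storage = 0; // stand-in for one state slot
static int LookupCalls = 0;  // counts how often the lookup ran

// Hypothetical lookup for writing. In the runtime such a call may create
// a thread state; here it only counts invocations.
inline uint32_t *lookupForWrite() {
  ++LookupCalls;
  return &Storage;
}

// Placeholder pointer for inactive lanes. It is never dereferenced, so
// its value does not matter (assumption: nullptr stands in for undef).
inline uint32_t *inactivePtr() { return nullptr; }

struct ValueRAII {
  ValueRAII(uint32_t NewVal, bool Active)
      : Ptr(Active ? lookupForWrite() : inactivePtr()), Old(0),
        IsActive(Active) {
    if (IsActive) { // the conditional write from the commit message
      Old = *Ptr;
      *Ptr = NewVal;
    }
  }
  ~ValueRAII() {
    if (IsActive)
      *Ptr = Old; // restore the previous value on scope exit
  }
  uint32_t *Ptr;
  uint32_t Old;
  bool IsActive;
};

} // namespace raii_sketch
```

An inactive `ValueRAII` never triggers the lookup and never touches
`Ptr`, which is why any pointer value is acceptable for it.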


  Commit: d150152615074190d20492512da439cd5820b04a
      https://github.com/llvm/llvm-project/commit/d150152615074190d20492512da439cd5820b04a
  Author: Johannes Doerfert <johannes at jdoerfert.de>
  Date:   2022-07-21 (Thu, 21 Jul 2022)

  Changed paths:
    M openmp/libomptarget/DeviceRTL/include/State.h
    M openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
    M openmp/libomptarget/DeviceRTL/src/State.cpp

  Log Message:
  -----------
  [OpenMP] Introduce more fine-grained control over the thread state use

We can help optimizations by making sure we use the team state whenever
it is clear there is no thread state. To this end we introduce a new
state flag (`state::HasThreadState`) and explicit control for the
`state::ValueRAII` helpers, including a dedicated "assert equal".

Differential Revision: https://reviews.llvm.org/D130113
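
A minimal host-side sketch of the idea, assuming a single flag guards
all thread state accesses. Only the name `HasThreadState` comes from
the commit message; `ICVState`, `lookupLevel`, and `enterThreadState`
are invented for illustration and are not the runtime's API.

```cpp
#include <cassert>
#include <cstdint>

namespace flag_sketch {

struct ICVState {
  uint32_t Level = 0;
};

static ICVState TeamState;
static ICVState *ThreadState = nullptr; // single-thread stand-in
static bool HasThreadState = false;     // stand-in for state::HasThreadState

// While the flag is false, every access provably hits the team state,
// so the thread state pointer is never inspected.
inline uint32_t &lookupLevel() {
  if (!HasThreadState)
    return TeamState.Level;
  return ThreadState ? ThreadState->Level : TeamState.Level;
}

// Hypothetical helper: start using a thread-private copy of the state.
inline void enterThreadState(ICVState &S) {
  S = TeamState;
  ThreadState = &S;
  HasThreadState = true;
}

} // namespace flag_sketch
```

Before `enterThreadState` is called, `lookupLevel()` refers to
`TeamState.Level`; afterwards writes land in the private copy and the
team state stays unchanged.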


  Commit: 48d6f5240187573881f96cc9574ea09592f50723
      https://github.com/llvm/llvm-project/commit/48d6f5240187573881f96cc9574ea09592f50723
  Author: Johannes Doerfert <johannes at jdoerfert.de>
  Date:   2022-07-21 (Thu, 21 Jul 2022)

  Changed paths:
    M clang/lib/Headers/__clang_cuda_intrinsics.h
    A clang/test/CodeGenCUDA/shuffle_long_long.cu

  Log Message:
  -----------
  [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

A copy-paste error made the unsigned long long versions of the shfl
intrinsics recursive, which is UB. Reported and diagnosed by @trws.

Differential Revision: https://reviews.llvm.org/D129536
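
The class of bug can be sketched on the host with plain overloads. The
function `shfl` below is a hypothetical single-lane stand-in, not the
CUDA intrinsic, and the delegation shown is an assumption about the
general shape of such wrappers rather than a copy of the header.

```cpp
#include <cassert>
#include <cstring>

namespace shfl_sketch {

// Host-only stand-in for a 32-bit shuffle: with a single "lane" the
// value read from the source lane is the value itself.
inline int shfl(int Val) { return Val; }

// 64-bit version built from two 32-bit shuffles.
inline long long shfl(long long Val) {
  struct Parts {
    int A, B;
  };
  static_assert(sizeof(Parts) == sizeof(long long), "unexpected layout");
  Parts P;
  std::memcpy(&P, &Val, sizeof(Val));
  P.A = shfl(P.A);
  P.B = shfl(P.B);
  long long Ret;
  std::memcpy(&Ret, &P, sizeof(Ret));
  return Ret;
}

// Unsigned version delegating to the signed one. Casting the argument to
// unsigned long long instead would select this overload again and
// recurse without end.
inline unsigned long long shfl(unsigned long long Val) {
  return static_cast<unsigned long long>(shfl(static_cast<long long>(Val)));
}

} // namespace shfl_sketch
```

The only difference between the working and the recursive form is the
type named in the inner cast, which is why a copy-paste slip is easy to
miss.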


Compare: https://github.com/llvm/llvm-project/compare/e01ce4e88a84...48d6f5240187

