[all-commits] [llvm/llvm-project] b4f844: [Libomptarget] Allow the device runtime to be comp...

Fri May 13 11:39:28 PDT 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: b4f8443d97baf390e3a1e64021e39790c410af9d
      https://github.com/llvm/llvm-project/commit/b4f8443d97baf390e3a1e64021e39790c410af9d
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2022-05-13 (Fri, 13 May 2022)

  Changed paths:
    M openmp/libomptarget/DeviceRTL/include/Mapping.h
    M openmp/libomptarget/DeviceRTL/include/State.h
    M openmp/libomptarget/DeviceRTL/src/Configuration.cpp
    M openmp/libomptarget/DeviceRTL/src/Debug.cpp
    M openmp/libomptarget/DeviceRTL/src/Kernel.cpp
    M openmp/libomptarget/DeviceRTL/src/Mapping.cpp
    M openmp/libomptarget/DeviceRTL/src/Misc.cpp
    M openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
    M openmp/libomptarget/DeviceRTL/src/Reduction.cpp
    M openmp/libomptarget/DeviceRTL/src/State.cpp
    M openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
    M openmp/libomptarget/DeviceRTL/src/Tasking.cpp
    M openmp/libomptarget/DeviceRTL/src/Utils.cpp
    M openmp/libomptarget/DeviceRTL/src/Workshare.cpp

  Log Message:
  -----------
  [Libomptarget] Allow the device runtime to be compiled for the host

Currently the OpenMP offloading device runtime is only expected to be
compiled for the specific architecture it's targeting. This is
problematic if we want to make compiling the device runtime more general
via the standard `clang` driver rather than invoking the clang front-end
directly. This patch addresses this primarily by changing the declare
target device type to `nohost` so the host will not contain any of this
code. Additionally, we forward declare the functions that are defined via
variants; otherwise these would cause problems on the host.
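
As a rough illustration of the pattern described above, the following
sketch wraps device-only code in a `nohost` declare target region and
forward declares a function whose only definition lives inside a variant.
All names here (`hostDefaultTeamSize`, `sketch::getWarpSize`,
`sketch::getNumWarps`) are hypothetical and are not taken from the
DeviceRTL sources; only one variant is shown so the file also compiles
when the OpenMP pragmas are ignored.

```cpp
#include <cstdint>

// Ordinary host code (hypothetical name): emitted for the host as usual.
uint32_t hostDefaultTeamSize() { return 128; }

// With OpenMP offloading enabled, code in this region is meant to be
// emitted only for the device, never for the host.
#pragma omp begin declare target device_type(nohost)
namespace sketch {

// Forward declaration: the only definition below sits inside a variant,
// so without this line the host pass would see no declaration at all.
uint32_t getWarpSize();

#pragma omp begin declare variant match(device = {arch(nvptx, nvptx64)})
uint32_t getWarpSize() { return 32; }
#pragma omp end declare variant

// Generic code that relies on the forward declaration above.
uint32_t getNumWarps(uint32_t NumThreads) {
  return (NumThreads + getWarpSize() - 1) / getWarpSize();
}

} // namespace sketch
#pragma omp end declare target
```

The forward declaration is what keeps `getNumWarps` well-formed when no
variant matches the architecture being compiled.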

Reviewed By: jdoerfert, tianshilei1992

Differential Revision: https://reviews.llvm.org/D125260

  Commit: ce0caf41bdd44366b9913a8afb3dd79d184687c6
      https://github.com/llvm/llvm-project/commit/ce0caf41bdd44366b9913a8afb3dd79d184687c6
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2022-05-13 (Fri, 13 May 2022)

  Changed paths:
    M openmp/libomptarget/DeviceRTL/src/Debug.cpp
    M openmp/libomptarget/DeviceRTL/src/Mapping.cpp
    M openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
    M openmp/libomptarget/DeviceRTL/src/Reduction.cpp
    M openmp/libomptarget/DeviceRTL/src/State.cpp
    M openmp/libomptarget/DeviceRTL/src/Workshare.cpp

  Log Message:
  -----------
  [Libomptarget] Address existing warnings in the device runtime library

This patch attempts to address the current warnings in the OpenMP
offloading device runtime. Previously we did not see these because we
compiled the runtime without the standard warning flags enabled.
However, these warnings are enabled now that we build the static library
version of this runtime. This became extremely noisy when coupled with
the fact that we compile each file roughly 32 times when all the
architectures are considered. So it would be ideal to not have all these
warnings show up when building.

Most of these warnings were simply implicit switch-case fallthroughs,
which can be addressed using C++17's fallthrough attribute. Additionally
there was a volatile qualifier that was being cast away. It is most
likely safe to remove because we cast it away before it's even used, and
removing it didn't seem to affect anything in testing.
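
For readers unfamiliar with the attribute, here is a minimal sketch of
how `[[fallthrough]]` marks an intentional fallthrough so that
`-Wimplicit-fallthrough` stays quiet. The function `sumFirst` is a
hypothetical example and does not come from the device runtime.

```cpp
// Hypothetical helper: sums the first Count elements of Vals for Count in
// 1..3 and returns 0 for any other Count.
int sumFirst(const int *Vals, int Count) {
  int Sum = 0;
  switch (Count) {
  case 3:
    Sum += Vals[2];
    [[fallthrough]]; // intentional: continue into the next case
  case 2:
    Sum += Vals[1];
    [[fallthrough]]; // intentional: continue into the next case
  case 1:
    Sum += Vals[0];
    break;
  default:
    break;
  }
  return Sum;
}
```

The attribute is a statement of intent only; it does not change the
generated code.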

Depends on D125260

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125339

  Commit: 002a63f937d91c0aad192f2d4997317fb277b32a
      https://github.com/llvm/llvm-project/commit/002a63f937d91c0aad192f2d4997317fb277b32a
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2022-05-13 (Fri, 13 May 2022)

  Changed paths:
    M clang/lib/Basic/Targets/NVPTX.cpp
    M clang/test/OpenMP/driver-openmp-target.c

  Log Message:
  -----------
  [OpenMP] Add `__CUDA_ARCH__` definition when offloading with OpenMP

Currently we define the `__CUDA_ARCH__` macro only in CUDA mode. This
patch allows us to use this macro in OpenMP-offloading mode when
targeting NVPTX.
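
A minimal sketch of how code might query the macro, assuming it carries
the usual CUDA encoding (for example 700 for sm_70). The function name
`sketchCudaArch` is hypothetical; on the host, or on any target where the
macro is not defined, it simply returns 0.

```cpp
// Hypothetical helper: reports the value of __CUDA_ARCH__ when the
// compiler defines it, and 0 otherwise (for example in a host compile).
constexpr int sketchCudaArch() {
#if defined(__CUDA_ARCH__)
  return __CUDA_ARCH__;
#else
  return 0;
#endif
}
```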

Reviewed By: tra, tianshilei1992

Differential Revision: https://reviews.llvm.org/D125256

  Commit: 5189f634a113b06fc2f2e8c6c021c0083f59bfb8
      https://github.com/llvm/llvm-project/commit/5189f634a113b06fc2f2e8c6c021c0083f59bfb8
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2022-05-13 (Fri, 13 May 2022)

  Changed paths:
    M clang/lib/Driver/ToolChains/Clang.cpp

  Log Message:
  -----------
  [OpenMP] Don't include the device wrappers if -nostdinc is used

OpenMP uses several wrapper headers to provide the definitions of
needed symbols contained in the host. However, some users may use the
`-nostdinc` option to override these definitions themselves. The OpenMP
wrapper headers are stored in the same location as the clang install. If
the user passes `-nostdinc` then this include directory is never looked
at by default which means that including these wrappers will always
fail. These headers should instead be included manually if they are
needed with a `-nostdinc` build.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D125265

  Commit: af757f89806e03229837425b77839498db470ef8
      https://github.com/llvm/llvm-project/commit/af757f89806e03229837425b77839498db470ef8
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2022-05-13 (Fri, 13 May 2022)

  Changed paths:
    M clang/include/clang/Basic/LangOptions.def
    M clang/include/clang/Driver/Options.td
    M clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
    M clang/lib/Driver/ToolChains/Clang.cpp
    M clang/test/OpenMP/target_globals_codegen.cpp

  Log Message:
  -----------
  [OpenMP] Don't set device runtime debugging flags if using '-nogpulib'

We use globals to configure debugging at compile-time for the device
runtime. Because these are only used by the OpenMP runtime we shouldn't
define them if we aren't using the device runtime. When a user passes in
'-nogpulib' this indicates that we are not using the device runtime, so
we should check for the presence of this flag and not emit these globals
if used.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125314

  Commit: 9ffa945c401ccd248e1e35fbbccb1860b253b290
      https://github.com/llvm/llvm-project/commit/9ffa945c401ccd248e1e35fbbccb1860b253b290
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2022-05-13 (Fri, 13 May 2022)

  Changed paths:
    M openmp/libomptarget/CMakeLists.txt
    M openmp/libomptarget/plugins/CMakeLists.txt
    M openmp/libomptarget/plugins/amdgpu/CMakeLists.txt
    M openmp/libomptarget/plugins/common/elf_common/CMakeLists.txt
    M openmp/libomptarget/plugins/cuda/CMakeLists.txt
    M openmp/libomptarget/plugins/ve/CMakeLists.txt
    M openmp/libomptarget/src/CMakeLists.txt
    M openmp/libomptarget/tools/deviceinfo/CMakeLists.txt

  Log Message:
  -----------
  [Libomptarget] Remove global include directory from libomptarget

We used to globally include the libomptarget include directory for all
projects. This caused some conflicts with the other files named
"Debug.h". This patch changes the CMake configuration to add this
directory through per-target include directories instead.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D125563

  Commit: 16b7a0b43b386a0cfde65060394d5296345ce9bb
      https://github.com/llvm/llvm-project/commit/16b7a0b43b386a0cfde65060394d5296345ce9bb
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2022-05-13 (Fri, 13 May 2022)

  Changed paths:
    M openmp/libomptarget/DeviceRTL/CMakeLists.txt
    A openmp/libomptarget/DeviceRTL/src/CMakeLists.txt

  Log Message:
  -----------
  [Libomptarget] Build the device runtime as a static library

This patch adds the necessary CMake configuration to build a static
library version of the device runtime, `libomptarget.devicertl.a`.
Various improvements in how we handle static libraries and generating
offloading code should allow us to treat the device library as a regular
project without needing to invoke the clang front-end directly. Here we
generate a job for each offloading architecture supported. Each
offloading architecture will be embedded into the static library and
used as-needed by the host.

This library will primarily be used to replace the bitcode library when
performing LTO. Currently, we need to manually pass in the bitcode
library which requires foreknowledge of the offloading architecture.
This approach lets us handle that in the linker wrapper instead.
Furthermore this should improve our interface to the device runtime. We
can now build it fully under a release build and have all the expected
entry points, as well as supporting debug builds.

Depends on D125265 D125256 D125260 D125314 D125563

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D125315

  Commit: 4638ae3a8575d988df856116102c1ccd15583c00
      https://github.com/llvm/llvm-project/commit/4638ae3a8575d988df856116102c1ccd15583c00
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2022-05-13 (Fri, 13 May 2022)

  Changed paths:
    M clang/lib/Driver/ToolChains/Clang.cpp
    M clang/lib/Driver/ToolChains/CommonArgs.cpp
    M clang/test/Driver/openmp-offload-gpu-new.c

  Log Message:
  -----------
  [OpenMP] Use the new OpenMP device static library when doing LTO

The previous patches allowed us to create a static library containing
all the device code. This patch uses that library to perform the device
runtime linking late when performing LTO. This, in addition to
simplifying the libraries, allows us to transparently handle the runtime
library as-needed without needing Clang to manually pass the necessary
library in the linker wrapper job.

Depends on D125315

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125333

Compare: https://github.com/llvm/llvm-project/compare/0a22dfcb11c0...4638ae3a8575