[Openmp-commits] [PATCH] D76630: [libomptarget][nfc] Use unity build for nvcc to approximate LTO

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Mon Mar 23 10:55:52 PDT 2020


JonChesterfield created this revision.
JonChesterfield added reviewers: ABataev, jdoerfert, grokos, ikitayama, tianshilei1992.
Herald added subscribers: openmp-commits, dexonsmith, inglorion, mgorny.
Herald added a project: OpenMP.

[libomptarget][nfc] Use unity build for nvcc to approximate LTO

Nvcc doesn't support link time optimization. Including all the source
files in a single translation unit is common in game development, where
it is called a unity build. The advantage is that moving functions from
a header into a source file no longer carries a performance penalty,
because their definitions stay visible to the compiler at every call site.
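
As a rough illustration (not part of this patch), the effect of a unity
build can be sketched in a single C++ file that mimics what the
preprocessor produces. The file names add.cpp and sum.cpp and the
functions add() and sum_to() are hypothetical and exist only for this
example.

```cpp
// Single-file sketch of the translation unit a unity build produces.
// The file names add.cpp and sum.cpp are hypothetical.
#include <cassert>

// ---- would come from: #include "add.cpp" ----
// Defined in a source file rather than in a header.
int add(int a, int b) { return a + b; }

// ---- would come from: #include "sum.cpp" ----
// In a separate build this file sees only the declaration of add(), so the
// call below cannot be inlined without LTO. In the unity translation unit
// the definition above is visible, which makes inlining possible.
int add(int a, int b);

int sum_to(int n) {
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total = add(total, i);
  return total;
}

// Small self-check: the sum of 1..10 is 55.
void self_check() { assert(sum_to(10) == 55); }
```

Whether the compiler actually inlines add() is still its own decision;
the unity build only makes the definition available.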

A secondary advantage is that we'll be able to rename the C++ files to
.cpp without confusing nvcc, provided unity.cu retains the current suffix.

This is NFC provided there are no bugs in nvcc. As there may be, I'm
hoping a reviewer will run a larger out-of-tree test suite against this
patch. The in-tree tests pass.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D76630

Files:
  openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
  openmp/libomptarget/deviceRTLs/nvptx/unity.cu


Index: openmp/libomptarget/deviceRTLs/nvptx/unity.cu
===================================================================
--- /dev/null
+++ openmp/libomptarget/deviceRTLs/nvptx/unity.cu
@@ -0,0 +1,27 @@
+//===------ unity.cu - Unity build of NVPTX deviceRTL ------------ CUDA -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Support compilers, specifically NVCC, which have not implemented link time
+// optimisation. This removes the runtime cost of moving inline functions into
+// source files in exchange for negligible build overhead.
+//
+//===----------------------------------------------------------------------===//
+
+#include "common/src/cancel.cu"
+#include "common/src/critical.cu"
+#include "common/src/data_sharing.cu"
+#include "common/src/libcall.cu"
+#include "common/src/loop.cu"
+#include "common/src/omp_data.cu"
+#include "common/src/omptarget.cu"
+#include "common/src/parallel.cu"
+#include "common/src/reduction.cu"
+#include "common/src/support.cu"
+#include "common/src/sync.cu"
+#include "common/src/task.cu"
+#include "src/target_impl.cu"
Index: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
===================================================================
--- openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
+++ openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
@@ -50,6 +50,7 @@
   # propagating host flags.
   set(CUDA_PROPAGATE_HOST_FLAGS OFF)
 
+  # Note: These files are also listed in unity.cu
   set(cuda_src_files
       ${devicertl_common_directory}/src/cancel.cu
       ${devicertl_common_directory}/src/critical.cu
@@ -95,7 +96,7 @@
   set(CUDA_SEPARABLE_COMPILATION ON)
   list(APPEND CUDA_NVCC_FLAGS -I${devicertl_base_directory}
                               -I${devicertl_nvptx_directory}/src)
-  cuda_add_library(omptarget-nvptx STATIC ${cuda_src_files} ${omp_data_objects}
+  cuda_add_library(omptarget-nvptx STATIC unity.cu
       OPTIONS ${CUDA_ARCH} ${CUDA_DEBUG})
 
   # Install device RTL under the lib destination folder.

