[Openmp-commits] [PATCH] D76630: [libomptarget][nfc] Use unity build for nvcc to approximate LTO
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Mon Mar 23 10:55:52 PDT 2020
JonChesterfield created this revision.
JonChesterfield added reviewers: ABataev, jdoerfert, grokos, ikitayama, tianshilei1992.
Herald added subscribers: openmp-commits, dexonsmith, inglorion, mgorny.
Herald added a project: OpenMP.
[libomptarget][nfc] Use unity build for nvcc to approximate LTO
nvcc does not support link-time optimization (LTO). Including every source
file in a single translation unit is common in game development, where it is
called a unity build. The advantage is that the compiler can inline across
what used to be file boundaries, so moving functions from a header into a
source file no longer carries a performance penalty.
A secondary advantage is that we will be able to rename the C++ files to .cpp
without confusing nvcc, provided unity.cu itself keeps the .cu suffix.
This is NFC (no functional change) provided there are no bugs in nvcc. Since
there may be, I'm hoping a reviewer will run a larger out-of-tree test suite
against this patch. The in-tree tests pass.
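To illustrate the first advantage, here is a minimal single-file sketch in
plain C++ of what the compiler sees in a unity build. The file names
(helper.cpp, caller.cpp, unity.cpp) and the functions scale and sum_scaled are
hypothetical and are not taken from the patch; the CUDA qualifiers used by the
real sources are left out.

```cpp
// Single-file sketch of what a unity build hands to the compiler.
// All names are hypothetical; none of this is code from the patch.
#include <cassert>

// ---- contents that would live in helper.cpp ----
// Once this body has moved out of a header, a separately compiled caller
// sees only the declaration and cannot inline the call without LTO.
int scale(int x) { return 3 * x; }

// ---- contents that would live in caller.cpp ----
int scale(int x); // declaration, normally provided by helper.h

int sum_scaled(const int *values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += scale(values[i]); // body is visible here, so inlining is possible
  return total;
}

// ---- unity.cpp itself would contain only these two lines ----
//   #include "helper.cpp"
//   #include "caller.cpp"

// Small self-check of the sketch's behaviour.
void self_check() {
  const int values[3] = {1, 2, 3};
  assert(sum_scaled(values, 3) == 18); // 3 * (1 + 2 + 3)
}
```

Compiling the unity file once stands in for compiling helper.cpp and
caller.cpp separately. One general cost of the technique is that file-local
names (static functions, anonymous namespaces) from different sources end up
in the same translation unit and can collide.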
Repository:
  rG LLVM Github Monorepo
https://reviews.llvm.org/D76630
Files:
  openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
  openmp/libomptarget/deviceRTLs/nvptx/unity.cu
Index: openmp/libomptarget/deviceRTLs/nvptx/unity.cu
===================================================================
--- /dev/null
+++ openmp/libomptarget/deviceRTLs/nvptx/unity.cu
@@ -0,0 +1,27 @@
+//===------ unity.cu - Unity build of NVPTX deviceRTL ------------ CUDA -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Support compilers, specifically NVCC, which have not implemented link time
+// optimisation. This removes the runtime cost of moving inline functions into
+// source files in exchange for negligible build overhead.
+//
+//===----------------------------------------------------------------------===//
+
+#include "common/src/cancel.cu"
+#include "common/src/critical.cu"
+#include "common/src/data_sharing.cu"
+#include "common/src/libcall.cu"
+#include "common/src/loop.cu"
+#include "common/src/omp_data.cu"
+#include "common/src/omptarget.cu"
+#include "common/src/parallel.cu"
+#include "common/src/reduction.cu"
+#include "common/src/support.cu"
+#include "common/src/sync.cu"
+#include "common/src/task.cu"
+#include "src/target_impl.cu"
Index: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
===================================================================
--- openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
+++ openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt
@@ -50,6 +50,7 @@
# propagating host flags.
set(CUDA_PROPAGATE_HOST_FLAGS OFF)
+ # Note: These files are also listed in unity.cu
set(cuda_src_files
${devicertl_common_directory}/src/cancel.cu
${devicertl_common_directory}/src/critical.cu
@@ -95,7 +96,7 @@
set(CUDA_SEPARABLE_COMPILATION ON)
list(APPEND CUDA_NVCC_FLAGS -I${devicertl_base_directory}
-I${devicertl_nvptx_directory}/src)
- cuda_add_library(omptarget-nvptx STATIC ${cuda_src_files} ${omp_data_objects}
+ cuda_add_library(omptarget-nvptx STATIC unity.cu
OPTIONS ${CUDA_ARCH} ${CUDA_DEBUG})
# Install device RTL under the lib destination folder.