[libc-commits] [PATCH] D149527: [libc] Support global constructors and destructors on NVPTX
Joseph Huber via Phabricator via libc-commits
libc-commits at lists.llvm.org
Mon May 1 11:23:11 PDT 2023
jhuber6 added inline comments.
================
Comment at: libc/startup/gpu/nvptx/start.cpp:23
+extern "C" {
+// Nvidia's 'nvlink' linker does not provide these sections. We instead need
+// to manually create them and update the globals in the loader implementation.
----------------
tra wrote:
> The comment is somewhat puzzling.
>
> The sections themselves would be created by whatever generates the object files, before the linker gets involved.
> IIRC from our exchange on discourse, the actual problem was that nvlink discards the sections it's not familiar with and that's why we can't just put the initializers into a known init/fini sections and have to rely on putting initializers among regular data and use explicit symbols to find them.
Sorry, I meant to say "symbols" here. Normally, when the linker finds a `.init_array` or `.fini_array` section it provides these symbol names so you can traverse the section. This is the behavior in `ld.lld`, which is what AMD uses. Nvidia neither provides a way to place these in the `.init_array` section, nor does `nvlink` create the symbols if you force the sections to exist. The latter could potentially be solved by reinventing `nvlink` in `lld`; the former is a more difficult problem. Maybe there's a way to hack around this in the PTX Compiler API.
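For context, this is roughly the traversal startup code can do when the linker *does* define these symbols (`__init_array_start` and friends are the standard names emitted by `ld.lld`/GNU ld; the loop bodies here are a sketch, not the exact code in this patch — on NVPTX the equivalent globals have to be created manually and patched by the loader instead):

```
#include <stdint.h>

// Defined by linkers like ld.lld when an .init_array / .fini_array
// section is present. nvlink provides neither, which is why the NVPTX
// startup has to create equivalent globals itself.
extern "C" uintptr_t __init_array_start[];
extern "C" uintptr_t __init_array_end[];
extern "C" uintptr_t __fini_array_start[];
extern "C" uintptr_t __fini_array_end[];

using InitCallback = void(void);

static void call_init_array_callbacks() {
  // Constructors run in increasing address order.
  for (uintptr_t *fn = __init_array_start; fn != __init_array_end; ++fn)
    reinterpret_cast<InitCallback *>(*fn)();
}

static void call_fini_array_callbacks() {
  // Destructors run in the reverse order of the constructors.
  for (uintptr_t count = __fini_array_end - __fini_array_start; count > 0;
       --count)
    reinterpret_cast<InitCallback *>(__fini_array_start[count - 1])();
}
```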
================
Comment at: libc/test/IntegrationTest/test.cpp:75
+#if defined(__NVPTX__)
+// FIXME: For unknown reason the Nvidia toolchain defines this to be a single
+// byte and will error on a mismatched size otherwise.
----------------
tra wrote:
> What exactly is the 'toolchain' in this context?
I mean a compilation targeting `nvptx64`, e.g. `clang++ test.cpp --target=nvptx64-nvidia-cuda -march=sm_70 -c`.
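Concretely, the workaround the FIXME describes looks something like this (a sketch of the kind of conditional definition the `#if defined(__NVPTX__)` guard suggests, not necessarily the exact patch text):

```
#include <stdint.h>

#if defined(__NVPTX__)
// The Nvidia toolchain treats __dso_handle as a one-byte object and
// errors out on a size mismatch, so define it to match.
extern "C" uint8_t __dso_handle = 0;
#else
// Integration tests are linked with -nostdlib. The BFD linker expects
// __dso_handle when -nostdlib is used.
extern "C" void *__dso_handle = nullptr;
#endif
```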
================
Comment at: libc/test/IntegrationTest/test.cpp:80
// Integration tests are linked with -nostdlib. BFD linker expects
// __dso_handle when -nostdlib is used.
void *__dso_handle = nullptr;
----------------
tra wrote:
> What exactly needs this symbol?
>
> I'm surprised we need to care about DSOs on NVPTX as we do not have any there.
>
> Googling around (https://stackoverflow.com/questions/34308720/where-is-dso-handle-defined) suggests that we may avoid the issue by compiling with `-fno-use-cxa-atexit`.
>
> @MaskRay -- any suggestions on what's the right way to deal with this?
That might work for these hermetic tests, since we should provide the base `atexit`. @sivachandra, do you think we could use this? It seems to be supported by both Clang and GCC.
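For reference, here is a sketch of the difference the flag makes (the `__cxa_atexit` signature is the standard Itanium C++ ABI one; the function bodies below only illustrate what the compiler emits for a static destructor, they are not code from this patch). By default the destructor is registered through `__cxa_atexit` with `&__dso_handle`, which is why the symbol must exist; with `-fno-use-cxa-atexit` the compiler registers a plain `atexit` callback and `__dso_handle` is never referenced:

```
#include <cstdlib>

// Itanium C++ ABI hook used by default to register static destructors.
extern "C" int __cxa_atexit(void (*func)(void *), void *arg, void *dso);
extern "C" void *__dso_handle;

struct Global {
  ~Global() {}
};
static Global g;

// Default: roughly what the compiler emits for 'g', associating the
// destructor with this DSO via &__dso_handle.
static void register_with_cxa_atexit() {
  __cxa_atexit([](void *obj) { static_cast<Global *>(obj)->~Global(); }, &g,
               &__dso_handle);
}

// With -fno-use-cxa-atexit: a plain atexit callback; __dso_handle is
// never referenced, so the tests would not need to define it.
static void register_with_atexit() {
  atexit(+[] { g.~Global(); });
}
```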
================
Comment at: libc/utils/gpu/loader/nvptx/CMakeLists.txt:12-13
CUDA::cuda_driver
+ LLVMObject
+ LLVMSupport
)
----------------
tra wrote:
> Can you check how long the clean build of the tool with `-j 6` would take now? If it's in the ballpark of a minute or so, we can probably live with that. Otherwise we should build the tool along with clang/LLVM, similar to how we deal with `libc-hdrgen`.
Does this assume we need to rebuild the libraries? I figured these would be copied somewhere in the build environment. On my machine rebuilding the loader takes about five seconds.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D149527/new/
https://reviews.llvm.org/D149527