[all-commits] [llvm/llvm-project] a1da74: [AMDGPU] Place global constructors in .init_array ...

Joseph Huber via All-commits all-commits at lists.llvm.org
Sat Apr 29 06:40:36 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: a1da7461571cf1763136e22a018a20a271bb70b9
      https://github.com/llvm/llvm-project/commit/a1da7461571cf1763136e22a018a20a271bb70b9
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2023-04-29 (Sat, 29 Apr 2023)

  Changed paths:
    M llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
    M llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll
    M llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-existing.ll
    M llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll

  Log Message:
  -----------
  [AMDGPU] Place global constructors in .init_array and .fini_array

For the GPU, we emit external kernels that call the initializers and
constructors. However, if we had a persistent kernel, like the `_start`
kernel in the `libc` project, we could call the constructors the
standard way. This patch adds new global variables containing
pointers to the constructors to be called. If these are placed in the
`.init_array` and `.fini_array` sections, then the backend will handle
them specially. The linker will then provide the `__init_array_*` and
`__fini_array_*` start / end symbols to traverse them. An implementation
would look like this.

```
#include <cstdint>

// Bounds of the init / fini arrays; these symbols come from the linker.
extern uintptr_t __init_array_start[];
extern uintptr_t __init_array_end[];
extern uintptr_t __fini_array_start[];
extern uintptr_t __fini_array_end[];

using InitCallback = void(int, char **, char **);
using FiniCallback = void(void);

extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
_start(int argc, char **argv, char **envp) {
  // Call every constructor recorded in .init_array.
  uint64_t init_array_size = __init_array_end - __init_array_start;
  for (uint64_t i = 0; i < init_array_size; ++i)
    reinterpret_cast<InitCallback *>(__init_array_start[i])(argc, argv, envp);
  // Call every destructor recorded in .fini_array.
  uint64_t fini_array_size = __fini_array_end - __fini_array_start;
  for (uint64_t i = 0; i < fini_array_size; ++i)
    reinterpret_cast<FiniCallback *>(__fini_array_start[i])();
}
```
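
For context, the entries that such a loop walks usually come from
functions marked as constructors in user code. The following is a
minimal host-side sketch, assuming a GCC or Clang toolchain; the names
`init_first`, `init_second`, `call_order`, `call_count`, and
`check_constructors_ran` are hypothetical and not part of this patch.

```cpp
#include <cassert>

// Plain ints are constant-initialized, so they are valid before any
// constructor function runs.
int call_order[2];
int call_count = 0;

// Hypothetical constructor functions. Lower priority values run first;
// values up to 100 are reserved for the implementation.
[[gnu::constructor(101)]] void init_first() {
  call_order[call_count++] = 101;
}

[[gnu::constructor(102)]] void init_second() {
  call_order[call_count++] = 102;
}

// Call from main() to confirm that both functions already ran, in
// priority order, before main() started.
void check_constructors_ran() {
  assert(call_count == 2);
  assert(call_order[0] == 101 && call_order[1] == 102);
}
```

On a host target the startup code runs these before `main`; on the GPU
the `_start` kernel above would take that role.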

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D149340


  Commit: 1b823abea74d5a43c4778a252f7d2d3a9a5768c2
      https://github.com/llvm/llvm-project/commit/1b823abea74d5a43c4778a252f7d2d3a9a5768c2
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2023-04-29 (Sat, 29 Apr 2023)

  Changed paths:
    M libc/startup/gpu/amdgpu/CMakeLists.txt
    M libc/startup/gpu/amdgpu/start.cpp
    M libc/test/integration/startup/gpu/CMakeLists.txt
    A libc/test/integration/startup/gpu/init_fini_array_test.cpp

  Log Message:
  -----------
  [libc] Add support for global ctors / dtors for AMDGPU

This patch makes the necessary changes to support calling global
constructors and destructors on the GPU. The patch in D149340 allows the
`lld` linker to create the symbols pointing us to these globals. These
should be executed by a single thread, which is more difficult on the
GPU because all threads are active. I chose to use an atomic counter to
sync every thread on the GPU. This is very slow if you use more than a
few thousand threads, but for testing purposes it should be sufficient.
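
The idea of letting one thread run the constructors while the rest wait
on an atomic counter can be sketched on the host as follows. This is an
analogy using `std::thread`, not the code in `start.cpp`; the names
`CounterBarrier` and `run_with_single_initializer` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical single-use barrier built on one atomic counter.
struct CounterBarrier {
  std::atomic<uint32_t> arrived{0};

  // Record this thread's arrival, then spin until all threads arrived.
  void wait(uint32_t num_threads) {
    arrived.fetch_add(1, std::memory_order_acq_rel);
    while (arrived.load(std::memory_order_acquire) < num_threads)
      std::this_thread::yield();
  }
};

// Launches `num_threads` threads. Only thread 0 performs the
// initialization; every thread then waits on the barrier. Returns how
// many threads observed the initialized state after the barrier.
uint32_t run_with_single_initializer(uint32_t num_threads) {
  assert(num_threads > 0);
  CounterBarrier barrier;
  int initialized = 0; // written by thread 0 only, before the barrier
  std::atomic<uint32_t> observers{0};

  std::vector<std::thread> threads;
  for (uint32_t id = 0; id < num_threads; ++id) {
    threads.emplace_back([&, id] {
      if (id == 0)
        initialized = 1; // stand-in for walking .init_array
      barrier.wait(num_threads);
      if (initialized == 1)
        observers.fetch_add(1, std::memory_order_relaxed);
    });
  }
  for (std::thread &t : threads)
    t.join();
  return observers.load();
}
```

Every waiter polls the same counter, which is why this approach gets
slow as the thread count grows.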

Depends on D149340 D149363

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D149398


Compare: https://github.com/llvm/llvm-project/compare/bc37be185577...1b823abea74d

