[PATCH] D137470: [Offloading] Initial support for registering offloading entries on COFF targets

Tue Nov 15 14:49:08 PST 2022

jhuber6 added a comment.

In D137470#3928828 <https://reviews.llvm.org/D137470#3928828>, @mstorsjo wrote:

> Sorry, I'm not quite up to speed with exactly what is being done linker-wise here - can you give a more detailed overview? Keep in mind that there are two separate interfaces to lld for COFF: when used in mingw mode, Clang invokes the `ld.lld` frontend, but with a `-m` option which directs lld to the mingw frontend, which parses `ld.lld` style options, rewrites them to `lld-link` style options, and invokes that interface. And when Clang is operating in msvc/clang-cl mode, `lld-link` is invoked (or called directly by the build system).

Sure, there's a bit of documentation <https://clang.llvm.org/docs/OffloadingDesign.html> for what's going on here, but I may need to update it a bit.

Basically, for offloading languages (CUDA, HIP, OpenMP, etc.) we compile the source code twice: once for the host and once for the target device. We embed the device relocatable object inside the host object so that we can follow a standard compilation pipeline. The `clang-linker-wrapper` tool then fishes those relocatable objects out and performs the device-linking phase. The linked output is then put into a global along with some runtime calls to register the image and kernels. That new file gets passed to the wrapped linker job and we get a final executable.

My concern is that the linker wrapper keys off of certain arguments to the linker to do its job, since it's invoked as something like `clang-linker-wrapper <linker-args>`. I understand these arguments are fundamentally different for `lld-link`, so I was wondering whether this approach would work there at all.
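
To make that concern concrete, here is a minimal sketch of what "keying off linker arguments" could look like for just the output file. It is not the wrapper's actual code: `LinkerStyle`, `startsWithLower`, and `findOutputFile` are hypothetical names, only the `-o` and `/out:` spellings are handled, and treating `lld-link` option names as case-insensitive is an assumption carried over from MSVC `link.exe`.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical: which option syntax the wrapped linker command line uses.
enum class LinkerStyle { Gnu, Coff };

// Returns true if Arg begins with Prefix, comparing Arg case-insensitively.
// Prefix is expected to be lowercase already.
static bool startsWithLower(const std::string &Arg, const std::string &Prefix) {
  if (Arg.size() < Prefix.size())
    return false;
  return std::equal(Prefix.begin(), Prefix.end(), Arg.begin(),
                    [](char P, char A) {
                      return P == static_cast<char>(std::tolower(
                                      static_cast<unsigned char>(A)));
                    });
}

// Hypothetical helper: extract the output file name from linker arguments.
// The same spelling can mean different things in the two styles (for example
// `-opt:ref` would parse as `-o pt:ref` in GNU style), so the caller has to
// say which style applies.
std::optional<std::string> findOutputFile(const std::vector<std::string> &Args,
                                          LinkerStyle Style) {
  for (std::size_t I = 0; I < Args.size(); ++I) {
    const std::string &Arg = Args[I];
    if (Style == LinkerStyle::Coff) {
      // lld-link style: /out:file or -out:file.
      if (startsWithLower(Arg, "/out:") || startsWithLower(Arg, "-out:"))
        return Arg.substr(5);
      continue;
    }
    // ld.lld style: separate `-o file` or joined `-ofile`.
    if (Arg == "-o") {
      if (I + 1 < Args.size())
        return Args[I + 1];
      return std::nullopt;
    }
    if (Arg.size() > 2 && Arg.compare(0, 2, "-o") == 0)
      return Arg.substr(2);
  }
  return std::nullopt;
}
```

The explicit `LinkerStyle` parameter reflects the point above: the wrapper would need to know which frontend it is wrapping before it can interpret any argument.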

================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:145
+  // For COFF targets, sections with 8 or fewer characters containing a '$' will
+  // be merged into the same section at runtime. The order is determined by the
+  // alphabetical ordering of the text after the '$' character. Here we generate
----------------
mstorsjo wrote:
> jhuber6 wrote:
> > mstorsjo wrote:
> > > FWIW, this comment doesn't feel entirely accurate: Regardless of the length of the section name, all sections with names of the form `name$suffix` will get merged into the same section `name` (sorted by the suffix). Then if `name` is 8 chars or less, the name is kept in the section table (so that it can easily be looked up at runtime), while if it is longer, the full name is kept in the string table (which is not mapped at runtime).
> > > 
> > > Also as an extra side note; we added an exception into lld for `.eh_frame` - this is 9 chars, but libunwind wants to locate the section at runtime. So for that case, lld truncates it to `.eh_fram`. (This behaviour is lld specific, to appease libunwind - binutils doesn't do that, and libgcc locates that section differently.)
> > I see, I'm not that familiar with the inner workings of the COFF linking process. All that matters for this use-case is whether or not we can get a pointer to the array. In that case we shouldn't need to worry about the eight-character limit, right?
> If you locate the contents at runtime by using specific symbols that point to the start and end of the data, then yes, you don't need to worry about keeping it below the 8 char limit.
> 
> The 8 char limit is relevant if you enumerate and iterate over the sections of a DLL/EXE at runtime, and try to locate the section dynamically that way.
Good to know. I may change the section names to be more verbose then, something like `cuda.entries$OE`.
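
For reference, the merging rule described above can be modeled with a small toy program. This is only a sketch of the behaviour as stated in this thread, not lld's implementation: `InputSection`, `mergeGroupedSections`, and `fitsInSectionTable` are hypothetical names, and the `$OA`/`$OZ` suffixes in the usage note are assumed placeholders for start and end markers.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model of an input section: its full name plus a label standing in for
// its contents.
struct InputSection {
  std::string Name;
  std::string Contents;
};

// Merge `name$suffix` input sections into output sections called `name`,
// with contributions ordered by suffix. A name without '$' has an empty
// suffix and therefore sorts first.
std::map<std::string, std::vector<std::string>>
mergeGroupedSections(std::vector<InputSection> Inputs) {
  auto Split = [](const std::string &Name) {
    std::size_t Dollar = Name.find('$');
    if (Dollar == std::string::npos)
      return std::make_pair(Name, std::string());
    return std::make_pair(Name.substr(0, Dollar), Name.substr(Dollar + 1));
  };
  // Stable sort keeps the input order for sections with identical names.
  std::stable_sort(Inputs.begin(), Inputs.end(),
                   [&](const InputSection &A, const InputSection &B) {
                     return Split(A.Name) < Split(B.Name);
                   });
  std::map<std::string, std::vector<std::string>> Outputs;
  for (const InputSection &S : Inputs)
    Outputs[Split(S.Name).first].push_back(S.Contents);
  return Outputs;
}

// Only names of 8 characters or fewer fit directly in the section table;
// longer names live in the string table, which is not mapped at runtime.
bool fitsInSectionTable(const std::string &OutputName) {
  return OutputName.size() <= 8;
}
```

With inputs named `cuda.entries$OZ`, `cuda.entries$OE`, and `cuda.entries$OA`, the model produces one output section `cuda.entries` whose contributions are ordered `$OA`, `$OE`, `$OZ`. The name is longer than 8 characters, which only matters if something enumerates sections by name at runtime rather than using start and end symbols.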

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137470/new/

https://reviews.llvm.org/D137470