[llvm-dev] Supporting LLVM_BUILD_LLVM_DYLIB on Windows

Saleem Abdulrasool via llvm-dev llvm-dev at lists.llvm.org
Wed Sep 8 15:52:20 PDT 2021


Hello llvm-dev,

One of the current limitations of LLVM on Windows is that you cannot use
LLVM_BUILD_LLVM_DYLIB:
https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-shlib/CMakeLists.txt#L14-L16
I am interested in seeing whether we can lift this limitation.  Others in
the community also seem interested in being able to use LLVM as a DLL on
Windows, and the topic comes up on the mailing lists every so often.

When you build a distribution of an LLVM-based toolchain currently, the
result on Windows is ~2GiB for a trimmed down toolset.  This is largely due
to the static linking used for all the tools.  I would like to be able to
use the shared LLVM build for building a toolset on Windows.

Unlike Unix platforms, the default on Windows is that all symbols are
treated as `dso_local` (that is, `-fvisibility=hidden`).  Symbols which are
meant to participate in dynamic linking are to be attributed as
`__declspec(dllexport)` in the module and `__declspec(dllimport)` external
to the module.  This is similar to Unix platforms, where
`__attribute__((__visibility__(...)))` controls the same type of behaviour
when building with `-fvisibility=hidden`.
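
To make the two spellings concrete, here is a minimal sketch of a single
decoration macro that picks the right attribute per platform.  The names
MYLIB_ABI, MYLIB_BUILDING, and the mylib_* functions are hypothetical and
exist only for illustration; they are not part of LLVM.

```cpp
// Sketch only: MYLIB_ABI and MYLIB_BUILDING are hypothetical names.
#include <cassert>

// Normally the build system defines this only while compiling the library
// itself; it is defined here so the example builds as a single file.
#define MYLIB_BUILDING 1

#if defined(_WIN32)
#  if defined(MYLIB_BUILDING)
#    define MYLIB_ABI __declspec(dllexport) // inside the module
#  else
#    define MYLIB_ABI __declspec(dllimport) // external to the module
#  endif
#elif defined(__GNUC__)
#  define MYLIB_ABI __attribute__((__visibility__("default")))
#else
#  define MYLIB_ABI
#endif

// Public interface: remains visible across the module boundary even when
// the library is compiled with -fvisibility=hidden.
MYLIB_ABI int mylib_scaled_sum(int a, int b);

// Not annotated: internal to the module on Windows, and on ELF/Mach-O
// targets once -fvisibility=hidden is in effect.
int mylib_detail_twice(int x);

int mylib_detail_twice(int x) { return 2 * x; }
int mylib_scaled_sum(int a, int b) { return mylib_detail_twice(a) + b; }

// Small sanity check of the decorated function.
void mylib_self_check() { assert(mylib_scaled_sum(2, 3) == 7); }
```

On Unix the import and export sides share one attribute, which is why only
the Windows branch needs to know whether the module itself is being built.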

For the case of distributions, it would remain valuable to minimize the
number of shared objects, both to reduce the number of files that need to
be shipped and to minimize the number of cross-module calls, which are not
entirely free (i.e. PLT+GOT or IAT costs).  At the same time, the number of
symbols which can be exported from a single module on Windows is limited to
64K.  Experience from MSYS2 indicates that LLVM with all the backends is
likely to exceed this count (with a subset of targets, the number is
already close to 60K).  This means that we may need two libraries on
Windows.

With the LLVM community being diverse, people often build on different
platforms with different configurations, and I am concerned that adding
more differences in how we build libraries complicates how maintainable
LLVM is.  I would suggest that we actually change the behavior of the Unix
builds to match that of Windows by building with
`-fvisibility=hidden`.  Although this is a change, it is not
without value.  By explicitly marking the interfaces which are vended by a
library and making everything else internal, it does enable some potential
optimization options for the compiler and linker (to be clear, I am not
suggesting that this will have a guaranteed benefit, just that it can
potentially enable additional opportunities for optimizations and size
reductions).  This should incidentally help static linking.

In order to achieve this, we would need to have a module specific
annotation to indicate what symbols are meant to be used outside of the
module when built in a shared configuration.  The same annotation would
apply to all targets and is expected to be applied uniformly.  This of
course has a cost associated with it: the public interfaces would need to
be decorated appropriately.  However, by having the same behaviour on all
the platforms, developers would not be impacted by the platform differences
in their day-to-day development.  The only time that developers would need
to be aware of this is when they are working on the module boundary, that
is, changes which do not change the API surface of LLVM would not need to
consider the annotations.

Concretely, what I believe is required to enable building with
LLVM_BUILD_LLVM_DYLIB on Windows is:
- introduce module specific decoration (e.g. LLVM_SUPPORT_ABI, ...) to mark
public interfaces of shared library modules
- decorate all the public interfaces of the shared library modules with the
new decoration
- switch the builds to use `-fvisibility=hidden` by default
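
As a rough illustration of the first two steps, the sketch below shows how
per-module decorations could resolve differently depending on which module
is being compiled.  Only LLVM_SUPPORT_ABI is a name from this proposal;
LLVM_CORE_ABI, the *_BUILD defines, and the DEMO_* helpers are assumptions
made for the example and are not taken from D109192.

```cpp
// Sketch only.  LLVM_SUPPORT_ABI is the name used in the proposal;
// LLVM_CORE_ABI, the *_BUILD defines, and DEMO_* helpers are hypothetical.
#include <cassert>
#include <string>

#if defined(_WIN32)
#  define DEMO_EXPORT __declspec(dllexport)
#  define DEMO_IMPORT __declspec(dllimport)
#elif defined(__GNUC__)
#  define DEMO_EXPORT __attribute__((__visibility__("default")))
#  define DEMO_IMPORT __attribute__((__visibility__("default")))
#else
#  define DEMO_EXPORT
#  define DEMO_IMPORT
#endif

// Pretend this file is being compiled as part of the Support module; a real
// build would pass the define on the command line for that module only.
#define LLVM_SUPPORT_BUILD 1

#if defined(LLVM_SUPPORT_BUILD)
#  define LLVM_SUPPORT_ABI DEMO_EXPORT
#else
#  define LLVM_SUPPORT_ABI DEMO_IMPORT
#endif

#if defined(LLVM_CORE_BUILD)
#  define LLVM_CORE_ABI DEMO_EXPORT
#else
#  define LLVM_CORE_ABI DEMO_IMPORT
#endif

// Stringify the fully expanded decoration so it can be inspected at run time.
#define DEMO_STR_IMPL(...) #__VA_ARGS__
#define DEMO_STR(...) DEMO_STR_IMPL(__VA_ARGS__)

std::string supportDecoration() { return DEMO_STR(LLVM_SUPPORT_ABI); }
std::string coreDecoration() { return DEMO_STR(LLVM_CORE_ABI); }

// A public interface of the Support module carries that module's decoration.
LLVM_SUPPORT_ABI int supportAnswer();
int supportAnswer() { return 42; }

void checkSupportIsExported() {
  // Only Windows distinguishes export from import; elsewhere both agree.
#if defined(_WIN32)
  assert(supportDecoration() != coreDecoration());
#else
  assert(supportDecoration() == coreDecoration());
#endif
}
```

The point of one macro per module is that, while building Support, its own
interfaces are exported and every other module's interfaces are imported.
A single global macro could not express both at once on Windows.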

I believe that these can be done mostly independently and staged in the
order specified.  Until the last phase, it would have no actual impact on
the builds.  However, by staging it, we could allow others to experiment
with the option while it is under development, and allow for an easier
path for switching the builds over.

This would enable LLVM_BUILD_LLVM_DYLIB on Windows, give us better
uniformity between Windows and non-Windows platforms, potentially enable
additional optimization benefits, improve binary sizes for a distribution
of the toolchain (though less so on Linux, where distributors are already
using this build configuration despite the official suggestions in the
LLVM guides), and help with runtime costs of the toolchain (by making the
core of the tools a shared library, the backing pages can be shared across
multiple instances).  However, it is not entirely without downsides.  The
primary downsides that I see are:
- it becomes less enticing to support both LLVM_BUILD_LLVM_DYLIB and
BUILD_SHARED_LIBS: while technically possible, interfaces will need to be
decorated for both forms of the build
- LLVM_DYLIB_COMPONENTS becomes less tractable: in theory it is possible to
apply enough CPP magic to determine where a symbol is homed, but allowing a
symbol to be homed in a shared or static library is significantly more
complex
- BUILD_SHARED_LIBS becomes more expensive to maintain: the decoration is
per-module, which means that we would need to decorate the symbols of
each module with module specific annotations as well

One argument that people make for BUILD_SHARED_LIBS is that it reduces the
overall time of the build-test cycle.  With the combination of lld, DWARF
Fission, and LLVM_BUILD_LLVM_DYLIB, I believe that most of the benefits
can still be had.  The cost of linking all the tools is amortized across
the link of a single library, which, while not as small as a singular
library, is offset by the following:
- LLVM_BUILD_LLVM_DYLIB would not require the re-linking of all the
libraries for each tool.
- DWARF Fission would avoid the need to relink all of the DWARF information.
- lld is faster than the gold and bfd linkers.

Header changes would still ripple through the system as before, requiring
rebuilding the transitive closure.  Source file changes do not have the
same impact of course.

For those who would like a more concrete example of what a change like this may
shape up into: https://reviews.llvm.org/D109192 contains
`LLVMSupportExports.h` which has the expected structure for declaring the
decoration macros with the rest of the change primarily being focused on
applying the decoration.  Please ignore the CMake changes as they are there
to ensure that the CI validates this without changing the configuration and
not intended to be part of the final version of the change.

-- 
Saleem Abdulrasool
compnerd (at) compnerd (dot) org