[llvm-dev] LLVM_DYLIB and CLANG_DYLIB with MSVC
Martin Storsjö via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 3 14:09:56 PDT 2021
On Sun, 30 May 2021, Cristian Adam via llvm-dev wrote:
> Due to the nature of MSVC regarding default visibility of symbols (hidden by
> default, whereas MinGW has visible by default), one needs to generate a .def
> file with the symbols needed to be exported.
> This is done already in two cases for LLVM_BUILD_LLVM_C_DYLIB
> (llvm/tools/llvm-shlib/gen-msvc-exports.py) and for
> LLVM_EXPORT_SYMBOLS_FOR_PLUGINS (llvm/utils/extract_symbols.py).
> I've put together a patch that enables LLVM_DYLIB and CLANG_DYLIB for MSVC.
> I tested with clang-cl from the official Clang 12 x64 Windows binary
> * Normal build: 1,42 GB
> * shlib build: 536 MB
> The shlib release build compiled and linked fine with LLVM.dll and
> clang-cpp.dll, unfortunately it crashes at runtime.
Without digging into the scripts, I have one hunch:
Does the def generator script differentiate between code and data symbols?
When code accesses a data symbol from another DLL, the caller has to have
seen a declaration with the dllimport attribute. For functions it doesn't
matter (the call just takes an extra hop via the import thunk), but for
data variables it does. If the def file had proper DATA annotations for
such symbols, you would end up with linker errors (an undefined reference
to dataSymbol, because the import library only provides __imp_dataSymbol).
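To make the code/data distinction concrete, here is a minimal C++ sketch of what such declarations could look like. The names MYLIB_ABI, MYLIB_BUILDING_DLL, MYLIB_USING_DLL, gCounter and bumpCounter are all made up for illustration and are not taken from LLVM. The sketch is arranged so that it also compiles as a single file when none of the macros are defined.

```cpp
// Hypothetical export macro: dllexport while building the DLL, dllimport
// in code that uses the DLL, and nothing otherwise (static or non-Windows).
#if defined(_WIN32) && defined(MYLIB_BUILDING_DLL)
#  define MYLIB_ABI __declspec(dllexport)
#elif defined(_WIN32) && defined(MYLIB_USING_DLL)
#  define MYLIB_ABI __declspec(dllimport)
#else
#  define MYLIB_ABI
#endif

// --- What a public header would declare ---

// Data symbol: a caller in another module must see dllimport here, so the
// compiler generates an indirect access through __imp_gCounter.
extern MYLIB_ABI int gCounter;

// Function symbol: dllimport is optional for callers; without it the call
// simply goes through the import thunk.
MYLIB_ABI int bumpCounter();

// --- What the library itself would define (lives inside the DLL) ---
// In a real project these definitions are only compiled with
// MYLIB_BUILDING_DLL set, never with MYLIB_USING_DLL.

int gCounter = 0;

int bumpCounter() { return ++gCounter; }
```

A caller built with MYLIB_USING_DLL then reads gCounter through the import table instead of expecting a directly addressable symbol.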
When linking in mingw mode, this is fixed up by the autoimport feature
(which, in general, requires you to link against the mingw runtime too).
When the caller references dataSymbol but only __imp_dataSymbol is
available, the linker adds an entry to a list of pseudo relocations. The
mingw runtime processes that list at load time: it maps the affected
sections as writable and patches the addresses to point at where the
variables are located in the other DLL.
So to make this work with MSVC, we would either need to provide proper
dllimport declarations, at least for all data symbols, or avoid cross-DLL
data accesses altogether (e.g. by using accessor functions instead).
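As a rough illustration of the accessor-function alternative, the sketch below keeps the variable private to the library and exposes only functions, which work across a DLL boundary via the import thunk even when the caller has not seen a dllimport declaration. The namespace mylib and the names DebugLevel, getDebugLevel and setDebugLevel are hypothetical and do not refer to existing LLVM APIs.

```cpp
namespace mylib {

namespace detail {
// The variable is never referenced directly from outside the library, so
// it does not need to be exported as data at all.
int DebugLevel = 0;
} // namespace detail

// Only these functions would need to appear in the export list.
int getDebugLevel() { return detail::DebugLevel; }

void setDebugLevel(int Level) { detail::DebugLevel = Level; }

} // namespace mylib
```

The cost is an extra call per access, so this fits rarely touched globals better than hot data.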