<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - ManagedStatic init-order fiasco bug since Visual Studio 14.27 because of std::atomic value-init constructor"
href="https://bugs.llvm.org/show_bug.cgi?id=49027">49027</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>ManagedStatic init-order fiasco bug since Visual Studio 14.27 because of std::atomic value-init constructor
</td>
</tr>
<tr>
<th>Product</th>
<td>new-bugs
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows NT
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>new bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>amaiorano@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>htmldeveloper@gmail.com, llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Hello, I work at Google on SwiftShader
(<a href="https://swiftshader.googlesource.com/SwiftShader">https://swiftshader.googlesource.com/SwiftShader</a>), a CPU-targeted Vulkan
driver, where we use LLVM's JIT compilation.
I recently added a call to llvm::llvm_shutdown(), which iterates through the
StaticList and destroys each element
(<a href="https://github.com/llvm/llvm-project/blob/62ec4ac90738a5f2d209ed28c822223e58aaaeb7/llvm/lib/Support/ManagedStatic.cpp#L81">https://github.com/llvm/llvm-project/blob/62ec4ac90738a5f2d209ed28c822223e58aaaeb7/llvm/lib/Support/ManagedStatic.cpp#L81</a>).
On MSVC, I tripped the assert in destroy "assert(DeleterFn && "ManagedStatic
not initialized correctly!");". I debugged what was going on, and it turned
out, I had multiple instances of the same ManagedStaticBase instances in the
StaticList, which normally should never happen. I debugged some more, and
figured out what's going on, which I will detail here.
The way ManagedStatic works is that global statics are replaced with this class
to lazily construct them on first access. Other global objects, like cl::opt
instances, in other TUs typically access these MangedStatic instances. To
ensure there are no global constructor order problems, ManagedStatic (and
ManagedStaticBase) are designed to be trivially_constructible types that are
assumed to be zero-initialized by virtue of being static globals. See this
comment here:
<a href="https://github.com/llvm/llvm-project/blob/62ec4ac90738a5f2d209ed28c822223e58aaaeb7/llvm/include/llvm/Support/ManagedStatic.h#L56">https://github.com/llvm/llvm-project/blob/62ec4ac90738a5f2d209ed28c822223e58aaaeb7/llvm/include/llvm/Support/ManagedStatic.h#L56</a>
The first variable beneath that comment is:
mutable std::atomic<void *> Ptr;
'Ptr' is a std::atomic, and up until C++20, this type is guaranteed to have a
default constructor that does not initialize the value (i.e.
trivially_constructible). However, with C++20, this has changed as now the
default constructor will value-initialize the std::atomic (i.e. no longer
trivially_constructible). This is described on cppreference here:
<a href="https://en.cppreference.com/w/cpp/atomic/atomic/atomic">https://en.cppreference.com/w/cpp/atomic/atomic/atomic</a>
1) The default constructor is trivial: no initialization takes place other
than zero initialization of static and thread-local objects. std::atomic_init
may be used to complete
initialization.
(until C++20)
1) Value-initializes the underlying object (i.e. with T()). The
initialization is not atomic. Use of the default constructor is ill-formed if
std::is_default_constructible_v<T>
is false.
(since C++20)
Now the problem is that Visual Studio, since at least version 14.27 (_MSC_VER
1927), has updated their std::atomic type to follow the C++20 rule for
construction - unconditionally, I might add. So now, with this change in their
standard library, ManagedStaticBase is no longer a trivially_constructible
type, which breaks its expected usage. So here's what happens now during global
init:
1. cl::opt ctor calls Option::addArgument, which calls
GlobalParser->addOption(this)
2. GlobalParser is a ManagedStatic, so in its operator*(), it calls
RegisterManagedStatic, which creates the object and sets Ptr (the
atomic<void*>) to it, and adds itself to StaticList.
3. At some point, since MangedStatic is no longer trivially_constructible, its
constructor is run during global init, calling the std::atomic constructor,
setting the previously set Ptr to 0.
4. After that, another cl::opt ctor gets called, accessing GlobalParser again,
and since its Ptr is 0 (again), it allocates another instance, sets Ptr to it,
and adds itself (again) to the StaticList.
This is how I end up with dupes in StaticList, which ends up failing during
llvm_shutdown.
I put together this Compiler Explorer example with demonstrates the problem:
<a href="https://godbolt.org/z/WvPrx6">https://godbolt.org/z/WvPrx6</a>
>From MSVC 19.27 up, Foo::Foo() will emit code that calls the std::atomic<void
*> constructor. Change the version to anything prior (19.25 and before, for
some reason 19.26 isn't there, so I'm not sure what would happen), and the call
to the std::atomic<void *> constructor is gone. Furthermore, uncomment the "#if
0" at the top to have the compilers emit an undefined error on PrintVal<> that
displays whether std::atomic is trivially_constructible. It is for Clang and
GCC and MSVC 19.25 and less, but is not for MSVC 19.27 and up.
The MSVC atomic header for 19.28 (the one I happen to have locally) has the
following constructor definition for class atomic:
constexpr atomic() noexcept(is_nothrow_default_constructible_v<_Ty>) :
_Base() {}
As it has a definition, this makes is non-trivial. I can modify it locally to
make it trivial like so:
constexpr atomic() noexcept(is_nothrow_default_constructible_v<_Ty>) =
default;
Doing this fixes the aforementioned problem.
Phew, I know that was a long explanation, but I just wanted to make sure the
problem is well understood.
Okay, so the issue here is the std::atomic constructor. Although one can argue
that Microsoft should conditionally compile the new atomic constructor
behaviour if __cplusplus > 2020xxL (not sure what the value is), I would say
that this needs to be fixed in LLVM as eventually it will be compiled with a
C++20-enabled compiler.
Some ideas for fixing this:
1. Revert the original change that made 'Ptr' a std::atomic<void*> rather than
a void*:
<a href="https://github.com/llvm/llvm-project/commit/832d0420781083009f758e01f1df59e4a00003b9">https://github.com/llvm/llvm-project/commit/832d0420781083009f758e01f1df59e4a00003b9</a>
(Phabricator:
<a href="https://reviews.llvm.org/rG832d0420781083009f758e01f1df59e4a00003b9">https://reviews.llvm.org/rG832d0420781083009f758e01f1df59e4a00003b9</a>). This
would be a perf regression, but would fix this problem for the future.
2. Make Ptr a void* again, but use atomic operations on it. Unfortunately, C++
atomic functions only works on std::atomic types. C++20 adds std::atomic_ref,
but that's not useful here, I think. So this means adding platform-agnostic
atomic operations on non-std::atomic types. Or add a custom Atomic class - I
think LLVM used to have one?
That's all I've got. If anyone has any ideas, please do share. I'd be happy to
put up a patch to fix this.
Thanks for reading this long!
- Antonio Maiorano
P.S. Not the same, but somewhat related:
<a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - LLVM's ManagedStatic.h fails with VS 2019 <atomic> and clang-cl due to init-order-fiasco and constexpr confusion"
href="show_bug.cgi?id=41367">https://bugs.llvm.org/show_bug.cgi?id=41367</a></pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>