[PATCH] D126341: Order implicitly instantiated global variable's initializer by the reverse instantiation order

Richard Smith - zygoloid via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 25 16:44:32 PDT 2022


rsmith added a comment.

> The underlying problem is basically wg21.link/cwg362 which has no concensus yet.

CWG362 has a clear consensus and has been closed with that consensus for 18 years. The consensus, per that issue, is:

> We discussed this and agreed that **we really do mean the the order is unspecified**.

which seems clear that our current behavior is a valid choice.

In D126341#3537947 <https://reviews.llvm.org/D126341#3537947>, @rnk wrote:

> First, why should these guarantees be limited to instantiations and not inline variables? Such as:
>
>   int f();
>   inline int gv1 = f();
>   inline int gv2 = gv1 + 1; // rely on previous

Inline variables have initialization order guarantees already. An inline variable `B` can rely on a prior inline variable `A` being initialized first if `A` is defined before `B` in every TU where `B` is defined. And a non-inline variable `C` can rely on a prior inline variable `B` being initialized first if `B` is defined before `C`.

There is no comparable guarantee for instantiated variables, in part because an intended implementation strategy is to instantiate them separately, and combine those instantiation units with the compilation units at link time in an arbitrary order, and in part because instantiation order could in general be different in different TUs.

> Second, LLVM doesn't guarantee that global_ctors at the same priority execute in order. See the langref: https://llvm.org/docs/LangRef.html#the-llvm-global-ctors-global-variable So, without a guarantee from LLVM, Clang can't rely on this behavior. LLVM relies on this lack of an ordering guarantee to power globalopt.

It doesn't seem reasonable for us to provide a guarantee here, and in fact, we //can't// provide a guarantee in general (once there's more than one TU, a consistent global total order might not exist). I think the question is, should we make some minor effort to make the broken code that's relying on initialization order of unordered variables happen to do the right thing most of the time, or should we keep the status quo that is likely to result in problems happening earlier? I don't think it's clear whether the superficial increase in compatibility with GCC is a net positive or negative -- giving the surprising initialization order may counter-intuitively help people to find ordering bugs faster.



================
Comment at: clang/lib/CodeGen/CGDeclCXX.cpp:582
 
-    AddGlobalCtor(Fn, 65535, COMDATKey);
+    AddGlobalCtor(Fn, 65535, COMDATKey, IsInstantiation);
     if (COMDATKey && (getTriple().isOSBinFormatELF() ||
----------------
This is effectively initializing instantiated variables in reverse instantiation order. That seems like it'll make things worse as much as it makes things better. For example, given:

```
#include <iostream>
template<typename T> int a = (std::cout << "hello, ", 0);
template<typename T> int b = (std::cout << "world", 0);
int main() {
  (void)a<int>;
  (void)b<int>;
}
```

... we currently print `"hello, world"`, but with this change we'll print `"worldhello, "`.

If we want a sensible initialization order, I think we need a different strategy, that will probably require `Sema` to be a lot more careful about what order it instantiates variables in and what order it passes them to the AST consumer: if an instantiation A triggers another instantiation B, we should defer passing A to the consumer until B has been instantiated and passed to the consumer. That's probably not too hard to implement, by adding an entry to the pending instantiation list to say "now pass this to the consumer" in the case where one instantiation triggers another. But I do wonder whether that level of complexity is worthwhile, given that code relying on this behavior is broken.


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1562-1563
   // FIXME: Type coercion of void()* types.
-  GlobalCtors.push_back(Structor(Priority, Ctor, AssociatedData));
+  if (InsertFront)
+    GlobalCtors.emplace(GlobalCtors.begin(), Priority, Ctor, AssociatedData);
+  else
----------------
This is quadratic in the number of instantiated variables. I don't think that's acceptable.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126341/new/

https://reviews.llvm.org/D126341



More information about the cfe-commits mailing list