[PATCH] D74361: [Clang] Undef attribute for global variables

Jon Chesterfield via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Mar 2 13:19:15 PST 2020


JonChesterfield marked 2 inline comments as done.
JonChesterfield added inline comments.


================
Comment at: clang/test/CodeGenCXX/attr-loader-uninitialized.cpp:23
+// CHECK: @nominally_value_init = global i32 undef
+int nominally_value_init  [[clang::loader_uninitialized]] = 4;
+
----------------
Quuxplusone wrote:
> rjmccall wrote:
> > JonChesterfield wrote:
> > > Quuxplusone wrote:
> > > > This test case is identical to line 36 of clang/test/Sema/attr-loader-uninitialized.cpp, where you say you don't want it to compile at all.
> > > > 
> > > > I think you need a clearer idea of how this interacts with initializers. Is it merely supposed to eliminate the //zero-initialization// that happens before the user-specified construction/initialization, or is it supposed to compete with the user-specified construction/initialization?
> > > > 
> > > > That is, for
> > > > 
> > > >     nontrivial unt [[clang::loader_uninitialized]];
> > > > 
> > > > is it merely supposed to call `unt::unt()` on a chunk of undef memory (instead of the usual chunk of zeroed memory), or is it supposed to skip the constructor entirely? And for
> > > > 
> > > >     int x [[clang::loader_uninitialized]] = foo();
> > > > 
> > > > is it merely supposed to call `foo()` and assign the result to a chunk of undef memory (instead of the usual chunk of zeroed memory), or is it supposed to skip the initialization entirely?
> > > I think you commented while the first working piece of sema landed. My thinking is relatively clear but my understanding of clang's semantic analysis is a work in progress!
> > > 
> > > Initializers (`= foo()`) are straightforward. Error on the basis that the attribute effectively means `= undef`, and one should not have two initializers. A test case is now added for that (and now passes).
> > > 
> > > The codegen I want for a default constructed global marked with this variable is:
> > > - global variable allocated, with undef as the original value
> > > - default constructor call synthesized
> > > - said default constructor set up for invocation from crt, before main, writing over the undef value
> > > 
> > > Where the default constructor can be optimized as usual, e.g. if it always writes a constant, we can init with that constant instead of the undef and elide the constructor.
> > > 
> > > I don't have that actually working yet - the constructor call is not being emitted, so we just have the undef global.
> > > 
> > > I think it's important to distinguish between the values of the bits when the program is loaded and whether constructor/destructors are called, as one could want any combination of the two.
> > I think Arthur is suggesting that it would be useful to allow the attribute to be used in conjunction with an initializer in C++, since if the initializer has to be run dynamically, we can still meaningfully suppress the static zero-initialization.   That is, we've accepted that it's useful to do this when *default-initializing* a global, but it's actually useful when doing *any* kind of dynamic initialization.
> > 
> > Maybe we can leave it as an error in C++ when the initializer is a constant expression.  Although that might be unnecessarily brittle if e.g. the constructor is `constexpr` in one library version but not another.
> No, that's exctly what I mean. You seem to be holding two contradictory ideas simultaneously:
> 
>     [[loader_uninitialized]] X x = X{};  // two initializers, therefore error
> 
>     [[loader_uninitialized]] X x {}; // one initializer plus one constructor, therefore fine
> 
> In C++, these two declarations have identical semantics. It doesn't make sense to say that one of them "calls a constructor" and the other one "has an initializer." They're literally the same thing.
> 
> Similarly in both C99 and C++ with plain old ints:
> 
>     [[loader_uninitialized]] int x = foo();
> 
> This means "call foo and dynamically initialize x with the result," just as surely as
> 
>     [[loader_uninitialized]] X x = X();
> 
> means "call X::X and dynamically initialize x with the result." Having one rule for dynamic initializers of primitive types and a separate rule for dynamic initializers of class types doesn't work.
> 
> Furthermore, "dynamic initialization" can be promoted to compile-time:
> 
>     [[loader_uninitialized]] int x = 42;
>     [[loader_uninitialized]] std::string_view x = "hello world";
> 
> It wouldn't make semantic sense to say that one of these has "two initializers" and the other has "one initializer," because both of the initializations end up happening at compile time and getting put into .data.
I'm under the impression that the constructs are:
```
X x = X{};  // default construct an X and then call the copy constructor at x
X x {}; // default construct an X at the location of x
X x; // default construct an X at the location of x
```

The C++ initialisation rules are very complicated so I won't be shocked to have got that wrong. 

If your toolchain actually runs global constructors then there isn't much use to marking C++ objects with this attribute, unless said constructor does nothing and would thus leave the bytes still undef.

Accepting default constructors (or, perhaps only accepting trivial types?) would allow people to use this attribute with the approximation of C++ available on systems that don't actually run the global constructors.

Equally, if this seems too complicated, I'm happy to restrict this to C as that's still an improvement on the asm and/or linker scripts I have available today.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74361/new/

https://reviews.llvm.org/D74361





More information about the cfe-commits mailing list