[PATCH] D87808: [DebugInfo] Fix bug in constructor homing where it would use ctor homing when a class only has copy/move constructors

David Blaikie via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Sep 22 10:14:20 PDT 2020


dblaikie added a comment.

In D87808#2286664 <https://reviews.llvm.org/D87808#2286664>, @rsmith wrote:

> In D87808#2280197 <https://reviews.llvm.org/D87808#2280197>, @dblaikie wrote:
>
>> @rsmith What's the deal with these anonymous structs/unions? Why do they have copy/move constructors (are those technically called from the enclosing class's copy/move constructors?) but no default constructor to be called from the other ctors of the enclosing class?
>
> Let's use the class `G` in the example. `G::G` directly initializes the field `g_`, so it can't call any kind of constructor for the anonymous struct member. Instead, anonymous structs and unions are "expanded" when building the constructor initializer lists for all cases other than a copy or move constructor (no initialization action is taken for union members by default, though -- not unless they have a default member initializer or an explicit mem-initializer). So the default constructor of an anonymous struct or union is never really used for anything[*].
>
> But now consider the copy or move constructor for a type like this:
>
>   struct X {
>     Y y;
>     union { ... };
>   };
>
> That copy or move constructor needs to build a member initializer list that (a) calls the `Y` copy/move constructor for `y`, and (b) performs a bytewise copy for the anonymous union. We can't express this in the "expanded" form, because we wouldn't know which union member to initialize.
>
> So for the non-copy/move case, we build `CXXCtorInitializers` for expanded members, and in the copy/move case, we build `CXXCtorInitializers` calling the copy/move constructors for the anonymous structs/unions themselves.

OK - I /mostly/ follow all that.
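
For my own reference, a minimal sketch of the two cases as I understand them. The names `Y`, `X`, `y_copies` and `demo_copy` are made up for illustration and are not taken from the patch or its tests:

```cpp
#include <cassert>

// Hypothetical counter so that copies of a `Y` are observable.
int y_copies = 0;

// Hypothetical member type with a user-provided copy constructor.
struct Y {
  Y() = default;
  Y(const Y &) { ++y_copies; }
};

struct X {
  Y y;
  union {
    int i;
    float f;
  };
  // Non-copy/move constructor: the anonymous union is "expanded", so the
  // mem-initializer names the union member `i` directly.
  X() : i(42) {}
  // The implicit copy constructor copies `y` via Y(const Y &) and copies the
  // anonymous union as a whole, without naming one of its members.
};

// Returns the value of `i` as seen through a copy of an X.
int demo_copy() {
  X a;
  X b = a; // uses the implicit copy constructor of X
  return b.i;
}
```

Here `X::X()` never needs a constructor for the anonymous union, while the implicit copy constructor has to copy the union as a unit because it can't know which member is active.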

> [*]: It's not entirely true that the default constructor is never used for anything. When we look up special members for the anonymous struct / union (looking for the copy or move constructor) we can trigger the implicit declaration of the default constructor. And it's actually even possible to *call* that default constructor, if you're tricksy enough: https://godbolt.org/z/Tq56bz

Good to know! Makes things more interesting. (any case where something could get constructed without calling the constructor is something this feature needs to be aware of/careful about)

> In D87808#2282223 <https://reviews.llvm.org/D87808#2282223>, @rnk wrote:
>
>> Maybe the issue is that this code is running into the lazy implicit special member declaration optimization. Maybe the class in question has an implicit, trivial, default constructor, but there is no CXXConstructorDecl present in the ctors list for the loop to find.
>
> That seems very likely. The existing check:
>
>   if (Ctor->isTrivial() && !Ctor->isCopyOrMoveConstructor())
>     return false;
>
> is skating on thin ice in this regard: a class with an implicit default constructor might or might not have had that default constructor implicitly declared.

Yeah, that's subtle & probably best not to rely on being able to gloss over that case.

> But I think this code should give the same outcome either way, because a class with any constructor other than a default/copy/move constructor must have a user-declared constructor of some kind, and so will never have an implicit default constructor.

Hmm, trying to parse this. So if we're in the subtle case of having an implicit default constructor (and no other (necessarily user-declared, as you say) constructors) - if it's not been declared, then this function will return false. If it has been declared, it'll return false... hmm, nope, then it'll return true.

It sounds like there's an assumption related to "a class with any constructor other than a default/copy/move" - a default constructor would be nice to use as a constructor home. (certainly a user-defined one, but even an implicit one - so long as it gets IRGen'd when called, and not skipped (as in the anonymous struct/union case) or otherwise frontend-optimized away)
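
To check my reading, here's a rough model of the patched loop using plain structs rather than the real Clang types. `CtorModel` and `canHomeOnCtor` are hypothetical names, and the final `return hasCtor;` is an assumption on my part, since the hunk quoted in this review ends before the function does:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the two CXXConstructorDecl queries the loop uses.
struct CtorModel {
  bool trivial;
  bool copyOrMove;
};

// Model of the patched loop over the constructors that have actually been
// declared. ASSUMPTION: the real function ends by returning hasCtor.
bool canHomeOnCtor(const std::vector<CtorModel> &declaredCtors) {
  bool hasCtor = false;
  for (const CtorModel &ctor : declaredCtors) {
    // A trivial non-copy/move constructor rules out ctor homing.
    if (ctor.trivial && !ctor.copyOrMove)
      return false;
    // Only non-copy/move constructors count as a potential home.
    if (!ctor.copyOrMove)
      hasCtor = true;
  }
  return hasCtor;
}
```

In this model an empty list (an implicit default constructor that hasn't been declared yet) gives `false`, whereas the same constructor, once declared and non-trivial, gives `true`. That's the declared/not-declared difference I was trying to parse above.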

> (Inherited constructors introduce some complications here, but we don't do lazy constructor declaration for classes with inherited constructors, so I think that's OK too.)

Ah, handy!



================
Comment at: clang/lib/CodeGen/CGDebugInfo.cpp:2292-2300
+  bool hasCtor = false;
+  for (const auto *Ctor : RD->ctors()) {
     if (Ctor->isTrivial() && !Ctor->isCopyOrMoveConstructor())
       return false;
+    if (!Ctor->isCopyOrMoveConstructor())
+      hasCtor = true;
+  }
----------------
rsmith wrote:
> This looks pretty similar to:
> 
> ```
> return RD->hasUserDeclaredConstructor() && !RD->hasTrivialDefaultConstructor();
> ```
> 
> (other than its treatment of user-declared copy or move constructors), but I have to admit I don't really understand what the "at least one constructor and no trivial or constexpr constructors" rule aims to achieve, so it's hard to know if this is the right interpretation. The rule as written in the comment above is presumably not exactly right -- all classes have at least one constructor, and we're not excluding classes with trivial copy or move constructors, only those with trivial default constructors.
> 
> I wonder if the intent would be better captured by "at least one non-inline constructor" (that is, assuming all declared functions are defined, there is at least one translation unit in which a constructor is defined that can be used as a home for the class info).
So the general goal is to detect any type where the construction of that type somewhere must invoke a constructor that will be IR-generated.

Move and copy constructors are ignored because the assumption is they must be moving/copying from some other object, which must've been constructed, ultimately, by a non-move/copy constructor.
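
A toy illustration of that assumption (all names here are hypothetical, nothing from the patch): every chain of copies/moves starts at an object built by some other constructor.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters, only to make the constructor calls observable.
int plain_ctor_calls = 0;
int copy_move_ctor_calls = 0;

struct T {
  explicit T(int) { ++plain_ctor_calls; }
  T(const T &) { ++copy_move_ctor_calls; }
  T(T &&) noexcept { ++copy_move_ctor_calls; }
};

// Builds one object with the non-copy/move constructor, then copies and
// moves from it.
void make_copies() {
  T original(1);
  T copy = original;         // copy constructor
  T moved = std::move(copy); // move constructor
  (void)moved;
}
```

However many copies and moves follow, there's still a call to `T(int)` at the root, and that call is what would act as the home.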

Ideally this would be usable even for inline ctors - even if the ctor calls get optimized away later[^1], they'd still allow us to reduce the number of places the type is emitted to only those places that call the ctor.



[^1] Actually, the way that should work doesn't seem to be working right now. For example:
type.cpp
```
struct t1 { t1() { } };
void f1(void*);
int main() {
  f1(new t1());
}
```
type2.cpp
```
struct t1 { t1() { } };
void f1(void* v) {
  delete (t1*)v;
}
```
build: `clang++ type.cpp -g -Xclang -fuse-ctor-homing type2.cpp && llvm-dwarfdump a.out`
-> definition of "t1" in the DWARF
build with optimizations: `clang++ -O3 type.cpp -g -Xclang -fuse-ctor-homing type2.cpp && llvm-dwarfdump a.out`
-> missing definition of "t1"
`type.cpp` is chosen as a home for `t1` because it calls a user-defined ctor, but then that ctor gets optimized away and there's no other mention of `t1` in `type.cpp`, so the type is dropped entirely. This could happen even with a non-inline definition: under LTO the ctor could get optimized away. (This would be safe in FullLTO, since the other references to `t1` would be made to refer to the definition and keep it alive, but in ThinLTO the TU defining the ctor might be imported and nothing else, leaving the type unused/dropped.)
To fix this we should put 'homed' types in the retained types list so they are preserved even if all other code/references to the type are dropped. I think I implemented this homed type pinning for explicit template specialization definitions, because they have no code attachment point, so similar logic could be used for ctor homing. (vtable homing /might/ benefit from this with aggressive/whole program devirtualization? Not sure. It's harder to actually optimize away all the references to a type, but maybe possible?)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87808/new/

https://reviews.llvm.org/D87808


