[PATCH] D86207: (De-)serialize BindingDecls before DecompositionDecl

Aaron Puchert via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 2 17:01:39 PDT 2020


aaronpuchert added a comment.

I'll go with what @rsmith proposed to fix the bug. If the entire deserialization process doesn't rely on invariants, the order should indeed be irrelevant.

In D86207#2252557 <https://reviews.llvm.org/D86207#2252557>, @riccibruno wrote:

> I agree, but I think that the `BindingDecl` should still be considered to be under construction until the corresponding `DecompositionDecl` is constructed. Any use of the `BindingDecl` in this interval has to be mindful that the `BindingDecl` is not fully constructed.

That's probably advisable, they have a bit of a cyclic relationship.

> Is it possible to see a deserialized `BindingDecl` when  the corresponding `DecompositionDecl` has not been deserialized (which will as you say set `BindingDecl::Decomp`)? I have not been able to construct an example where this is the case but I am not at all confident that this is impossible.

Maybe it could be done with global variables: have an invalid global decomposition in the PCH, so the `BindingDecl`s have no `Binding` expression, then reference one of those bindings in the source file. The expression should be empty, so `readExpr` would end up deserializing the `DecompositionDecl`.

But I don't see how it could work with local variables, because we're either reading the entire function or no part of it, right?

> (`-ast-dump-all` can be added to force deserialization)

Thanks, that does it.



================
Comment at: clang/lib/Serialization/ASTReaderDecl.cpp:585
   D->setLocation(ThisDeclLoc);
   D->setInvalidDecl(Record.readInt());
   if (Record.readInt()) { // hasAttrs
----------------
rsmith wrote:
> The bug is here: we should not be calling `Decl::setInvalidDecl` here because it has invariants, and the `Decl` is incomplete at this point.
Didn't notice that we're a friend and can access private members—but it makes sense. That change fixes the bug as well.


================
Comment at: clang/lib/Serialization/ASTReaderDecl.cpp:1500
   for (unsigned I = 0; I != DD->NumBindings; ++I) {
     BDs[I] = readDeclAs<BindingDecl>();
     BDs[I]->setDecomposedDecl(DD);
----------------
rsmith wrote:
> `BindingDecl` might not be fully initialized here: if we enter deserialization to deserialize a `BindingDecl`, and then recurse into reading its binding expression, and then deserialize the `DecompositionDecl`, we can land here before we finish with the `BindingDecl`. If we called something that looked at the `Binding` expression on the `BindingDecl`, that'd be a problem.
> 
> But the general idea is that deserialization should never invoke AST functions with invariants (and generally should set state directly rather than using AST member functions in order to avoid accidentally calling a function with an invariant). So it shouldn't matter whether we deserialize the `DecompositionDecl` or the `BindingDecl`s first.
Right, we could end up reading a binding that is already being read higher up in the stack. It would have been visited already, but we wouldn't have the expression yet.


================
Comment at: clang/lib/Serialization/ASTReaderDecl.cpp:1509
   BD->Binding = Record.readExpr();
 }
 
----------------
rsmith wrote:
> Hm, presumably `BindingDecl::Decomp` gets set here because `readExpr` always reads an expression that involves the `DecompositionDecl`, which calls `setDecomposedDecl`? That seems ... very subtle. If that's the intended way for this to work, we should at least add a comment for that (or better, an assert that `Decomp` got set), and `BindingDecl::Decomp` should be a regular pointer not a `LazyDeclPtr`. (But even then, is this chain of reasoning guaranteed to hold even for invalid declarations? Maybe we should be serializing the `DecompositionDecl*` and setting `Decomp` here?)
Or the `BindingDecl` is read via the `DecompositionDecl` in the first place, but otherwise I think that's right. An assertion works neither before nor after this change: if the `DecompositionDecl` is read first, or contains more than one binding, it's created before some `BindingDecl`. Then `readExpr` will presumably just build a `DeclRefExpr` to the still unfinished decomposition that we're building higher in the stack, so we can't assume the back reference to be set.

For invalid declarations, I think we should still have a correspondence between decomposition and bindings, but the expression is obviously `nullptr` in some cases. So maybe if we only load a `BindingDecl` (would have to be a global variable then, right?) and not the decomposition, `Decomp` might stay uninitialized, but I'm not sure where to check that.

If we want to deserialize the `DecompositionDecl` here, we'd probably need to store declarations both ways? I think we'd still want to find all bindings for a decomposition that we're reading.

Regarding the `LazyDeclPtr` I think you're right—I don't see us initializing this with an offset anywhere. So using a plain pointer should be trivially NFC.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86207/new/

https://reviews.llvm.org/D86207



More information about the cfe-commits mailing list