[PATCH] D80981: [AST] Fix a crash on accessing a class without definition in constexpr function context.

Fri Jun 19 05:21:17 PDT 2020

hokein added inline comments.

================
Comment at: clang/lib/AST/ExprConstant.cpp:4320
+    if (!RD->hasDefinition())
+      return APValue();
     APValue Struct(APValue::UninitStruct(), RD->getNumBases(),
----------------
rsmith wrote:
> hokein wrote:
> > rsmith wrote:
> > > hokein wrote:
> > > > sammccall wrote:
> > > > > This doesn't look all that safe - you're using a `None` value to indicate failure, but no current code path does that and none of the callers seem to check for failure.
> > > > > (e.g. `evaluateVarDecl` returns true instead of false).
> > > > > Presumably we're going to get a diagnostic somewhere (though it's not completely obvious to me) but can we be sure we won't assume this value has the right type somewhere down the line?
> > > > > 
> > > > > I get the feeling this is correct and I don't have enough context to understand why... how about you :-)
> > > > I don't have a promising explanation neither. 
> > > > 
> > > > I didn't find a better way to model failures in `getDefaultInitValue`. This function is used in multiple places of this file (and I'm not sure whether we should change it).
> > > > 
> > > > @rsmith any thoughts?
> > > `APValue()` is a valid representation for an object of class type, representing a class object that is outside its lifetime, so I think it's OK to use this representation, if we can be sure that this only happens along error paths. (It's not ideal, though.)
> > > 
> > > If we can't be sure this happens only along error paths, then we should produce a diagnostic here. Perhaps feed an `EvalInfo&` and a source location into every caller of this function and produce a diagnostic if we end up querying the default-initialized value of an incomplete-but-valid class type. Or perhaps we could check that the class is complete and valid from every caller of this function instead. (I think that we guarantee that, for a valid complete class type, all transitive subobjects are of valid complete types, so checking this only once at the top level before calling into `getDefaultInitValue` should be enough.)
> > Thanks for the suggestions.
> > 
> > oh, yeah, I missed that the `Foo` Decl is invalid, so checking the class decl is valid at every caller of `getDefaultInitValue` should work -- it would also fix other potential issues, looks like here we guarantee that the VarDecl is valid, but don't verify the decl which the VarDecl's type refers to is valid in all callers.  
> > 
> > Given the fact that the `VarDecl` e is valid and class `Foo` Decl is invalid, another option to fix the crash is to invalidate this `VarDecl`. Should we invalidate the VarDecl if the type of the VarDecl refers to an invalid decl? My gut feeling is that it is worth keeping the VarDecl valid, so that more related AST nodes will be built (for better recovery and diagnostics), though it seems unsafe. 
> I think keeping the `VarDecl` valid is probably the better choice, to allow us to build downstream uses of it. Also, because variables can be redeclared, we could have something like `struct A; extern A v; struct A { invalid; };` -- and we can't reasonably retroactively mark `v` as invalid in this case, so we can't guarantee that the type of every valid variable is itself valid. (We *could* guarantee that the type of every valid variable *definition* is valid, but that will lead to inconsistencies where defining the variable causes later behavior of references to the variable to change.)
> 
> It's really unfortunate that we don't have a good definition of what "valid" means for a variable, or really any listing of what invariants we maintain in the AST in the presence of invalid nodes. :( This is one of the things I would work on if I had time...
> I think keeping the VarDecl valid is probably the better choice, to allow us to build downstream uses of it. Also, because variables can be redeclared, we could have something like struct A; extern A v; struct A { invalid; }; -- and we can't reasonably retroactively mark v as invalid in this case, so we can't guarantee that the type of every valid variable is itself valid. (We *could* guarantee that the type of every valid variable *definition* is valid, but that will lead to inconsistencies where defining the variable causes later behavior of references to the variable to change.

yeah, that makes sense, thanks for the explanation.

I have updated the patch -- now the `getDefaultInitValue()` does error check. If fails, return `APValue()` which will only happen on error paths. Since it changes non-trivial amount of code, would be nice if you can take a look.

> It's really unfortunate that we don't have a good definition of what "valid" means for a variable, or really any listing of what invariants we maintain in the AST in the presence of invalid nodes. :( This is one of the things I would work on if I had time...

that would be nice to have, and given that we have containsErrors, the meanings of them are subtle (sometimes I got confused by these concepts). Would you like me to help here? happy to help though I don't fully understand clang yet.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80981/new/

https://reviews.llvm.org/D80981