[PATCH] D61722: [AST] Add RecoveryExpr; produce it on overload resolution failure and missing member.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu May 16 02:52:56 PDT 2019


sammccall marked 3 inline comments as done.
sammccall added a comment.

Thanks for the useful insights!

I'm going to work on a version of this patch that doesn't rely on type dependency and treats errors as a new concept instead.
I'm not convinced it's feasible to drop RecoveryExpr and reuse existing nodes, so I'll keep that at least for now. (I tried and failed, and don't have new ideas there - more detail below)

In D61722#1503863 <https://reviews.llvm.org/D61722#1503863>, @rsmith wrote:

> I expect we'll want a `ContainsErrors` bit on `Stmt`, propagated similarly to the existing `InstantiationDependent` / `ContainsUnexpandedParameterPack` / ... bits.


This sounds good (as an alternative to using dependent bits for this).

Do you think we want a similar bit on `Type` (set on e.g. `vector<ErrorTy>`) , or should be ensure `ErrorTy` never gets used for compound types?

> For example, the constant evaluator will want to quickly bail out of evaluating expressions and statements containing errors, and the error recovery `TreeTransform` that we perform to fix typos will want that too (and maybe could be able to fix other kinds of errors for which we build these error nodes eventually?).

Definitely possible, I don't know whether it's worth it or not:

1. errors that can be recovered eagerly: They are, and `RecoveryExpr` or equivalent never kicks in here. This seems ideal.
2. errors that can never be recovered: returned as `RecoveryExpr` or equivalent. This is a difference from current TypoExpr which never does this.
3. errors that must be recovered lazily: we could e.g. add TypoExpr-style recovery callbacks to `RecoveryExpr` and merge the two concepts. I don't have a clear idea of how large/important this category is compared to 1 though.



================
Comment at: include/clang/AST/BuiltinTypes.def:265
+// a template.
+BUILTIN_TYPE(Recovery, RecoveryTy)
+
----------------
rsmith wrote:
> erik.pilkington wrote:
> > Why are you creating a new type as opposed to just using DependentTy (sorta like TypoExpr does)? It seems like if you want to recycle all the dependence-propagating code in Sema, then you need to fall back to DependentTy anyways, i.e. `1 + <recovery-expr>` will have dependent type with this patch, right?
> > 
> > 
> Using `DependentTy` for `TypoExpr` is a mistake; it leads to all sorts of bad follow-on diagnostics incorrectly claiming that constructs have dependent types.
> 
> Rather than calling this `RecoveryTy`, I'd prefer to call it something a bit more general like `ErrorTy`, with the intent being that we eventually use it for `TypoExpr` too.
> Why are you creating a new type as opposed to just using DependentTy (sorta like TypoExpr does)?

Indeed it's probably redundant in this form of the patch. While I was still trying to get this to work in C mode, I was checking for RecoveryTy instead of isDependent() in various places. (Maybe that approach would always miss cases, as you suggest)

(Obviously if we *don't* reuse the concept of dependent types, then some new type/type concepts are needed.)


================
Comment at: include/clang/AST/Expr.h:5801-5809
+/// RecoveryExpr - a broken expression that we couldn't construct an AST for,
+/// e.g. because overload resolution failed.
+/// We capture:
+///  - the kind of expression that would have been produced
+///  - the valid subexpressions
+///  - the type, if known. If unknown, it is the built-in RecoveryTy, which
+///    is dependent (and so generally suppresses further diagnostics etc).
----------------
rsmith wrote:
> Do we need this at all, if we have a properly-propagated `ErrorTy` anyway? Instead of using this, it would seem that we could model an ill-formed expression as the corresponding AST node but with its type set to `ErrorTy`. Presumably the latter is what  we'll do when we find a subexpression containing an error, rather than creating a `RecoveryExpr` at every enclosing syntactic level, so AST clients will need to deal with that anyway.
> properly-propagated `ErrorTy`

An important part here (especially for code completion, but not only) is that there are RecoveryExprs where the real type is known. So `ErrorTy` doesn't mean "subexpression has errors".

The motivating case is:
```
string foo(param1, param2, param3, param4);
foo(param1, param2, param3);
```

Here `foo` has type `string`, not `ErrorTy`.

---

But we can certainly add a bit to Stmt for "subexpression has errors", so the question stands:

> as the corresponding AST node but [with the HasError bit set]. Presumably the latter is what we'll do when we find a subexpression containing an error, rather than creating a RecoveryExpr at every enclosing syntactic level, so AST clients will need to deal with that anyway.

Yes. I did try this and we debated it a lot. Somehow I forgot to mention this central design question in the patch description :-(

It certainly preserves more information/structure, and allows existing code to work with broken nodes.
However:
 - it breaks existing invariants, so a large fraction of consuming code needs to be modified but it's not obvious how. By comparison, most consumers are robust to adding a node type (or it'll break the compile in an obvious way with -Wswitch etc).
 - every time we want to add more recovery, we break new invariants. Whereas code that handles RecoveryExpr/ErrorTy can be pretty generic.
 - code tends to be partitioned by node type, so making errors a separate node type (mostly) isolates the complexity of error handling. Adding error conditions to many node types means this complexity is spread throughout consuming code.
 - it's particularly important that error-recovery code be "correct by construction" because test coverage of all error-recovery situations is very hard. Using the type system helps here.

In practice with this approach I wasn't able to fix the crashes either in clang or client code in any principled way, and I didn't have any confidence that the code was going to be correct even if I got the tests to pass.


================
Comment at: include/clang/AST/Type.h:2423-2424
+      : Type(Builtin, QualType(),
+             /*Dependent=*/(K == Dependent || K == Recovery),
+             /*InstantiationDependent=*/(K == Dependent || K == Recovery),
              /*VariablyModified=*/false,
----------------
rsmith wrote:
> Experience with `TypoExpr` suggests that treating non-dependent constructs as dependent is a mistake in the long term.
> Experience with TypoExpr suggests that treating non-dependent constructs as dependent is a mistake in the long term.

It's certainly a tradeoff. Reusing dependent types isn't quite accurate, but adding a separate concept complicates the model a lot I think. Could be that
 - TypoExpr should have used a separate concept
 - dependent types were the best option, the alternatives would have been even worse
 - TypoExpr never should have happened, as both alternatives are too bad

Certainly this is a chance to try the other approach.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61722/new/

https://reviews.llvm.org/D61722





More information about the cfe-commits mailing list