[PATCH] D61722: [AST] Add RecoveryExpr; produce it on overload resolution failure and missing member.

Mon May 20 00:17:10 PDT 2019

rsmith added a comment.

In D61722#1504376 <https://reviews.llvm.org/D61722#1504376>, @sammccall wrote:

> In D61722#1503863 <https://reviews.llvm.org/D61722#1503863>, @rsmith wrote:
>
> > I expect we'll want a `ContainsErrors` bit on `Stmt`, propagated similarly to the existing `InstantiationDependent` / `ContainsUnexpandedParameterPack` / ... bits.
>
>
> This sounds good (as an alternative to using dependent bits for this).
>
> Do you think we want a similar bit on `Type` (set on e.g. `vector<ErrorTy>`) , or should be ensure `ErrorTy` never gets used for compound types?

The simplest thing to do might be to ensure that `vector<ErrorTy>` is itself `ErrorTy`. There might be some cases where we can get better recovery by keeping a more precise type (eg, maybe we can infer something from an operation on `ErrorTy*` that we couldn't infer from merely `ErrorTy`), but I suspect the added complexity isn't worthwhile.

>> For example, the constant evaluator will want to quickly bail out of evaluating expressions and statements containing errors, and the error recovery `TreeTransform` that we perform to fix typos will want that too (and maybe could be able to fix other kinds of errors for which we build these error nodes eventually?).
> 
> Definitely possible, I don't know whether it's worth it or not:
> 
> 1. errors that can be recovered eagerly: They are, and `RecoveryExpr` or equivalent never kicks in here. This seems ideal.
> 2. errors that can never be recovered: returned as `RecoveryExpr` or equivalent. This is a difference from current TypoExpr which never does this.
> 3. errors that must be recovered lazily: we could e.g. add TypoExpr-style recovery callbacks to `RecoveryExpr` and merge the two concepts. I don't have a clear idea of how large/important this category is compared to 1 though.

Well, typo correction would be in case 1: we can recover from typos eagerly, but we choose to defer them because we might be able to figure out a better recovery based on looking at context that we don't have yet. The same is probably true for a lot of our error recovery, but it's probably not worth it most of the time. So I think it's more of a question of balancing when it's worthwhile to save enough state to recover.

What kinds of cases are you thinking of that would be in your second category? The examples you give for `RecoveryExpr` (a call that can't be resolved and a member access that can't be resolved) are *already* in case 3 when the reason they can't be resolved is due to a typo within them. We model that today by treating them as dependent in the initial parse, but that's problematic and we will want to use whatever replacement mechanism we come up with here instead.

Looking at your current uses for `RecoveryExpr`, I'm not sure whether adding a new AST node is the right model:

> a function call where overload resolution failed (most typically: wrong args) Callee is captured as an UnresolvedLookupExpr, and args are captured. Type is available if the "best" candidates have the same return type.

I would assume that we would represent a function call for which the callee or an argument contains an error as a `CallExpr` with the "contains error" flag set and with `ErrorTy` as its type. Could we use the same representation here? (That is, represent this case as a `CallExpr` whose type is `ErrorTy` with the "contains error" flag set, with no `RecoveryExpr`.)

> access of a nonexistent member Base expression is captured, type is never available.

Using `UnresolvedMemberExpr` for this case might make sense. I think we should think about how we'd represent a member access that can't be fully analyzed because the object expression contains errors, and think about whether it makes sense to use that same representation for the case where the error is immediately in forming the member access itself. (And similarly across all other kinds of recovery expression we might create.)

If we want to enable (eg) AST matchers to operate on erroneous AST, using a homogeneous `RecoveryExpr` for all different syntaxes that originate errors seems like it would introduce complexity, especially if all consumers of the AST already need to deal with the case that some existing AST node is marked as being erroneous so don't get any benefit from having a concrete `RecoveryExpr` as a known stopping point. (If you get there, you've probably already gone too far.)

Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61722/new/

https://reviews.llvm.org/D61722