[PATCH] D62375: [ASTImporter] Mark erroneous nodes in from ctx

Mon Jun 24 02:54:58 PDT 2019

martong added a comment.

Alexei,

I think I still owe you some explanation about this patch.
I do consider this patch as one of the most intricate patches regarding ASTImporter.
I'd like to answer the following questions in this comment:
What is an ImportPath and why do we need to track it?
What does it mean if we have a cycle in the import path?
How do we use these cycles during the error handling?

An ImportPath is the list of the AST nodes which we visit during an Import call.
If node `A` depends on node `B` then the path contains an `A`->`B` edge.
>From the call stack of the import functions we can read the very same path.

Now imagine the following AST, where the `->` represents dependency in therms of the import.

  A->B->C->D
     `->E

We would like to import A.
The import behaves like a DFS, so we will visit the nodes in this order: ABCDE.
During the visitation we will have the following ImportPaths:

  A
  AB
  ABC
  ABCD
  ABC
  AB
  ABE
  AB
  A

If during the visit of E there is an error then we set an error for E, then as the call stack shrinks for B, then for A:

  A
  AB
  ABC
  ABCD
  ABC
  AB
  ABE // Error! Set an error to E
  AB  // Set an error to B
  A   // Set an error to A

However, during the import we could import C and D without any error and they are independent from A,B and E.
We must not set up an error for C and D.
So, at the end of the import we have an entry in `ImportDeclErrors` for A,B,E but not for C,D.

Now what happens if there is a cycle in the import path?
Let's consider this AST:

  A->B->C->A
     `->E

During the visitation we will have the below ImportPaths and if during the visit of E there is an error then we will set up an error for E,B,A.
But what's up with C?

  A
  AB
  ABC
  ABCA
  ABC
  AB
  ABE // Error! Set an error to E
  AB  // Set an error to B
  A   // Set an error to A

This time we know that both B and C are dependent on A.
This means we must set up an error for C too.
As the call stack reverses back we get to A and we must set up an error to all nodes which depend on A (this includes C).
But C is no longer on the import path, it just had been previously.
Such situation can happen only if during the visitation we had a cycle.
If we didn't have any cycle, then the normal way of passing an Error object through the call stack could handle the situation.
This is why we must track cycles during the import process for each visited declaration.

I hope this explanation makes it even cleaner than the tests and thanks again for the review!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62375/new/

https://reviews.llvm.org/D62375