[PATCH] D90161: [SyntaxTree] Provide iterators for Lists

Tue Oct 27 09:47:27 PDT 2020

sammccall added inline comments.

================
Comment at: clang/lib/Tooling/Syntax/Tree.cpp:352
+  default:
+    llvm_unreachable("List children can only be elements or delimiters.");
   }
----------------
eduucaldas wrote:
> sammccall wrote:
> > Let's avoid deliberately crashing on invalid code.
> > 
> > I'd suggest either treating over non-element-non-delimiters as if they don't exist, or treating them as elements by default.
> We thought about invalid code.
> This `llvm_unreachable` states an invariant on List, that any children can only have one of the two roles.
> 
> This would imply that a tree builder - which for invalid code would be the pseudo parser -  would assign `ListDelimiter` for everything that matches the expected delimiter, and try to parse other things into the expected `ElementType`.
> 
> Let's examine an example. In:
> `f(1, +, 2)`
> We have the `List` `CallArguments` for which `ElementType` = `Expression`.
> We have some options:
> - `+` is a Leaf with role `Unknown`.(which breaks the current invariant)
> - `+` is part of an `Expression`, more specifically a `BinaryOperatorExpression` with `lhs` and `rhs` missing.
> 
> At the same time we could think of another example:
> `f(1, int, 2)`
> What would we do in this case?
> 
> In conclusion, I think this invariant is a bit constraining. We can consider relaxing it to allow `Unknown` roles. I think in general it would be good for us to have a common understanding of the plans for the pseudo parser.
> We thought about invalid code.
> This `llvm_unreachable` states an invariant on List, that any children can only have one of the two roles.

I think this is a bigger topic we should separate from this code review (though it ties into the template question). Maybe we can meet later in the week?

There's a tension between the generic and typed interfaces when it comes to modification: the typed interfaces benefit from and can enforce strong semantic invariants, where generic interfaces cannot.
It looks like the current approach is "invariants are strong, generic mutation APIs don't exist".
I'm not sure this is the right long-term direction:
 - many candidate invariants are incompatible with rich representation of broken code, and pseudoparsing, it's going to make the model very hard to understand
 - it requires all refactorings to be built on "blessed" in-tree primitives that preserve invariants. This does not scale well, and does not give out-of-tree users the flexibility they need to solve their problems - many are stuck until the next LLVM release.
 - generic/uniform access is empirically useful for some problems, and making access read-only is a large restriction on that option. (Examples: pseudoparsing, de/serialization)
 - breaking invariants is sometimes the right thing to do. Consider a user dragging code around in an IDE - we'd adjust the selection (e.g. snap to complete nodes) before moving it, but we can't just reject the edit due to syntax.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90161/new/

https://reviews.llvm.org/D90161