[PATCH] D81868: [libTooling] Add parser for string representation of `RangeSelector`.

Thu Jun 18 10:19:18 PDT 2020

gribozavr2 accepted this revision.
gribozavr2 added inline comments.
This revision is now accepted and ready to land.

================
Comment at: clang/lib/Tooling/Transformer/Parsing.cpp:169
+
+// For consistency with matcher parser and C++ code, node id's are written as
+// strings. However, we do not accept support escaping in the string.
----------------
s/matcher/AST matcher/

================
Comment at: clang/lib/Tooling/Transformer/Parsing.cpp:170
+// For consistency with matcher parser and C++ code, node id's are written as
+// strings. However, we do not accept support escaping in the string.
+static ExpectedProgress<std::string> parseStringId(ParseState State) {
----------------
s/accept//

================
Comment at: clang/lib/Tooling/Transformer/Parsing.cpp:29
+
+namespace {
+using llvm::Error;
----------------
ymandel wrote:
> gribozavr2 wrote:
> > I'm a bit concerned about the abundance of parsers in Clang tooling: we already have a similar language for dynamic AST matchers. I'm concerned about both code duplication as such, and that there are multiple ways to do something.
> > 
> > Code duplication makes it more difficult to carry out horizontal efforts like improving error reporting.
> > 
> > Multiple ways to do something makes codebase knowledge less reusable. It might also create language discrepancies that users might notice (for example, I don't remember if `bind(id)` works in dynamic AST matchers or not; we would be better off if range selectors were consistent with that).
> > 
> > I don't think this particular change decisively tips the scale towards refactoring the parser for dynamic AST matchers to be reusable here; however it is an option worth considering. I think we should be thinking about the total cost of ownership of this code.
> > 
> > Some future use cases will also need an embeddable language (like AST matchers for the syntax tree, or parsing stencils from strings).
> Agreed on these points. We'd like to move ahead with this patch, but have made the following changes:
> 1. Added a FIXME in the implementation file indicating intent to merge with AST matcher parser code
> 2. Modified the accepted language for consistency with AST matchers (and C++ code). Node ids must be surrounded by quotes.
> 
> One possible complication is that the stencil format-string parser (forthcoming) will not easily be merged into a lexer-based parser, given that it allows free text for the format string (only the special escape sequences have specified structure). So, we will need to find a way to operate both scannerless and not if we want a unified parser infrastructure. However, we can solve that problem when we come to it.
> We'd like to move ahead with this patch, but have made the following changes

SGTM.

> the stencil format-string parser (forthcoming) will not easily be merged into a lexer-based parser

I'd have to take a closer look to provide a meaningful response, but postponing this discussion to the patch that will be adding this code SGTM.

================
Comment at: clang/unittests/Tooling/RangeSelectorTest.cpp:271
+  ASSERT_THAT_EXPECTED(R, llvm::Succeeded());
+  EXPECT_THAT_EXPECTED(select(*R, Match), HasValue("3;"));
 }
----------------
ymandel wrote:
> gribozavr2 wrote:
> > I wonder if we could figure out a more direct testing strategy for a parser (that does not necessarily involve using the parsed objects) if we had more advanced parsing infrastructure.
> Indeed. That would certainly be preferable. `Stencil`s support the `toString` operator for just such testing support.  A generic parse tree would also allow direct testing, but then you still need to find a way to test the conversion from parse-tree to stencil. So, ultimately, you want to have some reflection on the type itself.
> 
> I propose that we move `RangeSelector` to implement `MatchComputation` rather than `MatchConsumer` to support `toString` and better tests. WDYT?
> I propose that we move RangeSelector to implement MatchComputation rather than MatchConsumer to support toString and better tests. WDYT?

+1, makes sense. Initially they were separate abstractions because we thought we didn't need the complexity everywhere, but turns out we do.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81868/new/

https://reviews.llvm.org/D81868