[PATCH] D137909: [Support] Allow complex names for annotation points and ranges via ${}

Thu Nov 17 05:58:42 PST 2022

sammccall added a comment.

Thanks! sorry about continuing to drift this patch, but this looks like a great improvement.

Critical comments below are:

- dropping `Optional` and just use StringRef
- using one consistent way to to include payloads across the API. I think `pair<T, StringRef>` is actually best overall

Others are less important/optional if you want to get this wrapped up sooner :-)

================
Comment at: clang-tools-extra/clangd/unittests/tweaks/TweakTesting.h:119
     for (const auto &P : A.points())                                           \
-      EXPECT_EQ(Available, isAvailable(AST, {P, P})) << decorate(A.code(), P); \
+      EXPECT_EQ(Available, isAvailable(AST, {P, P, llvm::None}))               \
+          << decorate(A.code(), P);                                            \
----------------
not being able to ignore payloads (e.g. not assert the presence/absence) seems like a regression to me, this points further towards separating ranges() and rangesWithPayload

================
Comment at: llvm/include/llvm/Testing/Support/Annotations.h:58
   /// represents a half-open range.
   struct Range {
     size_t Begin = 0;
----------------
this patch takes too many different approaches to attaching the payloads to the points:
 - adding it to an existing struct that is already returned
 - defining a new wrapping struct for the andPayload version
 - defining a new inheriting struct for the andPayload version

I think we should keep it simple, and use `pair<T, StringRef>` for all the `andPayload` versions. This isn't optimal in isolation, but it keeps things coherent, avoiding:
 - confusion around clever type-system features (inheritance, implicit conversions, smart-pointer like types, templates)
 - introducing new structs whose names we have to remember
 - inconsistency between how payloads are accessed between ranges/points/clangd/llvm versions

In many cases structured bindings make using the `pair` versions less cumbersome than they would have been in the past.

================
Comment at: llvm/include/llvm/Testing/Support/Annotations.h:108
+  const llvm::StringMap<llvm::SmallVector<Point, 1>> &
+  all_points_and_payloads() const;

----------------
I hadn't seen these context on these `all_` functions, I reached out to the authors to find out about them (they're used out-of-tree).

Basically the use case is to use llvm::Annotations to do the parsing, but dump out the data into some external test helper which lookups/assertions will run on.
So instead of adding extra variants, we just need to expose the parse results in some convenient way.

================
Comment at: llvm/include/llvm/Testing/Support/Annotations.h:123
   std::string Code;
-  llvm::StringMap<llvm::SmallVector<size_t, 1>> Points;
+  llvm::StringMap<llvm::SmallVector<Point, 1>> Points;
   llvm::StringMap<llvm::SmallVector<Range, 1>> Ranges;
----------------
Internally I'd consider switching to a uniform representation of points and ranges. For two reasons:
 - we're adding a new feature that's common between them
 - all_points/all_ranges want to expose such a representation

Something like:
```
struct Annotation {
  size_t Start;
  size_t End; // or size_t(-1) for a point
  StringRef Name;
  StringRef Payload;
}
vector<Annotation> All;
StringMap<SmallVector<unsigned>> Points; // values are indices into All
StringMap<SmallVector<unsigned>> Ranges; // values are indices into All

ArrayRef<Annotation> all() { return All; } // replaces all_points and all_ranges
```

up to you though

================
Comment at: llvm/include/llvm/Testing/Support/Annotations.h:61
     size_t End = 0;
+    llvm::Optional<llvm::StringRef> Payload;

----------------
tom-anders wrote:
> One could argue that this should just be `llvm::StringRef`. Right now, `$name^` results in `llvm::None` for the payload, while `$name()^` will yield an empty string.
> 
> Do you think they should just both be empty strings instead?
Yes, I don't think that distinguishing between "no payload" and "empty payload" is useful, so would prefer the simpler API.

================
Comment at: llvm/lib/Testing/Support/Annotations.cpp:30
   };
   llvm::Optional<llvm::StringRef> Name;
+  llvm::Optional<llvm::StringRef> Payload;
----------------
(If using the `Annotation` struct I mentioned before, an instance of it could cover Name+Payload and it could also be the elements of OpenRanges)

================
Comment at: llvm/unittests/Support/AnnotationsTest.cpp:120
+TEST(AnnotationsTest, Payload) {
+  // A single unnamed point or range with empty payload
+  EXPECT_EQ(llvm::Annotations("a$^b").pointWithPayload(),
----------------
this is no payload specified, we should also have one test where it's specified but empty e.g. `$()^` (even if we make this equivalent to unspecified)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137909/new/

https://reviews.llvm.org/D137909