[PATCH] D45753: Lift JSON library from clang-tools-extra/clangd to llvm/Support.

Sam McCall via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 15 09:26:00 PDT 2018


sammccall added a comment.

(Looks like Phabricator swallowed most of my email reply, please excuse some repetition)

In https://reviews.llvm.org/D45753#1094739, @chandlerc wrote:

> Sorry for delays circling back to this.
>
> I think the primary concerns about putting this into the Support library is that we already have one API that is quite similar there: YAMLIO. However, that API is clearly not serving all of the needs of users given that there are mutliple JSON-parsing code paths in the wider project that are already using other libraries (this one, but also Polly). So I think adding this makes lots of sense at this point.
>
> However, I'd like to take this opportunity to try to iterate a bit on the API since it is going into a new, more widely visible home and will almost certainly grow new users as soon as it lands. I'm fairly uncomfortable with the inheritance approach, and I think some of the additional APIs would benefit from (hopefully minor) adjustment. None of this is intended to change the fundamental design in any way though.


That sounds good. Any pushback below is to clarify the reasons for decisions, not to resist changes - happy to change whatever you don't find convincing.



================
Comment at: include/llvm/Support/JSON.h:62
+/// The string keys may be owned or references.
+class Object : public std::map<ObjectKey, Value> {
+public:
----------------
chandlerc wrote:
> I understand the pragmatic reason for this, but I am pretty uncomfortable deriving from standard types like this. I'm worried it will hurt portability due to subtle variations in standard libraries, or subtle changes in standard libraries as we switch to C++N+1 or whatever.
> 
> I would be much more comfortable using something internal and owning the API you export.
> 
> Unrelatedly (and not blocking anything), I struggle to believe that std::map is the correct tool for the job... Is it just that DenseMap is a pain to use currently? Is it that these are typically small and not worth a DenseMap? I'm sorry to ask this as I suspect you've already explained this in the original review, but I must admit I'm curious.
Are we worried about inheriting the interface of *llvm* types, or just `std` types?

I've switched to inheriting from `DenseMap` here, but I can wrap it if you prefer.
(`std::map`'s ordered-ness made `operator==` and `print()` simple, so there's a few more lines now).

`Array` is more complicated: even `SmallVector<Value, 0>` needs `Value`, which sets up a cycle that's hard to break. So I've resorted to wrapping std::vector and exposing a hopefully-sane subset of the API.
(Aside: I suspect we *do* want to track e.g. the C++17 changes to `vector`, but that's a burden that shouldn't fall on whoever does the upgrade)


================
Comment at: include/llvm/Support/JSON.h:69-71
+  // Allow [] as if Value was default-constructible as null.
+  Value &operator[](const ObjectKey &K);
+  Value &operator[](ObjectKey &&K);
----------------
chandlerc wrote:
> This does more than what it says. It hides std::map's operator[] overloads, no matter what they are. Is that what you intended?
> 
> This is perhaps a good example of why I find inheritance challenging here...
There are already no such overloads available, as Value is not default-constructible. I reworded the comment a bit.

Agree that inheritance makes this a little subtle to reason about (though obviously if wrapping, this particular operator would look just the same).


================
Comment at: include/llvm/Support/JSON.h:100
+
+  // Typed accessors return None/nullptr if the element has the wrong type.
+  llvm::Optional<std::nullptr_t> getNull(size_t I) const;
----------------
chandlerc wrote:
> Do you expect users to primarily use these typed accessors? Or the underlying vector?
> 
> Should they take iterators instead of indices?
> 
> These seem to make the API somewhat hostile to range based for loops. I'm surprised this isn't more a facility provided by the iterator...
> 
> I find the name somewhat confusing too -- see my comment below for some clarity of why.
> 
> But what's the advantage here? Why not just `arr[i].asNull()`?
I expect the most common access patterns to be:
1. mapping to a `vector<Something>` using `fromJSON`
2. iterating over `Value`s using ranged-for (followed by `asFoo()` on each element)
3. these `getFoo(I)` accessors

They're not nearly as useful/important as the typed accessors on Object. The benefit is:
- symmetry with Object makes the API easier to follow (thus `size_t` rather than `iterator`)
- convenient when particular indices have semantics (like `executeCommand` parameter list in LSP), rather than being a homogeneous collection
- more readable than `operator[]` when calling on a pointer, which is common due to the `if (Array* A = V.asArray())` idiom - `(*A)[i].asNull()` vs `A->getNull(i)`.

> These seem to make the API somewhat hostile to range based for loops. I'm surprised this isn't more a facility provided by the iterator...

Can you elaborate on what kind of loop you'd like to write? I tend to find clever iterator APIs fairly  undiscoverable/obscure, but I'm not sure what I'm missing.


================
Comment at: include/llvm/Support/JSON.h:266
+
+  // Typed accessors return None/nullptr if the Value is not of this type.
+  llvm::Optional<std::nullptr_t> asNull() const {
----------------
chandlerc wrote:
> A more conventional name would be `getAsFoo` matching `getAs<T>`.
> 
> Also, is it really worth having all of these? `getAs<bool>` and `getAs<double>` seem just as nice as the non-templated versions to me... But I guess `getAsString` and `getAsInteger` do interesting validation and such.
> A more conventional name would be getAsFoo matching getAs<T>.
Sure, in Java ;-)
Adding "get" because a function must have a verb seems like cargo culting to me - we're trying to signal the effects, but there aren't any! Maybe it's my exposure, but I don't see any ambiguity as to what `asBoolean()` might do.

I'd like to propose a style guide change to be more like https://swift.org/documentation/api-design-guidelines/#strive-for-fluent-usage here. Meanwhile, I can change this to match the style guide as it stands if you think this is important. I do think it hurts the signal/noise.

> Also, is it really worth having all of these? getAs<bool> and getAs<double> seem just as nice as the non-templated versions to me... 

I do prefer the non-templated version:
 - `getAs<bool>` etc isn't fewer things to understand, and it's not shorter, it's just fewer distinct tokens. I'm not sure what we're trying to conserve here.
 - the non-templated functions are more discoverable: easier to read in the header, and work better with code completion, and easier to search for
 - `getAsNumber` is a better name than `getAs<double>`, as the relevant concept here is JSON's "Number" not the types used to represent them in C++. (this is "interesting validation and such" I think)
 - I don't want people to think they can write `getAs<int>()`, and if they do I want the error message to be easy to understand. Templates make both of these harder.



Repository:
  rL LLVM

https://reviews.llvm.org/D45753





More information about the llvm-commits mailing list