[PATCH] D45753: Lift JSON library from clang-tools-extra/clangd to llvm/Support.

Thu May 10 13:32:20 PDT 2018

Thanks for the thoughts, I'll take another pass and try to incorporate
them. But I'm out until Monday, so some answers/questions while this is in
your cache...

On Thu, May 10, 2018, 19:47 Chandler Carruth via Phabricator <
reviews at reviews.llvm.org> wrote:

> chandlerc added a comment.
>
> Sorry for delays circling back to this.
>
> I think the primary concerns about putting this into the Support library
> is that we already have one API that is quite similar there: YAMLIO.
> However, that API is clearly not serving all of the needs of users given
> that there are mutliple JSON-parsing code paths in the wider project that
> are already using other libraries (this one, but also Polly). So I think
> adding this makes lots of sense at this point.
>
> However, I'd like to take this opportunity to try to iterate a bit on the
> API since it is going into a new, more widely visible home and will almost
> certainly grow new users as soon as it lands. I'm fairly uncomfortable with
> the inheritance approach, and I think some of the additional APIs would
> benefit from (hopefully minor) adjustment. None of this is intended to
> change the fundamental design in any way though.
>
Thanks, this definitely makes sense.
Comments below are intended to explain the original motivation in case it
convinces you, not actually to resist changing things :-)

> ================
> Comment at: include/llvm/Support/JSON.h:62
> +/// The string keys may be owned or references.
> +class Object : public std::map<ObjectKey, Value> {
> +public:
> ----------------
> I understand the pragmatic reason for this, but I am pretty uncomfortable
> deriving from standard types like this. I'm worried it will hurt
> portability due to subtle variations in standard libraries, or subtle
> changes in standard libraries as we switch to C++N+1 or whatever.
>
> I would be much more comfortable using something internal and owning the
> API you export.
>
Understood. I'm concerned about trying to sensibly subset the (huge)
container APIs and about losing the explicit "is-a" which
documents/enforces intent here. But future proofing is a concern.

Is the problem more API-by-inheritance or std specifically? e.g. WDYT about
inheriting from DenseMap<ObjectKey,Value> and SmallVector<Value, 0>?

>
> Unrelatedly (and not blocking anything), I struggle to believe that
> std::map is the correct tool for the job... Is it just that DenseMap is a
> pain to use currently? Is it that these are typically small and not worth a
> DenseMap? I'm sorry to ask this as I suspect you've already explained this
> in the original review, but I must admit I'm curious.
>
We wanted a *sorted* map, that's usually but not always small. I *think*
the only reason for sorted was to make serialization simpler, and we
certainly never profiled this. I'm unwilling to give up sorted output (huge
quality-of-life feature!) but it's only a few extra lines of code to sort
the DenseMap entries when writing, so I'll try that.

>
> ================
> Comment at: include/llvm/Support/JSON.h:69-71
> +  // Allow [] as if Value was default-constructible as null.
> +  Value &operator[](const ObjectKey &K);
> +  Value &operator[](ObjectKey &&K);
> ----------------
> This does more than what it says. It hides std::map's operator[]
> overloads, no matter what they are. Is that what you intended?

Yes - there aren't any, because Value isn't default-constructible and all
overloads require that as of C++17. C++20 could define some, but it's hard
to imagine what it could mean.

But understood this is subtle, and we may prefer to spend more code to
avoid subtleties that come with inheritance.

This is perhaps a good example of why I find inheritance challenging here...
>
>
> ================
> Comment at: include/llvm/Support/JSON.h:100
> +
> +  // Typed accessors return None/nullptr if the element has the wrong
> type.
> +  llvm::Optional<std::nullptr_t> getNull(size_t I) const;
> ----------------
> Do you expect users to primarily use these typed accessors? Or the
> underlying vector?

I expect the most common way to read arrays to be the fromJSON overload,
followed by a range based for loop over the contained Values.

These accessors are for when the value at index K has some known semantics.
This is common for Object and rare for Array (but not unknown, e.g. LSP
custom command arg list).

So these are mostly here for symmetry with Object::get*(), which are the
primary way to access properties and important for usability (they replace
three statements: find(); !=end(); as*()).

I can drop them. I think the symmetry makes the API a bit easier to
understand, but it's not terribly important.

Should they take iterators instead of indices?
>
This would break symmetry, and personally vector iterators usually feel
like needless obfuscation (I'm always converting them back to indices for
debugging)

These seem to make the API somewhat hostile to range based for loops. I'm
> surprised this isn't more a facility provided by the iterator...
>
Can you explain what loop you'd like to write?
Generally I'd like to avoid relying on iterators for exposing things other
than range based for loops, as the APIs tend to be undiscoverable and hard
to read.

> I find the name somewhat confusing too -- see my comment below for some
> clarity of why.
>
> But what's the advantage here? Why not just `arr[i].asNull()`?
>
The other advantage not yet mentioned is calling operator[] on pointers is
ugly:

if (Array* A = O.getArray("params"))
  if (auto D = A->getNumber(0))
    ...

vs

...
  if (auto D = (*A)[0].asNumber())
    ...

>
> ================
> Comment at: include/llvm/Support/JSON.h:266
> +
> +  // Typed accessors return None/nullptr if the Value is not of this type.
> +  llvm::Optional<std::nullptr_t> asNull() const {
> ----------------
> A more conventional name would be `getAsFoo` matching `getAs<T>`.
>
FWIW this is my least favorite rule in the LLVM style guide and I should
get around to proposing a change. For no-side-effect functions that return
a value, "get" is IMO just noise, and the requirement for a verb is
perverse when there is no action!

But as I haven't proposed such a change yet, I'll sadly fix my style here
:-)

Also, is it really worth having all of these? `getAs<bool>` and
> `getAs<double>` seem just as nice as the non-templated versions to me...
> But I guess `getAsString` and `getAsInteger` do interesting validation and
> such.
>
So I'm not sure what getAs<T> you're comparing to - I don't think I wrote
one. But I prefer separate functions:
 - a list of template specializations is harder to discover (by reading
headers, let alone code completion) than a list of functions.
 - I don't want to advertise or implement that the set of return types is
open or can be inferred - C++ has enough surprising conversions already!
 - getAsInteger returning int64_t captures intent better than
getAs<int64_t>, because the precision is chosen by the library
 - the names match the Value::Kind enum and the official JS names.

>
> Repository:
>   rL LLVM
>
> https://reviews.llvm.org/D45753
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180510/3aa3b709/attachment.html>