[PATCH] D45753: Lift JSON library from clang-tools-extra/clangd to llvm/Support.

Sam McCall via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 20 10:06:38 PDT 2018


sammccall marked 4 inline comments as done.
sammccall added a comment.
Herald added a subscriber: jkorous.

In https://reviews.llvm.org/D45753#1070724, @labath wrote:

> I'm not sure who should be the main person reviewing this,


I'm also not sure, but I really appreciate your feedback, and wanted to give you as a likely user a chance to shape the API.
Thanks for the great advice, don't feel any pressure to accept or examine everything if you don't feel like the right person.

> but I think the implementation looks pretty good, and would be a great replacement for the one we have in lldb. The main thing I noticed is that you seem to be rolling your own utf8 parser -- I would hope we can reuse the existing unicode utilities here.

Thanks for pointing that out, I hadn't seen them, and you're right. (At the same time, ick!)

>> fromJSON returns only bool. A richer error-signaling mechanism may be useful to provide useful messages, or let recursive fromJSONs (containers/structs) do careful error recovery.
> 
> For our (lldb) use case, we don't need a fancy error mechanism. It seems it should be possible to make these return `llvm::Error` if it turns out to be necessary later.
> 
>> should json::obj be always explicitly written (like json::ary)
> 
> I don't have an opinion on that, though I do think that `json::ary` should be renamed.

Agree, these were a wart. `ary` -> `Array`, `obj` -> `Object`, `Expr` -> `Value`.

>> there's no streaming parse API. I suspect there are some simple wins like a callback API where the document is a long array, and each element is small. But this can probably be bolted on easily when we see the need.
> 
> All the messages we are parsing are reasonably small (<= 1k), so we don't have a need for a streaming parser right now. The parser is essentially streaming already. If there was something like llvm::raw_istream, it could be trivially converted, but it looks like input streams aren't a thing for llvm in general, so I wouldn't be worried about it here.

Yeah. Rather than the input data, I was more thinking of passing the output objects to a callback without waiting to pass the whole containing array/object into memory. But agree it's a thing for later.



================
Comment at: include/llvm/Support/JSON.h:58
+// Array and Object also have typed indexing accessors for easy traversal:
+//   Expected<Expr> E = parse(R"( {"options": {"font": "sans-serif"}} )");
+//   if (json::obj* O = E->asObject())
----------------
labath wrote:
> It would be good to emphasize that the returned object's lifetime is independent of the string being parsed (i.e. the contained strings are not references to the strings in the original text).
Done (on the function doc for `parse()'


================
Comment at: include/llvm/Support/JSON.h:449-450
+// Give Expr::{Object,Array} more convenient names for literal use.
+using obj = Expr::ObjectExpr;
+using ary = Expr::ArrayExpr;
+
----------------
labath wrote:
> I find it a bit inconsistent when I see a `json::Expr` (uppercase) for the base type and then `json::obj` (lowercase) for the derived ones. The lowercase names don't really follow the naming convention and `ary` is also fairly un-obvious. Could we just call these `Object` and `Array` ?
Done. These are now defined as `json::Object` and `json::Array`. Since Value already had enumerator with these names, I un-nested the classes which required reordering some things and defining functions out-of-line.
I put them in the cpp file for readability reasons, they could move to the bottom of the header if we care about inlining.

While renaming, also changed `json::Expr` -> `json::Value` which seems more accurate.


================
Comment at: lib/Support/JSON.cpp:283-364
+void Parser::encodeUtf8(uint32_t Rune, std::string &Out) {
+  if (Rune <= 0x7F) {
+    Out.push_back(Rune & 0x7F);
+  } else if (Rune <= 0x7FF) {
+    uint8_t FirstByte = 0xC0 | ((Rune & 0x7C0) >> 6);
+    uint8_t SecondByte = 0x80 | (Rune & 0x3F);
+    Out.push_back(FirstByte);
----------------
labath wrote:
> Is there any reason `Support/ConvertUTF.h` cannot be used here? Is sounds like you just need the "lenient" conversion mode here.
Done. That's really slow code with a really inconvenient API, but it shouldn't matter.


================
Comment at: lib/Support/JSON.cpp:394
+
+namespace {
+void quote(llvm::raw_ostream &OS, llvm::StringRef S) {
----------------
labath wrote:
> llvm coding standards say we use static for functions instead of anonymous namespaces. Also, the llvm and json namespaces are opened and closed twice, so this may be a good opportunity to merge them.
Switched to static and removed the namespace (enum doesn't need to be in it).

The format_provider specialization does need to be outside the namespace and there is some order dependency around there, but I'vo avoided opening/closing a lot by qualifying definition names.


Repository:
  rL LLVM

https://reviews.llvm.org/D45753





More information about the llvm-commits mailing list