[PATCH] D45753: Lift JSON library from clang-tools-extra/clangd to llvm/Support.

Fri Apr 27 07:20:44 PDT 2018

simon_tatham added inline comments.

================
Comment at: lib/Support/JSON.cpp:560
+    case T_Number:
+      OS << format("%g", as<double>());
+      break;
----------------
sammccall wrote:
> simon_tatham wrote:
> > Could this number formatting be changed? The default %g loses precision – you don't even get enough information to exactly reconstruct the same double you started with.
> > 
> > Also, over in D46054 I'm working on a JSON back end for TableGen, for which I'd find it useful to be able to pass an arbitrary 64-bit integer through this system and still have the full 64 bits of integer value visible in the JSON output file, for the benefit of JSON consumers (e.g. Python `json.load`) that go above the call of duty in returning it as an integer without rounding it to the nearest representable double.
> > 
> > So, would it be possible to have some method of constructing a json::Value that formats as a 64-bit integer literal?
> > Could this number formatting be changed? The default %g loses precision – you don't even get enough information to exactly reconstruct the same double you started with.
> 
> Yes, though is there an existing aware of round-trip safe double formatter in llvm?
> I suspect this only actually matters when the values are integers, so we should consider your second suggestion first :-)
> 
> > I'd find it useful to be able to pass an arbitrary 64-bit integer through this system and still have the full 64 bits of integer value visible in the JSON output file, for the benefit of JSON consumers (e.g. Python json.load) that go above the call of duty in returning it as an integer without rounding it to the nearest representable double.
> 
> What about this design:
> - Internally, a numeric value can be an integer or a double. i.e we split the internal `ValueType` `T_Number` into two, `T_Integer` and `T_Double`. public `Kind` remains unchanged.
> - when constructing, you get one or the other depending on the static type
> - when parsing, you get integer unless it has a nonzero decimal part or is out-of-range.
> - `asDouble()` always succeeds
> - `asInteger()` succeeds if the underlying value is integer or if it's a double that can be exactly represented as `int64_t` (same as now)
> - when serializing, you get `%g` for double and the usual representation for integers
> 
> Open questions:
> - is 1.2e3 a double or an integer? I kind of want the former, which complicates our heuristic.
> - `int64_t` leaves anyone who wants `uint64_t` out in the cold. But adding more options for types is going to lead to madness. Can we live with this limitation?
> 
> If this sounds good I can start on the changes, but I'd like to defer adding new features to another patch if that's OK. This one is largely moving mostly-battleworn code from clangd, and new features need closer review of the implementation.
That design certainly sounds as if it would do what I need, and a great deal more besides.

Another option that would be fine for me personally would be to have a means of constructing a `ValueType` called, say, `T_Custom`, which internally holds a string value, and serializes as exactly that string, unquoted. I could imagine that being used for other unusual purposes as well, such as controlling which of `\uXXXX` and UTF-8 was used to represent a non-ASCII character in a string literal.

(And that possibility is simple enough that I could add it myself as part of //my// patch.)

Repository:
  rL LLVM

https://reviews.llvm.org/D45753