[PATCH] D114224: Add JSONScopedPrinter class

Fri Nov 19 01:01:17 PST 2021

jhenderson added a comment.

>> I think JSONScopedPrinter (and possibly ScopedPrinter too, if it doesn't already have them) could do with gtest unit tests. It should be fairly straightforward to write them, and would help give confidence in the methods, especially as we may not use all of them up-front. The tests would also help with documenting what the code actually does, which will aid in future reviewing.
>
> Will do! Do we want to test each method in isolation? or possibly in conjunction with each other to test things like the history stack? or possibly both?

Cross-posting this comment, since it's now applicable to this patch specifically: I think we probably need to do a bit of both: each method should be tested in isolation. I think we should ensure we have tests for the history stack, where it's applicable only. Basically, test the existing possible code paths, plus maybe one or two more interesting cases that test interaction between functions, if it seems sensible.

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:466
+class JSONScopedPrinter : public ScopedPrinter {
+
+  enum Scope {
----------------
Nits:

1) I think we are normally explicit about public/private, so I'd add the explicit `private:` here, even though it's not strictly required.
2) I don't think we normally have blank lines at the start of class definitions.

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:470-482
+  };
+  enum ScopeKind {
+    Standard,
+    Attribute,
+    NestedAttribute,
+  };
+  struct ScopeContext {
----------------
I'd probably add some blank lines between these things.

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:486
+public:
+  JSONScopedPrinter(llvm::raw_ostream &OS)
+      : ScopedPrinter(OS, ScopedPrinter::ScopedPrinterKind::JSON), JOS(OS, 2) {}
----------------
I believe the explicit `llvm` is unnecessary.

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:492
+  void printNumber(StringRef Label, uint64_t Value) override {
+    JOS.attribute(Label, to_string(Value));
+  }
----------------
How come we need to stringify the value here? That seems surprising to me, and may lead to undesirable behaviour in my mind (e.g. a field for an ELF struct might be a number for the 32-bit ELF format, but a string in the 64-bit ELF format (and not all numbers will be strings in the latter-case, where they are not 64-bits)).

I have this vague memory that the LLVM JSON format doesn't support 64-bit numbers. If so, I think this is a mistake and should be fixed.

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:524-526
+    SmallString<64> StringValue;
+    Value.toString(StringValue);
+    JOS.attribute(Label, StringValue);
----------------
Similar point to the above, but slightly different: I think we should compare this to `std::numeric_limits<uint64_t>::max()` and print as a number, if possible, rather than as a string.

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:539
+
+  void objectBegin() override {
+    JOS.objectBegin();
----------------
Similar to my comments in D114223, I'd recomend adding helper functions to avoid some of the logic duplication in the following methods.

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:638
+  void printHexImpl(StringRef Label, HexNumber Value) override {
+    JOS.attribute(Label, "0x" + to_hexString(Value.Value));
+  }
----------------
I was thinking about this, and I think we shouldn't actually print as hex in JSON, and should fallback to the standard number printing. Here's the reasoning: most of the time, we use `printHex` to print some number that represents an offset, or other value in a fixed-sized field, that is easier to read as hex. However, JSON format is not intended for human consumption, at least not in our use-case for this code. As such, readability is of lesser concern than parseability. If we were to store hex numbers as strings, rather than converting them to their decimal counterparts, we'd end up with some numeric values as strings, as others as integers, in a seemingly arbitrary manner to the end-user. I don't think that this is desirable.

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:648
+                             HexNumber Value) override {
+    JOS.attribute(Label, Symbol.str() + "+0x" + to_hexString(Value.Value));
+  }
----------------
I would actually print this as two separate attributes: a symbol name, and a numeric offset. That'll be easier for consumers to parse. Perhaps these should be in a separate object, to avoid surprise name clashes. Rather than:
```
"Label":"Symbol+0x100"
```
or
```
"Label":"Symbol",
"LabelOffset":256
```
do
```
"Label":{
  "SymName":"Symbol",
  "Offset":256
}
```

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:654
+    JOS.attribute(Label, Str);
+    JOS.attribute(Label.str() + "_raw", Value);
+  }
----------------
You should check through the existing usages, but at least in llvm-readelf, I think most labels use UpperCameCase style, so printing as `MyValueRaw` would look a little cleaner. Also, see my above comments re. symbol offset about a nested object/naming clash risk, i.e. I'd recommend:
```
"Label":{
  "Value":1234,
  "RawValue":4321
}
```

================
Comment at: llvm/include/llvm/Support/ScopedPrinter.h:659
+                       bool Block, uint32_t StartOffset = 0) override {
+    // TODO: Implement
+  }
----------------
Why's this being left as a todo?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114224/new/

https://reviews.llvm.org/D114224