[PATCH] D96883: Add support for JSON output style to llvm-symbolizer

George Rimar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 9 05:12:58 PST 2021


grimar added a comment.

I apologize for my belated feedback. I am still on vacation, but I'd like to add my 5 cents about this particular patch to try to push it forward.

- Generally I think that adding a kind of `JSON` output to `LLVM` tools might be useful. This is not the first time I hear this idea, previously it was discussed that the output of llvm-readelf/readobj tools could be `XML`/`YAML`/`JSON`. In 2020 we fixed a lot of consistency issues in llvm-readelf's GNU output to make it more consistent with GNU readelf. It was done because it is a assumed to be a common task to parse the output of a tool and we have to provide some way to allow doing it anyways already. But ideally users shouldn't rely on GNU side and their output for LLVM tools, I think. It is brittle and we can provide a better experience.

- `JSON` vs `YAML` vs `<?>`. I think it should be `JSON`, because `YAML` is not well-known outside of `LLVM`. I probably see no any single reason why to use `YAML` instead. I also can't think of a more known/popular/convenient format for such task, so I guess that `JSON` is indeed a fine choice.

- It seems, that `llvm-symbolizer` is a good choice among LLVM tools where we can try to introduce the new machine readable output style and see if existent and potentially new users will be happy with it generally.

I've added some comments about the code, haven't looked at test cases closely yet.



================
Comment at: llvm/docs/CommandGuide/llvm-symbolizer.rst:279
 
+    $ llvm-symbolizer --output-style=JSON --obj=inlined.elf 0x4004be 0x400486
+    {"Frames":[
----------------
The new block is not on the right place?
I think it should go right after similars blocks for GNU/LLVM styles above (after `foo() at /tmp/test.cpp:6` at line 273).

I am also not sure it is useful to have a version **without **`--no-inlines` here: GNU and LLVM samples doesn't have it.
It is a documentation for `--output-style` option and not a test case for `--no-inlines`, so I perhaps see no reason to have it here.
Am I missing some intention?


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:36
+  J.objectBegin();
+  if (Info.Source.hasValue())
+    J.attribute("Source", Info.Source.getValue());
----------------
`Source` is `Optional<>`, which has `operator bool`, so you can omit `hasValue` here and in other places below.


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:37
+  if (Info.Source.hasValue())
+    J.attribute("Source", Info.Source.getValue());
+  if (Info.FunctionName != DILineInfo::BadString)
----------------
It is common not to use `Optional::getValue` in favor of dereference:

```
J.attribute("Source", *Info.Source);
```


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:48
+  if (Info.Column)
+    J.attribute("Column", Info.Column);
+  if (Info.Discriminator)
----------------
So, the `Column` key is only emitted when a column is not `0` (`Info.Column` is `uint32_t`). Is it expected behavior?
Is the intention is to omit the key to reduce the noise in the output? This needs a comment if so.


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:138
+    J.arrayBegin();
+    for (uint32_t I = 0; I < FramesNum; I++) {
+      OS << '\n';
----------------
It is preffered to use the pre-increment form where possible.


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:154
+  for (uint32_t I = 0; I < FramesNum; I++)
+    print(Info.getFrame(I), I > 0);
   return *this;
----------------
This change is unrelated to what this patch does? You should't mix cosmetic changes with functional when they are independent.


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:165
+    J.attribute("Start", int64_t(Global.Start));
+    J.attribute("Size", int64_t(Global.Size));
+    J.objectEnd();
----------------
Why do you convert `uint64_t` to `int64_t` here and in many places below?


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:183
+    J.arrayBegin();
+    for (DILocal Local : Locals) {
+      OS << '\n';
----------------
I'd suggest just `for (const DILocal& L : Locals)`


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:199
+        J.attribute("TagOffset", int64_t(Local.TagOffset.getValue()));
+      J.objectEnd();
+    }
----------------
The same comments as I've added for `void toJSON(...)` above applies here.


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:258
+  }
+  assert(0 && "Not implemented");
   return *this;
----------------
Can `DIPrinter::operator<<(const ErrorInfoBase &EI)` be called for non-JSON outputs?
If no, then having `assert` is not what you want here. you should use `llvm-unreachable` for the code that can't be reached.

At the same time, do you need this new method? I don't think that it is expected to be updated in future, probably.
And it feels that it is a caller job probably to handle errors properly. So, can it's logic be inlined to `printResOrErr`?


================
Comment at: llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp:89
+                          DIPrinter::OutputStyle OutputStyle,
+                          DIPrinter &Printer, bool DefIfErr = true) {
+  if (ResOrErr) {
----------------
The meaning of `DefIfErr` name is not very clear, also it is only used for non-JSON case,
what is a bit consusing I think, because the argument is just ignored for the `JSON` case.
It makes the signature to bit a bit dirty.

Perhaps, instead of having this function, I'd intoduce a helper like the following in `symbolizeInput`:

```
auto Print = [&](Expected<T> &ResOrErr){
  if (ResOrErr) {
    Printer << *ResOrErr;
    return;
  }

  if (OutputStyle == DIPrinter::OutputStyle::JSON) {
    handleAllErrors(std::move(ResOrErr.takeError()),
      [&](const ErrorInfoBase &EI) { Printer << EI; });
    return;
  }

  error(ResOrErr);

  // Nice and helpful comment about why the Command::Frame is exception....
  if (Cmd == Command::Frame)
     Printer << T();
};
```

Will it work?


================
Comment at: llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp:165
+      Printer << StringError(std::make_error_code(std::errc::invalid_argument),
+                             "(address)");
+    } else {
----------------
Can you use `createStringError` from `Error.h`?


================
Comment at: llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp:341
+      OutputStyle = DIPrinter::OutputStyle::LLVM;
+    }
   }
----------------
You don't need to use curly bracers for single lines (see https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96883/new/

https://reviews.llvm.org/D96883



More information about the llvm-commits mailing list