[PATCH] D96883: Add support for JSON output style to llvm-symbolizer

George Rimar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 10 04:27:28 PST 2021


grimar added a comment.

Few more comments. I think that error reporting could be better, we should probably use
the power of `Expected<>` and show the real errors, rather than hiding them under standard messages for error codes.

I've suggested few changes. With them, for example, the following error

`{"Error":{"Code":22,"Message":"{{I|i}}nvalid argument (address)"}}`

changes to

`{"Error":{"Code":22,"Message":"unable to parse arguments: DATA Z:\\Work3\\LLVM\\llvm-project\\build\\test\\tools\\llvm-symbolizer\\Output\\output-style-json-data.test.tmp.o Z"}}`

It shows the main idea: we can build own error messages with a full context.

Also, I am not sure if having a `Code` field is usefull? The main information about a error is holded by the `Message` key. So we might think about
removing it. It should simplify the code slightly. I have no stong opinion atm though. Is it is a usefull field?



================
Comment at: llvm/docs/CommandGuide/llvm-symbolizer.rst:275
 
+    $ llvm-symbolizer --output-style=JSON --obj=inlined.elf 0x4004be 0x400486
+    {"Frames":[
----------------
Note, my previous comment says:
"I am also not sure it is useful to have a version **without** --no-inlines here: GNU and LLVM samples doesn't have it."

I.e. the JSON sample is slightly inconsistent now with GNU/LLVM. At the same time, as I've mentioned,
it is not a documentation for `--no-inlines`, so it is perhaps fine with me. May be other people have a more strong opinion
(I am not sure, why `--no-inlines` was used for GNU/LLVM first of all).


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:252
+    J.attributeObject("Error", [&] {
+      J.attribute("Code", int64_t(EI.convertToErrorCode().value()));
+      J.attribute("Message", EI.message());
----------------
You don't need the cast here, because `value()` is `int`?


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:48
+  if (Info.Column)
+    J.attribute("Column", Info.Column);
+  if (Info.Discriminator)
----------------
grimar wrote:
> So, the `Column` key is only emitted when a column is not `0` (`Info.Column` is `uint32_t`). Is it expected behavior?
> Is the intention is to omit the key to reduce the noise in the output? This needs a comment if so.
Thanks for adding the comment. You should probably also add an explicit test case to test/document the output when the `Column` value is `0` and the case when it is not.



================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:165
+    J.attribute("Start", int64_t(Global.Start));
+    J.attribute("Size", int64_t(Global.Size));
+    J.objectEnd();
----------------
aorlov wrote:
> grimar wrote:
> > Why do you convert `uint64_t` to `int64_t` here and in many places below?
> There is no json::OStream::attribute() version for uint32_t and uint64_t. 
> I can use an implicit conversion for uint32_t, but I need to convert uint64_t to int64_t explicitly.
> 
But this is simply wrong and might print a wrong result isn't?

E.g. imagine you do:

```
    uint64_t XXX = 0xffffffffeeeeeeee;
    J.attribute("Start", int64_t(XXX));
```

The output is:

```
{"Name":"foo","Start":-286331154
```

I think `Start` shouldn't have a negative value.
Seems you need to update the `json::OStream` implementation to fix it.
You also need to add a test for such situation(s).

Also. should `Start`/`Size` be printed as hex? I think it is very common to use hex for addresses/sizes.


================
Comment at: llvm/lib/DebugInfo/Symbolize/DIPrinter.cpp:258
+  }
+  assert(0 && "Not implemented");
   return *this;
----------------
grimar wrote:
> Can `DIPrinter::operator<<(const ErrorInfoBase &EI)` be called for non-JSON outputs?
> If no, then having `assert` is not what you want here. you should use `llvm-unreachable` for the code that can't be reached.
> 
> At the same time, do you need this new method? I don't think that it is expected to be updated in future, probably.
> And it feels that it is a caller job probably to handle errors properly. So, can it's logic be inlined to `printResOrErr`?
> At the same time, do you need this new method?
> ...
> And it feels that it is a caller job probably to handle errors properly.

I've debugged it and now I think that creating the JSON with the error in `DIPrinter` is fine,
but I'd not implement it as `operator<<`, because it is not a regular output operation,
and will be only useful for `JSON` it seems.

So I'd suggest something like:

```
void DIPrinter::printErrorJSON(const Twine& Msg, std::error_code EC) {
  json::OStream J(OS);
  J.objectBegin();
  J.attributeObject("Error", [&] {
    J.attribute("Code", int64_t(EC.value()));
    J.attribute("Message", Msg.str());
  });
  J.objectEnd();
  OS << '\n';
}
```


================
Comment at: llvm/test/tools/llvm-symbolizer/output-style-json-data.test:17
+
+# NO-FILE:{"Error":{"Code":2,"Message":"{{N|n}}o such file or directory"}}
+
----------------
This error message is probably should be tested differently:
You should use `FileCheck -DMSG=%errc_ENOENT` I think (see, e.g D95246).


================
Comment at: llvm/test/tools/llvm-symbolizer/output-style-json-frame.test:41
+# CHECK-NEXT:,{"FunctionName":"f","Name":"k","DeclFile":"/tmp{{/|\\\\}}frame-types.cpp","DeclLine":15,"FrameOffset":-57,"Size":12}
+# CHECK-NEXT:,{"FunctionName":"f","Name":"l","DeclFile":"/tmp{{/|\\\\}}frame-types.cpp","DeclLine":16,"FrameOffset":-345,"Size":288}]
+
----------------
Why do you need to have so many locals? The only visible difference in the output is the value of "FrameOffset".
Can we have only two/three? This would reduce the size of the code below significantly I guess.


================
Comment at: llvm/test/tools/llvm-symbolizer/output-style-json-frame.test:514
+	.section	.debug_macinfo,"", at progbits
+	.byte	0                       # End Of Macro List Mark
+
----------------
You don't need this section.


================
Comment at: llvm/test/tools/llvm-symbolizer/output-style-json-frame.test:518
+	.section	".note.GNU-stack","", at progbits
+	.addrsig
+	.section	.debug_line,"", at progbits
----------------
You don't need

```
  .ident	"clang version 9.0.0 "
  .section	".note.GNU-stack","", at progbits
```

for this test, so it can be removed.
Please try to avoid having excessive pieces in test cases, when it is easy not to have them.


================
Comment at: llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp:179
   if (Cmd == Command::Data) {
     auto ResOrErr = Symbolizer.symbolizeData(
         ModuleName, {Offset, object::SectionedAddress::UndefSection});
----------------
Consider replacing the `auto` to actual types in this method, because you are touching the related neighbouring lines,
which are hard to read, because it is not clear what type `ResOrErr` has.


================
Comment at: llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp:89
+                          DIPrinter::OutputStyle OutputStyle,
+                          DIPrinter &Printer, bool DefIfErr = true) {
+  if (ResOrErr) {
----------------
aorlov wrote:
> grimar wrote:
> > The meaning of `DefIfErr` name is not very clear, also it is only used for non-JSON case,
> > what is a bit consusing I think, because the argument is just ignored for the `JSON` case.
> > It makes the signature to bit a bit dirty.
> > 
> > Perhaps, instead of having this function, I'd intoduce a helper like the following in `symbolizeInput`:
> > 
> > ```
> > auto Print = [&](Expected<T> &ResOrErr){
> >   if (ResOrErr) {
> >     Printer << *ResOrErr;
> >     return;
> >   }
> > 
> >   if (OutputStyle == DIPrinter::OutputStyle::JSON) {
> >     handleAllErrors(std::move(ResOrErr.takeError()),
> >       [&](const ErrorInfoBase &EI) { Printer << EI; });
> >     return;
> >   }
> > 
> >   error(ResOrErr);
> > 
> >   // Nice and helpful comment about why the Command::Frame is exception....
> >   if (Cmd == Command::Frame)
> >      Printer << T();
> > };
> > ```
> > 
> > Will it work?
> No, because of template <typename T>. I have renamed DefIfErr to BackwardCompatibleErr.
> I have no idea why there is no printout of the empty struct in case of Cmd == Command::Frame, probably it is a bug.
> But it is unrelated to this patch. I just keep the current behavior for other OutputStyle.
> I have no idea why there is no printout of the empty struct in case of Cmd == Command::Frame, probably it is a bug.
> But it is unrelated to this patch. I just keep the current behavior for other OutputStyle.

It is true, but probably we shouldn't introduce a new strangely named `BackwardCompatibleErr` variable in this case.
We know that `Command::Frame`, one of commands,  has a different specific behavior.
I'd write the code to make it more obvious with a `Cmd` argument.
Also, perhaps would add few early returns. Something like:


```
template <typename T>
static void printResultOrError(Expected<T> &ResOrErr,
                               DIPrinter::OutputStyle OutputStyle,
                               DIPrinter &Printer, Command Cmd) {
  if (ResOrErr) {
    Printer << ResOrErr.get();
    return;
  }

  if (OutputStyle == DIPrinter::OutputStyle::JSON) {
    Printer.printErrorJSON(toString(ResOrErr.takeError()),
                           std::make_error_code(std::errc::invalid_argument));
    return;
  }

  if (Cmd != Command::Frame)
    Printer << T();
}
```


================
Comment at: llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp:165
+      Printer << StringError(std::make_error_code(std::errc::invalid_argument),
+                             "(address)");
+    } else {
----------------
aorlov wrote:
> grimar wrote:
> > Can you use `createStringError` from `Error.h`?
> No, because it is hard to use Error to get the error code and message just for printout. 
> I'm using ErrorInfoBase as a holder instead.
If we introduce the `printErrorJSON` that I suggested in a different comment, then here it will be possible to write:

```
Printer.printErrorJSON("unable to parse arguments: " + InputString,
  std::make_error_code(std::errc::invalid_argument));
```

What is better, because allows to provide more context and more customizable error messages for JSON output.
What do you think?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96883/new/

https://reviews.llvm.org/D96883



More information about the llvm-commits mailing list