[PATCH] D96883: Add support for JSON output style to llvm-symbolizer

Tue Apr 27 03:21:35 PDT 2021

jhenderson added a comment.

Out of time for today. Will come back to this another time.

================
Comment at: llvm/docs/CommandGuide/llvm-symbolizer.rst:256-258
+  ``JSON`` style provides a machine readable output in JSON.
+    Please note that if addresses are given as command (line) arguments 
+    the output JSON will contain an array of the results for all of the given addresses.
----------------
Make this all one paragraph, and reflow to 80 character limit. I'd actually rephrase it slightly too:

"If addresses are supplied via stdin, the output JSON will be a series of individual objects. Otherwise, all results will be contained in a single array."

================
Comment at: llvm/docs/CommandGuide/llvm-symbolizer.rst:331-333
+  You can use the - pretty-print option to get a nicely formatted JSON 
+  when using the corresponding output style, otherwise it is intended for 
+  machine parsing and has a compact form.
----------------
I'd rephrase as in the inline comment (please make sure to reflow if necessary).

================
Comment at: llvm/docs/CommandGuide/llvm-symbolizer.rst:279

+    $ llvm-symbolizer --output-style=JSON --obj=inlined.elf 0x4004be 0x400486
+    [{"Address":"0x4004be","ModuleName":"inlined.elf","Symbol":[
----------------
jhenderson wrote:
> Maybe use the `--pretty-print` option to make this example a bit more readable?
Sorry, use `-p`, not `--pretty-print` (I just realised that's what's used in the previous example).

================
Comment at: llvm/test/tools/llvm-symbolizer/output-style-json-code.test:3-4
+
+## If the addresses are specified in the command line,
+## the output JSON will contain an array of the results for all of the given addresses.
+
----------------
Please reflow this comment to 80-character limit.

================
Comment at: llvm/test/tools/llvm-symbolizer/output-style-json-code.test:6
+
+## Handle symbolize library error - file does not exist.
+# RUN: llvm-symbolizer --output-style=JSON -e %p/no-file.exe 0 | \
----------------

================
Comment at: llvm/test/tools/llvm-symbolizer/output-style-json-code.test:11-12
+
+## Resolve out of range address, no-inlines.
+## Expected one empty object with default values printed in a list for unification.
+# RUN: llvm-symbolizer --output-style=JSON -e %p/Inputs/addr.exe 1000000000 --no-inlines | \
----------------
As before, do we need both `--no-inlines` and `--inlines` cases?

================
Comment at: llvm/test/tools/llvm-symbolizer/output-style-json-code.test:32
+## Resolve valid address, no-inlines.
+# NO-INLINES-NEXT:{"Address":"0x40054d","ModuleName":"{{.*}}/Inputs/addr.exe","Symbol":[{"Column":3,"Discriminator":0,"FileName":"/tmp{{/|\\\\}}x.c","FunctionName":"main","Line":3,"Source":"","StartFileName":"/tmp{{/|\\\\}}x.c","StartLine":2}]}
+
----------------
aorlov wrote:
> jhenderson wrote:
> > It would be nice to have one or more test cases with a non-zero discriminator value. Also, where the "Source" parameter is non-empty.
> > 
> > You can simplify `{{/|\\\\}}` to this: `{{[\\/]}}`. I think it's slightly more readable due to avoiding the quadruple backslash. Same goes elsewhere below.
> I have added the test for a non-zero discriminator. Note the address is hardcoded. It is more correct to use /Inputs/discrim.inp via stdin, but I just copied the address from discriminator.test.
> It seems currently we have no binaries in llvm/test/tools/llvm-symbolizer/Inputs with a valid Source info.
> > You can simplify
> No, it does not work because JSON has own rules for escaping special symbols in paths.
> 
> It seems currently we have no binaries in llvm/test/tools/llvm-symbolizer/Inputs with a valid Source info.

Consider generating one at test time using assembly or yaml2obj.

> > You can simplify
> No, it does not work because JSON has own rules for escaping special symbols in paths.

Okay - I missed the double backslash.

================
Comment at: llvm/test/tools/llvm-symbolizer/output-style-json-code.test:49
+
+## Also check the last 3 test cases with llvm-adr2line. The expected result is the same but missing the FunctionName.
+# RUN: llvm-addr2line --output-style=JSON -i -e %p/Inputs/addr.exe < %p/Inputs/addr.inp | \
----------------
aorlov wrote:
> jhenderson wrote:
> > I guess the obvious question is: why do we have a difference in behaviour surrounding the `FunctionName` attribute? The motivation for different symbolizer/addr2line output is because llvm-addr2line needs to be compatible with GNU addr2line, but that principle doesn't apply for JSON output (which AFAIK is not a GNU addr2line supported feature).
> This behavior is a part of llvm-symbolizer and does not depend on the output styles and does not belong to the printer (look at decideHowToPrintFunctions()). 
> Not sure if I understand what you are saying. Changing the behavior is out of the scope of this patch. 
> I agree that it does not make much sense in testing that, but one of the reviewers requested these tests.
> 
Ah, I think we've hit on a fundamental question regarding JSON output - what should the options that change what information is printed do to JSON output? I'm thinking here specifically both `--functions` and `--addresses`, but it may apply to others too.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96883/new/

https://reviews.llvm.org/D96883