[llvm-dev] [EXTERNAL] Re: LLVM Symbolizer improvements

Tue Feb 2 14:43:25 PST 2021

Hello everyone, 

Sorry for the late response, and thanks for the comments.

David, you are right, we do not really need both JSON and YAML.
It seems YAML is easier to add by reusing the existing YAML support in LLVM.

As Fangrui mentioned, the output of llvm-symbolizer is line-based and is used in many projects.
Our users rely on the current llvm-symbolizer output for CLI automation,
and they would prefer to build on, and extend, a dependency on a standard format.
We expect users to request more data and additional fields.
YAML or JSON seems better suited to maintaining reasonable backward compatibility,
and it provides a better foundation for CLI automation, including llvm-symbolizer tests.
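
To illustrate the backward-compatibility point, here is a minimal sketch in Python.
The record layout and field names ("Function", "Filename", "Line", "Discriminator")
are assumptions made up for this example, not a proposed schema. A consumer that
looks fields up by key keeps working when a newer tool version adds a field.

```python
import json

# Hypothetical records: field names are illustrative only, not a real schema.
OLD_RECORD = '{"Function": "foo", "Filename": "/tmp/discrim.c", "Line": 5}'
NEW_RECORD = ('{"Function": "foo", "Filename": "/tmp/discrim.c", "Line": 5, '
              '"Discriminator": 2}')


def location(record_json):
    """Format a source location, reading only the keys this consumer needs."""
    rec = json.loads(record_json)
    return f'{rec["Function"]} at {rec["Filename"]}:{rec["Line"]}'


# The extra "Discriminator" key in the newer record is simply ignored.
print(location(OLD_RECORD))
print(location(NEW_RECORD))
```

A line-based consumer that relies on line positions has no comparable way to skip
a field it does not know about.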

Thanks,
Alex
________________________________________
From: Fangrui Song <maskray at google.com>
Sent: Monday, January 25, 2021 2:08 PM
To: Alex Orlov
Cc: David Blaikie; llvm-dev at lists.llvm.org; Fangrui Song; Xuanda Yang; George Rimar
Subject: [EXTERNAL] Re: [llvm-dev] LLVM Symbolizer improvements

(Thanks David for CCing llvm-dev.)

On 2021-01-25, David Blaikie via llvm-dev wrote:
>Any particular reason for both json and yaml? Isn't one a subset of
>the other anyway?
>
>& what's the use-case? What sort of consumers do you have in mind?
>
>On Mon, Jan 25, 2021 at 12:00 PM Alex Orlov <aorlov at accesssoftek.com> wrote:
>>
>> Hello everyone,
>>
>> We are looking to extend DIPrinter::OutputStyle { LLVM, GNU } to { LLVM, GNU, JSON, YAML }
>> and update DIPrinter to support these machine-readable printouts.
>>
>> I will propose patches shortly, unless someone has any objection.
>>
>> Thanks,
>> Alex

The output of llvm-symbolizer is line-based. While some may argue that it is
brittle, in reality it has been used by several projects (e.g. pprof,
asan_symbolize.py, and various projects parsing addr2line output) in production.

If neither --output-style=GNU nor --output-style=LLVM suits your needs,
you can also try --verbose.

% llvm-symbolizer --output-style=LLVM -e Inputs/discrim 0x400590 0x400575 --verbose
foo
   Filename: /tmp/discrim.c
   Function start filename: /tmp/discrim.c
   Function start line: 4
   Line: 5
   Column: 7
main
   Filename: /tmp/discrim.c
   Function start filename: /tmp/discrim.c
   Function start line: 9
   Line: 10
   Column: 0

foo
   Filename: /tmp/discrim.c
   Function start filename: /tmp/discrim.c
   Function start line: 4
   Line: 5
   Column: 17
   Discriminator: 2
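
For consumers that want structured data today, the --verbose output above can be
converted with a small script. The following is a minimal sketch in Python, not
part of llvm-symbolizer. It assumes exactly the layout shown above: an unindented
function name, indented "Key: value" lines, and a blank line between input
addresses. The "Function" key and the parse_verbose name are made up for this
example.

```python
import json

# The --verbose output quoted above, copied verbatim.
SAMPLE = """\
foo
   Filename: /tmp/discrim.c
   Function start filename: /tmp/discrim.c
   Function start line: 4
   Line: 5
   Column: 7
main
   Filename: /tmp/discrim.c
   Function start filename: /tmp/discrim.c
   Function start line: 9
   Line: 10
   Column: 0

foo
   Filename: /tmp/discrim.c
   Function start filename: /tmp/discrim.c
   Function start line: 4
   Line: 5
   Column: 17
   Discriminator: 2
"""


def parse_verbose(text):
    """Return one list of frame dicts per blank-line-separated address."""
    records, frames, frame = [], [], None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the frames reported for one input address.
            if frames:
                records.append(frames)
            frames, frame = [], None
        elif line[0].isspace():
            # Indented "Key: value" line; attach it to the current frame.
            key, _, value = line.strip().partition(": ")
            if frame is not None:
                frame[key] = value
        else:
            # Unindented line: a function name starts a new frame.
            # "Function" is a key name chosen for this sketch.
            frame = {"Function": line.strip()}
            frames.append(frame)
    if frames:
        records.append(frames)
    return records


print(json.dumps(parse_verbose(SAMPLE), indent=2))
```

All values are kept as strings. Any new field the tool prints in the same
"Key: value" form is picked up without changes to the script.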

If you want more information, then the command-line utility llvm-symbolizer may not be suitable.
You can probably use the lib/DebugInfo/Symbolize library directly.