[llvm-dev] [RFC][binutils] Machine-readable output from Binutils - possible GSOC project?

Fri Jan 10 06:07:40 PST 2020

Disclaimer:  I'm sat a few desks away from James in a related team,
although I don't think that we've actually ever discussed this topic at all.

> Are people still interested in this? If so, what is the typical use case
> you’d use the result of this project for?

Yes.  We have a test framework that extracts a range of metrics from various
large codebases (generally games) built with different toolchain revisions
and stores them in a database for analysis and visualization.  For example,
we get section sizes by parsing llvm-readelf output with regular expressions
and converting the results to JSON for database submission.

> Why would this be better than the existing llvm-readobj output (if
> applicable)?

Because it makes me sad to see things like this in my test framework:

^\[\s*(?P<id>\d+)\]\s(?P<section>.+?)\s+(\w+)\s+(\w+)\s+(\w+)\s+(?P<size_hex>\w+)\s.+$

It's far from resilient.
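
To illustrate the kind of parsing involved, here is a minimal Python sketch
that applies the pattern above to a single section-header line. The sample
lines are an assumed example of GNU-style section output, and the function
name and JSON field names are hypothetical, not taken from the actual test
framework.

```python
import json
import re

# The pattern quoted above, unchanged.
SECTION_RE = re.compile(
    r"^\[\s*(?P<id>\d+)\]\s(?P<section>.+?)\s+(\w+)\s+(\w+)\s+(\w+)\s+(?P<size_hex>\w+)\s.+$"
)


def parse_section_line(line):
    """Return a dict for one section line, or None if it does not match."""
    # Leading whitespace is stripped because the pattern anchors on "[".
    match = SECTION_RE.match(line.strip())
    if match is None:
        return None
    return {
        "id": int(match.group("id")),
        "section": match.group("section"),
        # The size column is hexadecimal text; convert it to an integer.
        "size": int(match.group("size_hex"), 16),
    }


# Assumed sample input; real output may differ between tool versions.
SAMPLE = "  [ 1] .text             PROGBITS        0000000000000000 000040 000017 00  AX  0   0 16"
HEADER = "  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al"

if __name__ == "__main__":
    print(json.dumps(parse_section_line(SAMPLE)))
    print(parse_section_line(HEADER))
```

The size is picked out purely by column position, so anything that shifts
the columns (for instance a section name containing whitespace) would
silently put a different field into the size group.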

By comparison, we also get metrics from running "llvm-dwarfdump
--statistics", which outputs JSON, so no custom parsing is needed. That is
quite lovely.
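
As a rough sketch of why JSON output is easier to consume, the following
Python reads the tool's output with the standard json module and keeps the
numeric entries. It assumes llvm-dwarfdump is on PATH and that the output is
a single JSON object; the helper names and the example payload are
hypothetical and do not reflect the real key names.

```python
import json
import subprocess


def run_dwarfdump_statistics(path):
    """Run llvm-dwarfdump --statistics on path and return its stdout."""
    # Assumes an llvm-dwarfdump binary is available on PATH.
    result = subprocess.run(
        ["llvm-dwarfdump", "--statistics", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def numeric_metrics(json_text):
    """Keep only the top-level numeric entries of a JSON object."""
    data = json.loads(json_text)
    return {
        key: value
        for key, value in data.items()
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


# Hypothetical payload standing in for real tool output.
EXAMPLE = '{"file": "a.o", "example metric": 10, "another metric": 2.5}'

if __name__ == "__main__":
    print(numeric_metrics(EXAMPLE))
```

No regular expression is involved, so a change in column layout or spacing
has nothing to break.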

> Is there a priority for a specific format (e.g. ELF, DWARF, COFF)?

In my case ELF and DWARF are the focus.

> Would anybody be interested in co-mentoring such a project?

I'm very happy to provide input as a potential consumer of the data.
Whether that extends as far as co-mentoring I don't mind either way.

-Greg

On Fri, 10 Jan 2020 at 11:56, James Henderson via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi all,
>
> I was giving some thought as to possible project ideas I could propose for
> this year’s Google Summer of Code, with regard to the LLVM Binutils. One
> idea that I had was something discussed at last year’s Euro LLVM developer
> meeting, namely machine-readable output from the LLVM Binutils. Before I
> actually start advertising this as an open project, I wanted to ask a few
> questions:
>
>    1. Are people still interested in this? If so, what is the typical use
>    case you’d use the result of this project for? Why would this be better
>    than the existing llvm-readobj output (if applicable)?
>    2. Which tool(s) and feature(s) would you most want this for? I
>    personally think this should just be another output style for llvm-readobj.
>    Does anybody have any different opinion there?
>    3. Is there any additional tooling in relation to this project that
>    you think would be important to be a part of this project, e.g. a lit
>    function to query the output?
>    4. How might this interact with obj2yaml? Could the new output
>    ultimately be used to replace it?
>    5. Is there a priority for a specific format (e.g. ELF, DWARF, COFF)?
>    6. Would anybody be interested in co-mentoring such a project?
>
> Thanks in advance for the comments!
>
> James