[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

Mon Jun 5 16:21:34 PDT 2017

On Mon, Jun 5, 2017 at 2:27 PM, David Blaikie <dblaikie at gmail.com> wrote:

> I know there's been a bunch of discussion here already, but I was
> wondering if perhaps someone (probably Teresa? Peter?) could:
>

Sure

>
> 1) summarize the current state
>

Currently the summary is not dumped in the llvm-dis output. The only way to
see it (other than the YAML dumping Peter added for the type test summary
info) is to use llvm-bcanalyzer -dump, whose output is not easily readable.

> 2) describe the end-goal
>

At the very least, dump the summary in human-readable form, ideally in the
llvm-dis output (although dumping it separately may be useful as well), so
that one can verify and visually inspect it, and so that it helps explain
why importing and other whole-program optimizations are (or are not) being
applied.

A further possible end goal is to be able to construct the summary from
this textual form, rather than rebuilding it from the IR. This can be
useful for testing purposes (as in the type test summary case Peter added),
e.g. to test what happens if I adjust the summary info. But other questions
need to be worked through first: for example, what happens if the user
hand-edits the textual IR but not the textual summary? How do we detect that
the summary needs to be rebuilt rather than read in and used as is?

> 3) describe what steps (& how this patch relates) are planned to get to (2)
>

Charles had a draft patch to dump out the summary in textual form along
with the IR from llvm-dis. Peter and Mehdi suggested repurposing/extending
the YAML summary dumping instead of adding another dumper. That YAML
dumping currently covers only the type test / devirt summary, is enabled
only under certain options, and writes to stdout; it would be extended to
dump the rest of the summary fields and to write into the llvm-dis output.

As a first step, to avoid the issue of what to do when reading back in via
llvm-as, I suggested dumping as LLVM assembly comments (to make it clear it
is not parsed back in).
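
To make the comment-based approach concrete, here is a rough Python sketch
that renders a summary as ';'-prefixed lines in the layout from Charles's
proposal quoted below. The SummaryEntry type and its field names are made up
for illustration and are not LLVM's ModuleSummaryIndex API; the real dumper
would live in llvm-dis and be written in C++.

```python
# Hypothetical in-memory model of a module summary. The class and its field
# names are assumptions for this sketch, not part of any LLVM API.
from dataclasses import dataclass, field
from typing import List


@dataclass
class SummaryEntry:
    name: str
    linkage: str                   # e.g. "External" or "Weak any"
    kind: str                      # e.g. "Function (2 instructions)"
    calls: List[str] = field(default_factory=list)
    refs: List[str] = field(default_factory=list)


def render_summary_as_comments(entries):
    """Return the summary as text lines that all start with ';'.

    Because every line is an LLVM assembly comment, a parser reading the
    .ll file would skip the whole block.
    """
    lines = ["; Module summary:"]
    for e in entries:
        lines.append(f";  {e.name} ({e.linkage} linkage)")
        lines.append(f";    {e.kind}")
        if e.calls:
            lines.append(";    Calls: " + ", ".join(e.calls))
        if e.refs:
            lines.append(";    Refs:")
            lines.extend(f";      {r}" for r in e.refs)
    return lines


# The four entries from the example module in the proposal.
entries = [
    SummaryEntry("testtest", "External", "Function (2 instructions)",
                 calls=["boop"]),
    SummaryEntry("X", "External", "Global Variable"),
    SummaryEntry("afun", "External", "Function (2 instructions)",
                 refs=["a"]),
    SummaryEntry("a", "Weak any", "Alias (aliasee X)"),
]
summary_lines = render_summary_as_comments(entries)
print("\n".join(summary_lines))
```

Since nothing in the block is parsed, editing the IR by hand cannot leave a
stale summary that llvm-as would then trust; the summary would still be
rebuilt from the IR.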

> My naive thoughts, not being intimately familiar with any of this: Usually
> bitcode and textual IR support go in together or around the same time, and
> designed that way from the start (take r211920 for example, which added an
> explicit representation of COMDATs to the IR). This seems to have been an
> oversight in the implementation of IR summaries (is that an accurate
> representation/statement?) & now there's an effort to correct that.
>

Yes

>
> So it seems like that would start with a discussion of what the right
> end-state would be: What the syntax in textual IR should be, then
> implementing it. I can understand implementing such a thing in steps - it's
> perhaps more involved than the COMDAT situation. In that case starting on
> either side seems fine - implementing the emission first (hidden behind a
> flag, so as not to break round-tripping in the interim) or the parsing
> first (no need to hide it behind any flags - manually written examples can
> be used as input tests).
>

Yep, and Charles's original email here was meant to discuss the desired
textual IR dumping format.

> (& it sounds like there's some partially implemented functionality using a
> YAML format that was intended to address how some test cases could be
> written? & this might be a good basis for the syntax - but seems to me like
> it might be a bit disjointed/out of place in the textual IR format that's
> not otherwise YAML-based?)
>

Right, that was Peter and Mehdi's suggestion. It will look a bit different
from the rest of the textual IR, but maybe that doesn't matter since the
summary isn't IR.

Teresa

>
> - Dave
>
> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hey all,
>>
>> Below is the proposed format for the dump of the ThinLTO module summary
>> in the llvm-dis utility:
>>
>> > ../build/bin/llvm-dis t.o && cat t.o.ll
>> ; ModuleID = '2.o'
>> source_filename = "2.ll"
>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>> target triple = "x86_64-unknown-linux-gnu"
>>
>> @X = constant i32 42, section "foo", align 4
>>
>> @a = weak alias i32, i32* @X
>>
>> define void @afun() {
>>   %1 = load i32, i32* @a
>>   ret void
>> }
>>
>> define void @testtest() {
>>   tail call void @boop()
>>   ret void
>> }
>>
>> declare void @boop()
>>
>> ; Module summary:
>> ;  testtest (External linkage)
>> ;    Function (2 instructions)
>> ;    Calls: boop
>> ;  X (External linkage)
>> ;    Global Variable
>> ;  afun (External linkage)
>> ;    Function (2 instructions)
>> ;    Refs:
>> ;      a
>> ;  a (Weak any linkage)
>> ;    Alias (aliasee X)
>>
>> I've implemented the above format in the llvm-dis utility, since there
>> currently isn't really a way of getting ThinLTO summaries in a
>> human-readable format.
>>
>> Let me know what you think of this format, and what information you think
>> should be added/removed.
>>
>> Thanks,
>> Charles
>>
>>
>

-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413