[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

Mon Jun 5 16:33:21 PDT 2017

On Mon, Jun 5, 2017 at 4:21 PM Teresa Johnson <tejohnson at google.com> wrote:

> On Mon, Jun 5, 2017 at 2:27 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>> I know there's been a bunch of discussion here already, but I was
>> wondering if perhaps someone (probably Teresa? Peter?) could:
>>
>
> Sure
>

Thanks!

>> 1) summarize the current state
>>
>
> Currently the summary is not dumped in the llvm-dis output. The only way
> to see it (other than some YAML dumping Peter added for the type test
> summary info), is to use llvm-bcanalyzer -dump, which is not easily
> readable.
>
>> 2) describe the end-goal
>>
>
> At the very least, dumping it in human readable form, ideally in the
> llvm-dis output (although dumping it separately may be useful as well), so
> that one can do verification/visual inspection, and also help get an
> understanding of why importing and other whole program optimizations are
> (not) being applied.
>
> A further possible end goal is to be able to then construct the summary
> from this textual form, rather than reconstructing from the IR. This can be
> useful for testing purposes (as in the type test summary case Peter added),
> i.e. test what happens if I adjust the summary info.
>

So I'd like to be a bit clearer here: I think this (round-trippability) really
should be the end goal, though it's certainly not my wheelhouse, so I might be
barking up the wrong tree.

It's my recollection that there are test cases for ThinLTO that need to run
compile steps to generate inputs with summaries to exercise the code they
need to test - is that correct? That seems to be a symptom of this
information being missing (perhaps optionally) from the textual IR input.

Ah, I've got a better example: r216025 (Duncan's implementation of use-list
order preservation in the IR).

> But other questions need to be worked through first - i.e. what happens if
> the user hand changes the textual IR but not the textual summary - how do
> we detect that the summary needs to be rebuilt vs read in and used as is...
>

For textual IR when the IR is present, probably verify it & bail out if it
doesn't pass verification.
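
As a rough illustration of "verify & bail out", here is a minimal Python
sketch. It is not LLVM code: the toy module model (a dict mapping function
names to lists of (opcode, operand) tuples) and the names build_summary,
check_summary and StaleSummaryError are all hypothetical, made up for this
example. The idea is only that the summary is recomputed from the IR and
any mismatch with the textual summary is rejected rather than silently
used.

```python
class StaleSummaryError(ValueError):
    """Raised when a textual summary disagrees with the IR next to it."""


def build_summary(module):
    """Recompute a toy summary from a toy IR model.

    `module` maps each function name to a list of (opcode, operand) tuples.
    This is an assumption for illustration, not how LLVM represents IR.
    """
    summary = {}
    for name, insts in module.items():
        calls = sorted({operand for opcode, operand in insts if opcode == "call"})
        summary[name] = {"insts": len(insts), "calls": calls}
    return summary


def check_summary(module, textual_summary):
    """Return the textual summary if it matches the IR, otherwise bail out."""
    expected = build_summary(module)
    if textual_summary != expected:
        names = set(expected) | set(textual_summary)
        stale = sorted(n for n in names if expected.get(n) != textual_summary.get(n))
        raise StaleSummaryError("summary does not match IR for: " + ", ".join(stale))
    return textual_summary
```

When no IR is present (the summary-only case discussed next), there is
nothing to compare against, so a check like this would simply not run.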

But do I remember correctly that there's now a way to produce the summary
without the code itself (to ship to a thin link step without shipping all
the IR)? Or is that in an intermediate state for now, stripping the debug
info because that's easy but leaving in the (unnecessary) function/global
IR? Presumably, if not today then eventually, the intent is to ship only the
summary in that file, in which case there would be no verification
possible or required.

>> 3) describe what steps (& how this patch relates) are planned to get to (2)
>>
>
> Charles had a draft patch to dump out the summary in textual form along
> with the IR from llvm-dis. Peter and Mehdi suggested repurposing/extending
> the YAML summary dumping (which currently only dumps the type test / devirt
> summary, and only under certain options and to stdout), to dump the rest of
> the summary fields and into the llvm-dis output, instead of having an
> additional dumper.
>
> As a first step, to avoid the issue of what to do when reading back in via
> llvm-as, I suggested dumping as LLVM assembly comments (to make it clear it
> is not parsed back in).
>
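
To make the comment-based first step concrete, here is a minimal Python
sketch (not LLVM code) that renders summary records as ';' comment lines,
following the layout of the sample dump in Charles's original message
quoted at the end of this mail. SummaryEntry and render_summary are
hypothetical names for this example, and joining multiple callees with
", " is an assumption, since the sample only shows a single callee.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SummaryEntry:
    """Hypothetical stand-in for one global value's summary record."""
    name: str
    linkage: str                     # e.g. "External", "Weak any"
    kind: str                        # "function", "variable", or "alias"
    inst_count: int = 0              # meaningful for functions only
    calls: List[str] = field(default_factory=list)
    refs: List[str] = field(default_factory=list)
    aliasee: Optional[str] = None    # meaningful for aliases only


def render_summary(entries):
    """Render entries as ';' comment lines, so an assembler would skip them."""
    lines = ["; Module summary:"]
    for e in entries:
        lines.append(f";  {e.name} ({e.linkage} linkage)")
        if e.kind == "function":
            lines.append(f";    Function ({e.inst_count} instructions)")
        elif e.kind == "variable":
            lines.append(";    Global Variable")
        elif e.kind == "alias":
            lines.append(f";    Alias (aliasee {e.aliasee})")
        if e.calls:
            # Assumption: several callees are comma-separated on one line.
            lines.append(";    Calls: " + ", ".join(e.calls))
        if e.refs:
            lines.append(";    Refs:")
            lines.extend(f";      {r}" for r in e.refs)
    return "\n".join(lines)
```

Because every emitted line starts with ';', output like this cannot be
mistaken for something the assembler parses back in, which is the point of
the suggestion above; the cost is that the format is not round-trippable.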

It seems good to avoid iterating through multiple formats & corresponding
test churn if we can help it with some up-front design discussion.

>> My naive thoughts, not being intimately familiar with any of this: Usually
>> bitcode and textual IR support go in together or around the same time, and
>> designed that way from the start (take r211920 for example, which added an
>> explicit representation of COMDATs to the IR). This seems to have been an
>> oversight in the implementation of IR summaries (is that an accurate
>> representation/statement?) & now there's an effort to correct that.
>>
>
> Yes
>
>
>>
>> So it seems like that would start with a discussion of what the right
>> end-state would be: What the syntax in textual IR should be, then
>> implementing it. I can understand implementing such a thing in steps - it's
>> perhaps more involved than the COMDAT situation. In that case starting on
>> either side seems fine - implementing the emission first (hidden behind a
>> flag, so as not to break round-tripping in the interim) or the parsing
>> first (no need to hide it behind any flags - manually written examples can
>> be used as input tests).
>>
>
> Yep, and Charles's original email here was meant to discuss the desired
> textual IR dumping format.
>

Cool cool :)

>> (& it sounds like there's some partially implemented functionality using a
>> YAML format that was intended to address how some test cases could be
>> written? & this might be a good basis for the syntax - but seems to me like
>> it might be a bit disjointed/out of place in the textual IR format that's
>> not otherwise YAML-based?)
>>
>
> Right, that was Peter and Mehdi's suggestion. It will look a bit different
> than the rest of the textual IR, but maybe that doesn't matter since this
> isn't IR.
>

Not sure I'm following quite what you mean by "this isn't IR" - it seems to
me like it is IR. It's in-memory, in bitcode, but not in the textual format
yet.

 - Dave

>
> Teresa
>
>
>>
>> - Dave
>>
>> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hey all,
>>>
>>> Below is the proposed format for the dump of the ThinLTO module summary
>>> in the llvm-dis utility:
>>>
>>> > ../build/bin/llvm-dis t.o && cat t.o.ll
>>> ; ModuleID = '2.o'
>>> source_filename = "2.ll"
>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>> target triple = "x86_64-unknown-linux-gnu"
>>>
>>> @X = constant i32 42, section "foo", align 4
>>>
>>> @a = weak alias i32, i32* @X
>>>
>>> define void @afun() {
>>>   %1 = load i32, i32* @a
>>>   ret void
>>> }
>>>
>>> define void @testtest() {
>>>   tail call void @boop()
>>>   ret void
>>> }
>>>
>>> declare void @boop()
>>>
>>> ; Module summary:
>>> ;  testtest (External linkage)
>>> ;    Function (2 instructions)
>>> ;    Calls: boop
>>> ;  X (External linkage)
>>> ;    Global Variable
>>> ;  afun (External linkage)
>>> ;    Function (2 instructions)
>>> ;    Refs:
>>> ;      a
>>> ;  a (Weak any linkage)
>>> ;    Alias (aliasee X)
>>>
>>> I've implemented the above format in the llvm-dis utility, since there
>>> currently isn't really a way of getting ThinLTO summaries in a
>>> human-readable format.
>>>
>>> Let me know what you think of this format, and what information you
>>> think should be added/removed.
>>>
>>> Thanks,
>>> Charles
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>
>