[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 7 09:44:43 PDT 2017


On Tue, Jun 6, 2017 at 2:21 PM Mehdi AMINI <joker.eph at gmail.com> wrote:

> 2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie at gmail.com>:
>
>>
>>
>> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>>
>>> 2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev <
>>> llvm-dev at lists.llvm.org>:
>>>
>>>> I know there's been a bunch of discussion here already, but I was
>>>> wondering if perhaps someone (probably Teresa? Peter?) could:
>>>>
>>>> 1) summarize the current state
>>>> 2) describe the end-goal
>>>> 3) describe what steps (& how this patch relates) are planned to get to
>>>> (2)
>>>>
>>>> My naive thoughts, not being intimately familiar with any of this:
>>>> Usually bitcode and textual IR support go in together or around the same
>>>> time, and designed that way from the start (take r211920 for example,
>>>> which added an explicit representation of COMDATs to the IR). This seems to
>>>> have been an oversight in the implementation of IR summaries (is that an
>>>> accurate representation/statement?)
>>>>
>>>
>>> More or less: it was not an oversight.
>>> The summaries are not really part of the IR, it is more like an
>>> "analysis result" that is serialized. It can always be recomputed from the
>>> IR. This aspect makes it quite "special", it is the only analysis result
>>> that I know of that we serialize.
>>>
>>
>> The use list work seems pretty similar in some ways (granted, can't be
>> recomputed to match, hence the desire to serialize it for test case
>> implementation).
>>
>
> I see use-list as a leaky implementation detail of the IR that we
> serialized because it impacts the processing of the IR.
>
> Summaries are more like serializing the CFG for example.
>
>
>> But it looks like the same is true here to a degree - there are test
>> cases that exercise the summary handling, so they want summaries for input
>> (for now, I think, I've seen test cases that run another LLVM tool to
>> insert/create a summary to then feed that back in for a test), or to test
>> that the resulting summary is correct.
>>
>
> We have cases where we want summaries as an input and check a combined
> summary as an output, and for these having the YAML representation will be
> useful (we didn't have it before).
>

What I'm suggesting is that this is an (optional) IR feature as much as any
other - so it seems slightly odd that it'd be YAML rather than something
that looked more like the rest of the IR. Though I'm not outright opposed
to YAML here - just want to make sure this information is being treated as
a first class IR construct (as much as use order, comdats, etc are for
rough examples)

>> Can summaries be standalone? I thought they could (that'd be ideal for the
>> distributed situation - only the summary needs to go to the 'thin link'
>> step, I think? (currently maybe only the debug info is stripped for that -
>> but ideally other unused IR wouldn't be shipped there as well, I would
>> think)
>>
>
> Yes conceptually they can be standalone.
>

This seems to provide the strongest/clearest motivation for having summaries
as a first class (though optional) IR construct.

>>>> & now there's an effort to correct that.
>>>>
>>>
>>> The main motivation here, I believe, is more to give devs a
>>> human-readable/understandable dump for ThinLTO bitcode. Having to inspect
>>> summaries separately is a pain.
>>>
>>
>> Not sure I quite follow - inspect separately?
>>
>
> llvm-dis does not display summaries today, so you can't just use llvm-dis
> like a "regular" flow.
>
>
>> How are they inspected today?
>>
>
> llvm-bcanalyzer? And now the YAML dump as well.
>
>
>> & also, I think there are test cases that want to/are currently testing
>> summary input but do so somewhat awkwardly by using another tool to produce
>> the summary first. Ideally the test case would have the summary written in
>> to start, I would think, if that's a codepath worth testing?
>>
>
> The IR already contains all the information, so why repeat it?
>

For the same reason that it's relevant to test cases which way it's
encoded, etc. (in the same way that the LLVM IR repeats types of uses, for
example - even though they're totally redundant from a "does this have all
the semantic information required" point of view) & because it can be
standalone.


> This makes the test case harder to maintain. In the vast majority of
> cases, I expect that if a test needs IR then it shouldn't need to include
> a summary as well (and vice-versa).
>

Ah, sorry, I'm not suggesting it should be required - in the same way it's
not required in the bitcode. But if you want a summary in the bitcode when
assembling a .ll file it seems OK to say you write it in the IR, and
equally if there is a summary in the bitcode it seems reasonable that it be
printed in the .ll file by llvm-dis.
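
As a rough illustration of printing a summary alongside the IR, here is a
minimal Python sketch that renders summary entries into the comment-style
dump Charles proposed (quoted below). The SummaryEntry class, its fields, and
render_summary are hypothetical names invented for this sketch; nothing here
is llvm-dis's actual implementation, and joining multiple callees with commas
is an assumption, since the proposal only shows a single callee.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SummaryEntry:
    """Hypothetical in-memory model of one module summary entry."""
    name: str
    linkage: str                      # e.g. "External", "Weak any"
    kind: str                         # "Function", "Global Variable", "Alias"
    inst_count: int = 0               # only meaningful for functions
    calls: List[str] = field(default_factory=list)
    refs: List[str] = field(default_factory=list)
    aliasee: Optional[str] = None     # only meaningful for aliases


def render_summary(entries: List[SummaryEntry]) -> str:
    """Render entries as IR comments in the shape of the proposed dump."""
    lines = ["; Module summary:"]
    for e in entries:
        lines.append(f";  {e.name} ({e.linkage} linkage)")
        if e.kind == "Function":
            lines.append(f";    Function ({e.inst_count} instructions)")
        elif e.kind == "Alias":
            lines.append(f";    Alias (aliasee {e.aliasee})")
        else:
            lines.append(f";    {e.kind}")
        if e.calls:
            # Assumption: several callees are comma-separated on one line.
            lines.append(";    Calls: " + ", ".join(e.calls))
        if e.refs:
            lines.append(";    Refs:")
            lines.extend(f";      {r}" for r in e.refs)
    return "\n".join(lines)


# The four entries from the example module in the proposal.
EXAMPLE = [
    SummaryEntry("testtest", "External", "Function", inst_count=2,
                 calls=["boop"]),
    SummaryEntry("X", "External", "Global Variable"),
    SummaryEntry("afun", "External", "Function", inst_count=2, refs=["a"]),
    SummaryEntry("a", "Weak any", "Alias", aliasee="X"),
]

if __name__ == "__main__":
    print(render_summary(EXAMPLE))
```

Because every line starts with ';', output like this is only a comment to the
assembler, so it would not round-trip back into bitcode - which is the
asymmetry I'm asking about.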


> In the majority of tests we have, we want to check if the importing does
> what it is supposed to do, and if the linkages are correctly adjusted. With
> a YAML (or other) serialization for the summaries this could indeed be
> done purely with summaries, without any IR involved.
>

I'm not sure I understand - you mean for executions of tools that don't
need the rest of the IR, there could be a different/separate tool that
consumes YAML summaries and produces YAML summaries and that would be
tested - but the "consuming a summary in a bitcode file" would not be?

I'm not sure I understand the benefit of this separation and asymmetry with
the bitcode form of the same data.

- Dave


>
> --
> Mehdi
>
>
>
>
>
>
>>
>> - Dave
>>
>>
>>>
>>>  --
>>> Mehdi
>>>
>>>> So it seems like that would start with a discussion of what the right
>>>> end-state would be: What the syntax in textual IR should be, then
>>>> implementing it. I can understand implementing such a thing in steps - it's
>>>> perhaps more involved than the COMDAT situation. In that case starting on
>>>> either side seems fine - implementing the emission first (hidden behind a
>>>> flag, so as not to break round-tripping in the interim) or the parsing
>>>> first (no need to hide it behind any flags - manually written examples can
>>>> be used as input tests).
>>>>
>>>> (& it sounds like there's some partially implemented functionality
>>>> using a YAML format that was intended to address how some test cases could
>>>> be written? & this might be a good basis for the syntax - but seems to me
>>>> like it might be a bit disjointed/out of place in the textual IR format
>>>> that's not otherwise YAML-based?)
>>>>
>>>> - Dave
>>>>
>>>> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hey all,
>>>>>
>>>>> Below is the proposed format for the dump of the ThinLTO module
>>>>> summary in the llvm-dis utility:
>>>>>
>>>>> > ../build/bin/llvm-dis t.o && cat t.o.ll
>>>>> ; ModuleID = '2.o'
>>>>> source_filename = "2.ll"
>>>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>>>> target triple = "x86_64-unknown-linux-gnu"
>>>>>
>>>>> @X = constant i32 42, section "foo", align 4
>>>>>
>>>>> @a = weak alias i32, i32* @X
>>>>>
>>>>> define void @afun() {
>>>>>   %1 = load i32, i32* @a
>>>>>   ret void
>>>>> }
>>>>>
>>>>> define void @testtest() {
>>>>>   tail call void @boop()
>>>>>   ret void
>>>>> }
>>>>>
>>>>> declare void @boop()
>>>>>
>>>>> ; Module summary:
>>>>> ;  testtest (External linkage)
>>>>> ;    Function (2 instructions)
>>>>> ;    Calls: boop
>>>>> ;  X (External linkage)
>>>>> ;    Global Variable
>>>>> ;  afun (External linkage)
>>>>> ;    Function (2 instructions)
>>>>> ;    Refs:
>>>>> ;      a
>>>>> ;  a (Weak any linkage)
>>>>> ;    Alias (aliasee X)
>>>>>
>>>>> I've implemented the above format in the llvm-dis utility, since there
>>>>> currently isn't really a way of getting ThinLTO summaries in a
>>>>> human-readable format.
>>>>>
>>>>> Let me know what you think of this format, and what information you
>>>>> think should be added/removed.
>>>>>
>>>>> Thanks,
>>>>> Charles
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>
>>>>