[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

Mon Jul 17 06:11:18 PDT 2017

Hey @chandlerc and @dblaikie,

Any updates on this in relation to "[PATCH] D34080: [ThinLTO] Add
dump-summary command to llvm-lto2 tool"?

Thanks,
Charles

On Fri, Jun 9, 2017 at 3:58 PM, Charles Saternos <charles.saternos at gmail.com
> wrote:

> OK, I tested the hotness, and it works.
>
> > I'd ask Charles to take it over. I think it just needs a test case and
> an update to the usage message.
>
> Sure - I've added the message and a quick test. The patch is here:
> https://reviews.llvm.org/D34063
>
> Thanks,
> Charles
>
> On Thu, Jun 8, 2017 at 8:01 PM, Peter Collingbourne <peter at pcc.me.uk>
> wrote:
>
>> I'd ask Charles to take it over. I think it just needs a test case and an
>> update to the usage message.
>>
>> Peter
>>
>> On Thu, Jun 8, 2017 at 4:55 PM, Teresa Johnson via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Great! For the hotness, try creating a small test case with a very hot
>>> loop that iterates many times. Let me know if you are still having trouble.
>>> While the llvm-dis serialization is being discussed, I suppose at the very
>>> least this can go in with the rest of the existing YAML summary dumping and
>>> get emitted from llvm-lto2 using the patch Peter attached. Peter - do you
>>> want to add that to llvm-lto2 so that we have a way of dumping the existing
>>> YAML supported summary info to stdout, or would you rather have Charles
>>> take that one over and submit it (probably just needs a test case).
>>>
>>> Teresa
>>>
>>> On Thu, Jun 8, 2017 at 4:16 PM, Charles Saternos <
>>> charles.saternos at gmail.com> wrote:
>>>
>>>> Hey Teresa,
>>>>
>>>> I've updated the YAML to include the names and GUIDs for all
>>>> functions/global vars/aliases. I've also added the hotness info to the
>>>> output, but for some reason, none of my tests when running with FDO gave
>>>> anything besides Unknown. I'll be looking more into this tomorrow.
>>>>
>>>> Here's the current format:
>>>>
>>>> > ../build/bin/llvm-lto2 dump-summary b.o
>>>> ---
>>>> NamedGlobalValueMap:
>>>>   :
>>>>     - GUID:            3762489268811518743
>>>>       Kind:            GlobalVar
>>>>       Linkage:         PrivateLinkage
>>>>       NotEligibleToImport: true
>>>>       Live:            false
>>>>   cold:
>>>>     - GUID:            11668175513417606517
>>>>       Kind:            Function
>>>>       Linkage:         ExternalLinkage
>>>>       NotEligibleToImport: true
>>>>       Live:            false
>>>>       InstCount:       5
>>>>       Calls:
>>>>         - Name:            puts
>>>>           GUID:            8979701042202144121
>>>>           Hotness:         Unknown
>>>>   fib:
>>>>     - GUID:            8667248078361406812
>>>>       Kind:            Function
>>>>       Linkage:         ExternalLinkage
>>>>       NotEligibleToImport: true
>>>>       Live:            false
>>>>       InstCount:       26
>>>>       Calls:
>>>>         - Name:            fib
>>>>           GUID:            8667248078361406812
>>>>           Hotness:         Unknown
>>>>   hot:
>>>>     - GUID:            10177652421713147431
>>>>       Kind:            Function
>>>>       Linkage:         ExternalLinkage
>>>>       NotEligibleToImport: true
>>>>       Live:            false
>>>>       InstCount:       14
>>>>       Calls:
>>>>         - Name:            fib
>>>>           GUID:            8667248078361406812
>>>>           Hotness:         Unknown
>>>>         - Name:            printf
>>>>           GUID:            7383291119112528047
>>>>           Hotness:         Unknown
>>>>   llvm.used:
>>>>     - GUID:            15665353970260777610
>>>>       Kind:            GlobalVar
>>>>       Linkage:         AppendingLinkage
>>>>       NotEligibleToImport: true
>>>>       Live:            true
>>>> TypeIdMap:
>>>> WithGlobalValueDeadStripping: false
>>>> ...
>>>>
>>>> Thanks,
>>>> Charles
>>>>
>>>>
>>>> On Wed, Jun 7, 2017 at 12:38 PM, Teresa Johnson <tejohnson at google.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 7, 2017 at 8:58 AM, Charles Saternos <
>>>>> charles.saternos at gmail.com> wrote:
>>>>>
>>>>>> Alright, now it outputs YAML in the following format:
>>>>>>
>>>>>> ---
>>>>>> NamedGlobalValueMap:
>>>>>>   X:
>>>>>>     - Kind:            GlobalVar
>>>>>>       Linkage:         ExternalLinkage
>>>>>>       NotEligibleToImport: false
>>>>>>       Live:            false
>>>>>>   a:
>>>>>>     - Kind:            Alias
>>>>>>       Linkage:         WeakAnyLinkage
>>>>>>       NotEligibleToImport: false
>>>>>>       Live:            false
>>>>>>       AliaseeGUID:     1881667236089500162
>>>>>>   afun:
>>>>>>     - Kind:            Function
>>>>>>       Linkage:         ExternalLinkage
>>>>>>       NotEligibleToImport: false
>>>>>>       Live:            false
>>>>>>       InstCount:       2
>>>>>>   testtest:
>>>>>>     - Kind:            Function
>>>>>>       Linkage:         ExternalLinkage
>>>>>>       NotEligibleToImport: false
>>>>>>       Live:            false
>>>>>>       InstCount:       2
>>>>>>       Calls:
>>>>>>         - Function:        14471680721094503013
>>>>>> TypeIdMap:
>>>>>> WithGlobalValueDeadStripping: false
>>>>>> ...
>>>>>>
>>>>>> Any thoughts on the new format?
>>>>>>
>>>>>
>>>>> Thanks, Charles. The main improvement I think we would want is to
>>>>> output value names instead of the GUID. Can you build up a map from GUID ->
>>>>> name ahead of time and use those like you were for your initial patch?
>>>>> Actually, I also think it would be useful to emit both the GUID and the
>>>>> name, since the combined index will eventually only have the GUID, so this
>>>>> would give a mapping to use for at least the visual inspection of the
>>>>> combined index.
>>>>>
>>>>> Also, would be good to see an example with FDO, to make sure the
>>>>> hotness info of the calls is emitted.
>>>>>
>>>>> Teresa
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Charles
>>>>>>
>>>>>> On Tue, Jun 6, 2017 at 5:21 PM, Mehdi AMINI <joker.eph at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2017-06-06 13:38 GMT-07:00 David Blaikie <dblaikie at gmail.com>:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 6, 2017 at 1:26 PM Mehdi AMINI <joker.eph at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> 2017-06-05 14:27 GMT-07:00 David Blaikie via llvm-dev <
>>>>>>>>> llvm-dev at lists.llvm.org>:
>>>>>>>>>
>>>>>>>>>> I know there's been a bunch of discussion here already, but I was
>>>>>>>>>> wondering if perhaps someone (probably Teresa? Peter?) could:
>>>>>>>>>>
>>>>>>>>>> 1) summarize the current state
>>>>>>>>>> 2) describe the end-goal
>>>>>>>>>> 3) describe what steps (& how this patch relates) are planned to
>>>>>>>>>> get to (2)
>>>>>>>>>>
>>>>>>>>>> My naive thoughts, not being intimately familiar with any of
>>>>>>>>>> this: Usually bitcode and textual IR support go in together or around the
>>>>>>>>>> same time, and designed that way from the start (take r211920 for examaple,
>>>>>>>>>> which added an explicit representation of COMDATs to the IR). This seems to
>>>>>>>>>> have been an oversight in the implementation of IR summaries (is that an
>>>>>>>>>> accurate representation/statement?)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> More or less: it was not an oversight.
>>>>>>>>> The summaries are not really part of the IR, it is more like an
>>>>>>>>> "analysis result" that is serialized. It can always be recomputed from the
>>>>>>>>> IR. This aspect makes it quite "special", it is the only analysis result
>>>>>>>>> that I know of that we serialize.
>>>>>>>>>
>>>>>>>>
>>>>>>>> The use list work seems pretty similar in some ways (granted, can't
>>>>>>>> be recomputed to match, hence the desire to serialize it for test case
>>>>>>>> implementation).
>>>>>>>>
>>>>>>>
>>>>>>> I see use-list as a leaky implementation detail of the IR that we
>>>>>>> serialized because it impact the processing of the IR.
>>>>>>>
>>>>>>> Summaries are more like serializing the CFG for example.
>>>>>>>
>>>>>>>
>>>>>>>> But it looks like the same is true here to a degree - there are
>>>>>>>> test cases that exercise the summary handling, so they want summaries for
>>>>>>>> input (for now, I think, I've seen test cases that run another LLVM tool to
>>>>>>>> insert/create a summary to then feed that back in for a test), or to test
>>>>>>>> that the resulting summary is correct.
>>>>>>>>
>>>>>>>
>>>>>>> We have cases were we want summaries as an input and check a
>>>>>>> combined summary as an output, and for these having the YAML representation
>>>>>>> will be useful (we didn't have it before).
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Can summaries be standalone? I thought they could (that'd be ideal
>>>>>>>> for the distributed situation - only the summary needs to go to the 'thin
>>>>>>>> link' step, I think? (currently maybe only the debug info is stripped for
>>>>>>>> that - but ideally other unused IR wouldn't be shipped there as well, I
>>>>>>>> would think)
>>>>>>>>
>>>>>>>
>>>>>>> Yes conceptually they can be standalone.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> & now there's an effort to correct that.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The main motivation here, I believe, is more to help dev to have
>>>>>>>>> human readable/understandable dump for ThinLTO bitcodes. Having to inspect
>>>>>>>>> separately summaries is a pain.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Not sure I quite follow - inspect separately?
>>>>>>>>
>>>>>>>
>>>>>>> llvm-dis does not display summaries today, so you can't just use
>>>>>>> llvm-dis like a "regular" flow.
>>>>>>>
>>>>>>>
>>>>>>>> How are they inspected today?
>>>>>>>>
>>>>>>>
>>>>>>> llvm-bcanalyzer? And now the YAML dump as well.
>>>>>>>
>>>>>>>
>>>>>>>> & also, I think there are test cases that want to/are currently
>>>>>>>> testing summary input but do so somewhat awkwardly by using another tool to
>>>>>>>> produce the summary first. Ideally the test case would have the summary
>>>>>>>> written in to start, I would think, if that's a codepath worth testing?
>>>>>>>>
>>>>>>>
>>>>>>> The IR already contains all the information, so why repeating it?
>>>>>>> This makes the test case harder to maintain, in the vast majority, I expect
>>>>>>> that if a test needs IR then it shouldn't need to include a summary as well
>>>>>>> (and vice-versa).
>>>>>>>
>>>>>>> In the majority of test we have we want to check if the importing
>>>>>>> does what it is supposed to do, and if the linkage are correctly adjusted.
>>>>>>> With a YAML (or other) serialization for the summaries this could indeed
>>>>>>> been done purely with summaries, without any IR involved.
>>>>>>>
>>>>>>> --
>>>>>>> Mehdi
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> - Dave
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>  --
>>>>>>>>> Mehdi
>>>>>>>>>
>>>>>>>>> So it seems like that would start with a discussion of what the
>>>>>>>>>> right end-state would be: What the syntax in textual IR should be, then
>>>>>>>>>> implementing it. I can understand implementing such a thing in steps - it's
>>>>>>>>>> perhaps more involved than the COMDAT situation. In that case starting on
>>>>>>>>>> either side seems fine - implementing the emission first (hidden behind a
>>>>>>>>>> flag, so as not to break round-tripping in the interim) or the parsing
>>>>>>>>>> first (no need to hide it behind any flags - manually written examples can
>>>>>>>>>> be used as input tests).
>>>>>>>>>>
>>>>>>>>>> (& it sounds like there's some partially implemented
>>>>>>>>>> functionality using a YAML format that was intended to address how some
>>>>>>>>>> test cases could be written? & this might be a good basis for the syntax -
>>>>>>>>>> but seems to me like it might be a bit disjointed/out of place in the
>>>>>>>>>> textual IR format that's not otherwise YAML-based?)
>>>>>>>>>>
>>>>>>>>>> - Dave
>>>>>>>>>>
>>>>>>>>>> On Fri, Jun 2, 2017 at 8:46 AM Charles Saternos via llvm-dev <
>>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey all,
>>>>>>>>>>>
>>>>>>>>>>> Below is the proposed format for the dump of the ThinLTO module
>>>>>>>>>>> summary in the llvm-dis utility:
>>>>>>>>>>>
>>>>>>>>>>> > ../build/bin/llvm-dis t.o && cat t.o.ll
>>>>>>>>>>> ; ModuleID = '2.o'
>>>>>>>>>>> source_filename = "2.ll"
>>>>>>>>>>> target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
>>>>>>>>>>> target triple = "x86_64-unknown-linux-gnu"
>>>>>>>>>>>
>>>>>>>>>>> @X = constant i32 42, section "foo", align 4
>>>>>>>>>>>
>>>>>>>>>>> @a = weak alias i32, i32* @X
>>>>>>>>>>>
>>>>>>>>>>> define void @afun() {
>>>>>>>>>>>   %1 = load i32, i32* @a
>>>>>>>>>>>   ret void
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> define void @testtest() {
>>>>>>>>>>>   tail call void @boop()
>>>>>>>>>>>   ret void
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> declare void @boop()
>>>>>>>>>>>
>>>>>>>>>>> ; Module summary:
>>>>>>>>>>> ;  testtest (External linkage)
>>>>>>>>>>> ;    Function (2 instructions)
>>>>>>>>>>> ;    Calls: boop
>>>>>>>>>>> ;  X (External linkage)
>>>>>>>>>>> ;    Global Variable
>>>>>>>>>>> ;  afun (External linkage)
>>>>>>>>>>> ;    Function (2 instructions)
>>>>>>>>>>> ;    Refs:
>>>>>>>>>>> ;      a
>>>>>>>>>>> ;  a (Weak any linkage)
>>>>>>>>>>> ;    Alias (aliasee X)
>>>>>>>>>>>
>>>>>>>>>>> I've implemented the above format in the llvm-dis utility, since
>>>>>>>>>>> there currently isn't really a way of getting ThinLTO summaries in a
>>>>>>>>>>> human-readable format.
>>>>>>>>>>>
>>>>>>>>>>> Let me know what you think of this format, and what information
>>>>>>>>>>> you think should be added/removed.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Charles
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>>>> 408-460-2413 <(408)%20460-2413>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>> 408-460-2413 <(408)%20460-2413>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
>>
>> --
>> --
>> Peter
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170717/5de3e17e/attachment-0001.html>