[llvm-dev] [RFC][ThinLTO] llvm-dis ThinLTO summary dump format

Teresa Johnson via llvm-dev llvm-dev at lists.llvm.org
Wed Jul 19 08:58:34 PDT 2017


On Wed, Jul 19, 2017 at 8:52 AM, David Blaikie <dblaikie at gmail.com> wrote:

>
>
> On Wed, Jul 19, 2017 at 8:43 AM Teresa Johnson <tejohnson at google.com>
> wrote:
>
>> On Wed, Jul 19, 2017 at 8:31 AM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Mon, Jul 17, 2017 at 5:18 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>>>
>>>> 2017-07-17 16:49 GMT-07:00 David Blaikie via llvm-dev <
>>>> llvm-dev at lists.llvm.org>:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 17, 2017 at 6:11 AM Charles Saternos via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> Hey @chandlerc and @dblaikie,
>>>>>>
>>>>>> Any updates on this in relation to "[PATCH] D34080: [ThinLTO] Add
>>>>>> dump-summary command to llvm-lto2 tool"?
>>>>>>
>>>>>
>>>>> Sorry you've kind of got stuck in the middle of this - but I'm still
>>>>> hoping to hear/understand the pushback on implementing this as a first
>>>>> class .ll construct with serialization and deserialization support.
>>>>>
>>>>> I think Peter mentioned he didn't think this was the right path
>>>>> forward in the long term? If that's the case, I'd like to understand
>>>>> that/reach that conclusion for the project now rather than treating this as
>>>>> a stop-gap with some idea that in the future someone might implement full
>>>>> serialization support (when it's been over a year already, and other stop
>>>>> gaps have been implemented (the yaml input support) already).
>>>>>
>>>>
>>>> I'm totally believing we need first class serialization support in .ll,
>>>> and I have a path forward for this (just not a lot of time to dedicate to
>>>> this).
>>>>
>>>
>>> What's the rough expectation of time/complexity for this path forward?
>>>
>>>
>>>> & if a .ll construct with serialization/deserialization is the path
>>>>> forward, understanding the motivation for a something other than going
>>>>> straight for that would be helpful -usually bitcode features come with .ll
>>>>> support from day 1, not a year later. I'm not clear on what would make this
>>>>> feature an exception/more expensive to do this for (& why it would be worth
>>>>> deferring that work, and what/when that work will be motivated/done)
>>>>>
>>>>
>>>>
>>>> We need a debugging tool for summaries ASAP, and the YAML is *already*
>>>> implemented.
>>>>
>>>
>>> I'm not sure I understand why the tradeoff is worthwhile - in terms of
>>> needing to add a new feature (even if it's already implemented) and tests,
>>> then porting those tests to a first-class .ll construct later. Usually
>>> adding .ll formats doesn't seem to be terribly expensive/time intensive.
>>>
>>
>> The main complication I see is defining the behavior when the serialized
>> summary is read back in.
>> 1) Do we trust that it is correct and consistent with the IR and blindly
>> use it? That could cause some issues if someone changes the IR in a .ll
>> file for testing and doesn't realize they need to also update the summary
>> correspondingly.
>>
>
> Generally textual IR is assumed to be bogus and is validated for
> invariants like this.
>

So why have support to read it in from the .ll at all?


>
>
>> 2) Do we always want to build the summary from the IR and check it
>> against the summary read from the .ll file? In that case, what is even the
>> use of building a summary from the serialized form?
>>
>
> Isn't that why the YAML support was added in the first place - for reading
> in summaries for test cases?
>

Peter added it for testing early ThinLTO CFI support since it predated the
summary building support for these features.


>
>
>> 3) If we want to allow tweaks to the summary in the .ll that override
>> what is in the IR, for testing purposes, how does any checking we do in
>>
>
> I'm not sure why we would - what would be tested that way? If the
> invariants of the summary are violated presumably the behavior of LLVM is
> undefined & so there's nothing to test/no expected/defined/usable behavior
> there.
>

So again, why bother having support to serialize it back in at all?


>
>
>> 2) distinguish between the case in 1) (user error) and 3) (intended
>> difference)?
>>
>> This is why I suggested emitting as comments in the .ll file initially
>> (which is useful for debugging purposes, although the YAML works fine for
>> that too), while the above are hashed out.
>>
>
>>
>>>
>>>> Making it available through the llvm-lto tool is a no-brainer to me.
>>>>
>>>> This was *not* an oversight but a deliberate choice to not do this in
>>>> the first place. Because summaries are the first bitcode feature I know of
>>>> that isn't attached in any way to a Module (you can't get to it from a
>>>> Module).
>>>>
>>>
>>> Not sure I quite follow why that difference made the choice/tradeoff
>>> here different (which admittedly is a bit easier to see in retrospect maybe
>>> - now that there's been a need to build serialization and deserialization).
>>> Do you mean it wasn't clear that serialization support was needed when
>>> summaries were first implemented, but it is clear now?
>>>
>>
>> Speaking for myself, it wasn't clear to me that serialization support was
>> needed for anything other than debugging/testing,
>>
>
> The textual IR in its entirety basically exists only for debugging/testing
> - but it's a big 'only'. (as has become apparent in this case with
> experience, by the looks of it)
>

I think there is a big difference - for most of the IR, the information is
not redundant - you simply can't create the same compiler behavior when
reading a .ll. For the summary it is completely redundant as it is
constructed from the IR.

Teresa


> (not a criticism of you/your work - just that this should've been caught
> in code review, I think)
>
>
>> since it is redundant with and computed from the IR, and I wasn't sure
>> emitting into the .ll file was the right way for debugging. Which is why
>> the testing used llvm-bcanalyzer -dump.
>>
>> Teresa
>>
>>
>>
>>>
>>> - Dave
>>>
>>
>>
>>
>> --
>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>> 408-460-2413 <(408)%20460-2413>
>>
>


-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170719/74774b9c/attachment.html>


More information about the llvm-dev mailing list