[llvm-dev] [RFC] Add IR level interprocedural outliner for code size.

Tue Jul 25 22:43:21 PDT 2017

Hey Mehdi,
 I already have PGO support implemented. If we favor speed (Os), it only
considers "cold" blocks. If we favor size (Oz), it considers any block that
isn't "hot".
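
For illustration, the rule above amounts to roughly the following. This is a
minimal sketch, not the code in the patch: SizeLevel, BlockTemperature, and
isOutliningCandidate are hypothetical names, and the real pass would
presumably query profile information rather than take a precomputed enum.

```cpp
// Hypothetical names throughout; this is not code from the patch.
enum class SizeLevel { Os, Oz };
enum class BlockTemperature { Cold, Neutral, Hot };

// Returns true if a block with the given profile temperature may be
// considered for outlining at the given size-optimization level.
//   Os: only blocks known to be cold.
//   Oz: any block that is not hot (cold or neutral).
constexpr bool isOutliningCandidate(SizeLevel Level, BlockTemperature Temp) {
  return Level == SizeLevel::Os ? Temp == BlockTemperature::Cold
                                : Temp != BlockTemperature::Hot;
}

// A block with no strong profile signal is skipped at Os but considered at Oz.
static_assert(!isOutliningCandidate(SizeLevel::Os, BlockTemperature::Neutral),
              "Os skips neutral blocks");
static_assert(isOutliningCandidate(SizeLevel::Oz, BlockTemperature::Neutral),
              "Oz considers neutral blocks");
```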

I've also noticed improved compile times on quite a few of the tests
when running with early outlining enabled.

Thanks,
 River Riddle

On Tue, Jul 25, 2017 at 10:36 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:

>
>
> 2017-07-24 16:14 GMT-07:00 Quentin Colombet via llvm-dev <
> llvm-dev at lists.llvm.org>:
>
>> Hi River,
>>
>> On Jul 24, 2017, at 2:36 PM, River Riddle <riddleriver at gmail.com> wrote:
>>
>> Hi Quentin,
>>  I appreciate the feedback. When I reference the cost of target hooks,
>> I mainly mean maintainability and the burden on target authors. We want to
>> keep the intrusion into target information minimized. The heuristics used
>> for the outliner are the same as those used by any other IR-level pass
>> seeking target information, i.e., TTI for the most part. I can see where you
>> are coming from with "having heuristics solely focused on code size do not
>> seem realistic", but I don't agree with that statement.
>>
>>
>> If you only want code size I agree it makes sense, but I believe that, even
>> in Oz, we probably don’t want to slow the code by a big factor for a couple
>> of bytes. What I wanted to point out is that you need some kind of
>> performance model to avoid those worst cases. Unless we don’t care :).
>>
>
> That's why we have thresholds, though, don't we?
> Also, the IR makes it easy to connect to PGO, which lets us focus the
> outlining on "cold" regions and preserve good performance.
> River: did you consider this already? Good integration with PGO
> could make this part of the default optimization pipeline (i.e., having a
> mode where we outline only code known to be "cold").
>
>>
>> I think there is a disconnect on heuristics. The only user-tunable
>> parameters are the lower-bound parameters (to the cost model); the actual
>> analysis (heuristic calculation) is based upon TTI information.
>>
>>
>> I don’t see how you can get around adding more hooks to know how a
>> specific function prototype is going to be lowered (e.g., i64 needs to be
>> split into two registers, fourth and onward parameters need to be pushed on
>> the stack and so on). Those change the code size benefit.
>>
>
> How does the inliner handle this? How are we handling Oz there?
> If we are fine living with an approximation for the inliner, why wouldn't
> the same work for an outliner?
>
>
>
>>
>> There are several comparison benchmarks given in the "More detailed
>> performance data" section of the original RFC. It includes comparisons to
>> the Machine Outliner when possible (I can't build clang on Linux with the
>> Machine Outliner). I welcome any and all discussion on the placement of the
>> outliner in LLVM.
>>
>>
>> My fear with a new framework is that we are going to split the effort for
>> pushing the outliner technology forward and I’d like to avoid that if at
>> all possible.
>>
>
> It isn't clear to me that implementing it at the MachineIR level was the
> right trade-off in the first place.
> I'm not sure a full comparative study was performed and discussed upstream
> at the time the MachineIR outliner was implemented. If not, it wouldn't
> be fair to ask this of River now.
>
> --
> Mehdi