[llvm-dev] [RFC] Adding function attributes to represent codegen optimization level
Mehdi AMINI via llvm-dev
llvm-dev at lists.llvm.org
Fri Apr 6 20:53:12 PDT 2018
Hi,
On Fri, Apr 6, 2018 at 13:56, Peter Collingbourne <peter at pcc.me.uk> wrote:
> On Thu, Apr 5, 2018 at 8:44 AM, via llvm-dev <llvm-dev at lists.llvm.org>
> wrote:
>
>> On 2018-04-04 22:00, Mehdi AMINI wrote:
>>
>>> On Tue, Apr 3, 2018 at 12:47, via llvm-dev <llvm-dev at lists.llvm.org>
>>> wrote:
>>>
>>>> All,
>>>>
>>>> A recent commit, D43040/r324557, changed the behavior of the gold plugin
>>>> when compiling with LTO. The change now causes the codegen optimization
>>>> level to default to CodeGenOpt::Default (i.e., -O2) rather than use the
>>>> LTO optimization level. The argument was made that the LTO optimization
>>>> level should control the amount of cross-module optimizations done by
>>>> LTO, but it should not control the codegen optimization level; that
>>>> should be based off of the optimization level used during the initial
>>>> compilation phase (i.e., bitcode generation).
>>>>
>>>
>>> I actually don't understand this clearly.
>>>
>>> Unless we're saying that we would change the IR optimization level
>>> either using the -OX flag during LTO (which is clumsy, because what is
>>> a "cross-module optimization" alone?), why would the `-OX` flag change
>>> the Codegen optimization level when passed to clang without LTO, but
>>> it wouldn't during LTO?
>>>
>>
>> I'm simply stating the argument made by Peter in r324557; this is not my
>> opinion. Personally, I think it seems reasonable to allow the optimization
>> flag used during the link step to control the codegen optimization level.
>> However, this is no longer the case after r324557.
>>
>> FWIW, I would be very much on-board with reverting r324557 and then
>> changing lld to mirror the behavior of the gold plugin, but I don't know if
>> that's the consensus in the community.
>
>
> To answer your question Mehdi, what I mean by "cross-module optimization"
> is simply a series of passes that operates on a module after having linked
> parts of other modules into it, that would result in IPO between modules.
> For example, an inlining pass followed by scalar optimization passes.
>
> The way I think about LTO is that it effectively splits the pass pipeline
> in two, which lets us put cross-module optimizations in the middle.
>
> What this means semantically is that LTO opt level 0 would simply run
> the two parts of the pipeline one after the other, giving you essentially
> the same binary as a non-LTO build, while still allowing LTO-only features
> such as CFI to work. One might have also chosen to compile parts of one's
> program with different optimization levels, and those levels would need to
> be respected by the code generator. For this to work, we must at least use
> the same CG opt level that was used at compile time.
>
> Higher LTO opt levels would result in more passes being run in the middle,
> perhaps at more aggressive settings, which would result in more
> cross-module optimizations. But we still should at least try to approximate
> the optimization level requested for each particular function.
>
> Ideally, we would use the same optimization level that would have been
> used at compile time. Such an optimization level would be communicated via
> an attribute, as proposed here. However, in the absence of that
> information, it does seem reasonable to make a guess about the user intent
> from the LTO opt level. If a user specifies an LTO opt level of 3, it
> probably means that the user cares a lot about performance, so we can guess
> a CG opt level of CodeGenOpt::Aggressive. Otherwise, we can guess a CG opt
> level of CodeGenOpt::Default since this would seem to provide the best
> balance of performance, code size and debuggability.
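
A minimal sketch of the guess described above, assuming a stand-alone enum that mirrors the CodeGenOpt levels. The names CGOptLevel and guessCGOptLevel are hypothetical and are not taken from the LTO API or from any patch in this thread.

```cpp
// Hypothetical stand-in for LLVM's CodeGenOpt levels, so the sketch compiles
// without LLVM headers.
enum class CGOptLevel { None, Less, Default, Aggressive };

// Guess a codegen opt level from the LTO opt level when no per-function
// information is available: LTO level 3 maps to Aggressive, every other
// level maps to Default.
constexpr CGOptLevel guessCGOptLevel(unsigned LTOOptLevel) {
  return LTOOptLevel == 3 ? CGOptLevel::Aggressive : CGOptLevel::Default;
}

// Usage: checked at compile time.
static_assert(guessCGOptLevel(3) == CGOptLevel::Aggressive,
              "LTO level 3 should guess Aggressive");
static_assert(guessCGOptLevel(2) == CGOptLevel::Default,
              "other LTO levels should guess Default");
```

Note that under this guess an LTO opt level of 0 still yields Default codegen, which matches the "Otherwise" wording above.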
>
> So this is the direction that I would propose:
> - Remove the ability to override the CG opt level from the LTO API. For now,
> we can infer it from the LTO opt level as mentioned above.
> - Add function attributes for signaling compile-time opt level and start
> moving towards using them in preference to TargetMachine::OptLevel.
> - Remove code for inferring CG opt level from LTO opt level, as it is now
> redundant with the function attribute.
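
The intermediate state in the list above, where a per-function attribute is preferred and the LTO-derived guess is only a fallback, could be sketched roughly as follows. This is an illustration under assumptions: FnOptAttr, effectiveCGOptLevel and guessFromLTOOptLevel are hypothetical names, and how the attribute would actually be encoded in the IR is not specified here.

```cpp
// Hypothetical stand-in for LLVM's CodeGenOpt levels.
enum class CGOptLevel { None, Less, Default, Aggressive };

// Hypothetical model of a per-function opt-level attribute: Present is false
// for bitcode produced before such an attribute existed.
struct FnOptAttr {
  bool Present;
  CGOptLevel Level;
};

// Fallback guess from the LTO opt level (level 3 -> Aggressive, else Default).
constexpr CGOptLevel guessFromLTOOptLevel(unsigned LTOOptLevel) {
  return LTOOptLevel == 3 ? CGOptLevel::Aggressive : CGOptLevel::Default;
}

// Prefer the level recorded on the function; fall back to the guess only
// when the function carries no attribute.
constexpr CGOptLevel effectiveCGOptLevel(FnOptAttr Attr, unsigned LTOOptLevel) {
  return Attr.Present ? Attr.Level : guessFromLTOOptLevel(LTOOptLevel);
}

// Usage: a function compiled at a lower level keeps it even under LTO level 3.
static_assert(effectiveCGOptLevel(FnOptAttr{true, CGOptLevel::Less}, 3) ==
                  CGOptLevel::Less,
              "attribute wins over the LTO-derived guess");
```

Once every function carries the attribute, the fallback branch is never taken, which is why the last step in the list can delete it.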
>
> This would seem to get us to a desired state without regressing users who
> might depend on being able to use the aggressive CG opt level from LTO.
>
> Thoughts?
>
That all seems reasonable to me. That said, I haven't given much thought
recently to expressing the opt level through function attributes.
From what I remember, it was hard to work out the implementation when
inlining between functions at different levels (O3 -> O2 or vice versa), and
some parts of the pipeline just can't be split because they operate module-wise.
For instance, if O3 includes an extra `globalopt` pass that O2 does not
include, how do you handle this when some functions are marked as O2 and
others as O3?
The only thing I could reason about at the time was that O0 really means
no optimization, and it could be translated somehow to `optnone`.
Best,
--
Mehdi
>
> Peter
>
>>> Are we encoding O1/O2/O3 optimization level into function attributes
>>> and trying to honor these during the LTO IR optimization pipeline as
>>> well?
>>>
>>
>> No. The intent of these attributes is to control the codegen pipeline
>> only. Of course, this is all based on the assumption that the
>> optimization level used during bitcode generation should also be used with
>> LTO in the codegen pipeline.
>>
>> I don't have a strong opinion either way. I just want codegen to respect
>> the fact that I specified -O3 during both the bitcode generation and link
>> steps, but that's not the case anymore. :)
>>
>> Chad
>>
>>
>>
>>> Thanks,
>>>
>>> --
>>> Mehdi
>>>
>>>> Assuming the argument is reasonable (it makes sense to me), I was hoping
>>>> to solicit feedback on how to proceed. The suggestion in D43040/r324557
>>>> was to add function attributes to represent the compile-time
>>>> optimization level (which also seems reasonable to me).
>>>>
>>>> As a first step, I've put together two patches: 1) an llvm patch that
>>>> adds the function attributes to the LLVM IR and 2) a clang patch that
>>>> attaches these attributes to each function based on the codegen
>>>> optimization level. I then use the function-level attributes to
>>>> "reconstruct" the codegen optimization level used with LTO.
>>>>
>>>> Please understand this is very much a WIP and just a very small step
>>>> towards a final solution.
>>>>
>>>> Here are the patches for reference:
>>>> Clang: D45226
>>>> LLVM: D45225
>>>>
>>>> Regards,
>>>> Chad
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>
>
>
>
> --
> Peter
>