[PATCH] gold, libLTO: Add new flags to support bit set lowering.
Duncan P. N. Exon Smith
dexonsmith at apple.com
Wed Mar 18 14:26:54 PDT 2015
> On 2015-Mar-18, at 13:17, Peter Collingbourne <peter at pcc.me.uk> wrote:
>
> On Wed, Mar 18, 2015 at 12:46:48PM -0700, Duncan P. N. Exon Smith wrote:
>>
>>> On 2015-Mar-18, at 12:08, Peter Collingbourne <peter at pcc.me.uk> wrote:
>>>
>>> On Wed, Mar 18, 2015 at 09:16:54AM -0400, Rafael Espíndola wrote:
>>>>> I don't think this should be a problem; there is no user-visible behaviour
>>>>> change when using the supported way of enabling control flow integrity.
>>>>>
>>>>> The only supported way to enable CFI is for the user to supply -fsanitize=cfi*
>>>>> at link time as well as at compile time. (Admittedly this should be
>>>>> documented better.) The LowerBitSets pass only has an effect on the module
>>>>> if -fsanitize=cfi* was passed at compile time. And the related change
>>>>> http://reviews.llvm.org/D8402 modifies the Clang driver to supply the
>>>>> lowerbitsets flag if -fsanitize=cfi* is supplied at link time.
>>>>
>>>> If the pass does nothing if a given flag is not present, why not
>>>> always run the pass and avoid the option?
>>>
>>> I would like a way to run only a couple of passes at link time if CFI is
>>> enabled and optimizations are enabled but LTO is not specifically enabled. In
>>> particular, I've found that running only simplifycfg and globaldce makes a
>>> significant difference in binary size (with the full LTO pipeline, a Chrome
>>> binary is 179424536 bytes, and with simplifycfg+globaldce it is 160340400
>>> bytes). So we need some way to communicate an optimization level to the
>>> linker so that it knows what level to use.
>>>
>>> Duncan and I discussed this on IRC last night, and we agreed that module
>>> flags inserted at compile time would be the best way to communicate this
>>> information. I thought further about how this could work and I came up with
>>> something relatively simple.
>>>
>>> The solution I have in mind is that a module flag named "LTO Opt Level"
>>> will control the opt level. An opt level of 0 runs no optimization passes, a
>>> level of 1 runs only simplifycfg and globaldce and 2 runs the entire LTO pass
>>> pipeline. -flto causes us to set the flag to 2, otherwise -O >= 1 causes us
>>> to set it to 1, otherwise it is 0. When modules have conflicting opt levels,
>>> we pick the maximum.
>>>
>>> I have patches that implement this and I'll upload them later today.
>>
>> I'm still not sure about the high-level approach. I didn't put together
>> that you'd need to send an "LTO Opt Level" through module flags (I just
>> understood the -lowerbitsets part).
>>
>> I think there may be a minefield of semantic issues with using the LTO
>> pipeline when the user hasn't specifically enabled LTO. Can you point
>> me at the discussion on the list about this so I can catch up? (I'm
>> sorry I missed it.)
>
> What discussion there has been was in the proposal thread:
> https://groups.google.com/d/msg/llvm-dev/YzJSG2VGydI/76zJXpv4OuQJ

Thanks for this. Sorry for chasing you in circles.

>
>> For example, what semantics do you expect in the following case?
>>
>> $ clang -O3 -flto a.c
>> $ clang -O1 -fsanitize=cfi b.c
>> $ clang a.o b.o
>>
>> Here, a.c is supposed to be compiled with LTO but b.c isn't. I'm not
>> sure how you would merge the LTO Opt Level in this case (among other
>> problems).
>
> In this case we compile both with LTO and use the maximum opt level, which
> is 2 (2 from a.o and 1 from b.o). This is an approximation of what ought to
> happen, but I reckon it isn't too bad to handle these types of cases poorly,
> since in most cases one would be controlling the build flags for an entire
> project compiled with CFI, so they would normally be consistent.
>
>> IMO, the following flow makes more sense (but maybe this was already
>> discussed?):
>>
>> - If -fno-lto, run -lowerbitsets near the end of the -cc1 opt
>> pipeline. *Do not* enable -flto.
>
> This won't work (unless perhaps if the whole program was in that translation
> unit). The lowerbitsets pass needs whole-program visibility.

Yes, I understand now, after reading the original thread.

IMO, we should continue to always run the -lowerbitsets pass. It's free
if there's no work to do. There's no reason to encode this in the
module or pass any info from `clang` related to this pass.

But I guess your real goal here is to stop running the rest of the
optimizations, and that's the contentious part. Currently clang doesn't
control the LTO pass pipeline, and how/whether to do that isn't clear.
Starting to do that, whether it's via -disable-opt, "LTO Opt Level", or
some other mechanism, isn't really related to lowering bitsets, and
probably needs its own discussion.
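
For reference, the "LTO Opt Level" proposal quoted above can be sketched
roughly as follows. This is only an illustrative model in Python, not LLVM
or Clang code: the function names and the PIPELINES table are hypothetical,
and only the level rules (0/1/2, maximum on conflict) come from the quoted
description.

```python
# Hypothetical model of the proposed "LTO Opt Level" module flag.
# None of these names are LLVM or Clang APIs; they are assumptions
# made for illustration only.

def compile_time_level(flto: bool, opt: int) -> int:
    """Level a compile step would record in the module flag."""
    if flto:
        return 2  # -flto: run the entire LTO pass pipeline
    if opt >= 1:
        return 1  # -O >= 1 without -flto: simplifycfg and globaldce only
    return 0      # otherwise: no link-time optimization passes


def link_time_level(module_levels) -> int:
    """Modules with conflicting levels are merged by taking the maximum."""
    return max(module_levels, default=0)


# Passes implied by each level, as described in the proposal.
PIPELINES = {
    0: [],
    1: ["simplifycfg", "globaldce"],
    2: ["<entire LTO pipeline>"],
}

a = compile_time_level(flto=True, opt=3)   # clang -O3 -flto a.c
b = compile_time_level(flto=False, opt=1)  # clang -O1 -fsanitize=cfi b.c
level = link_time_level([a, b])            # clang a.o b.o
print(level, PIPELINES[level])
```

In the mixed a.o/b.o case from earlier in the thread, this model selects
level 2 for the whole link, which is the approximation Peter describes.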