[PATCH] Remove "localize global" optimization

Nick Lewycky nicholas at mxc.ca
Fri Oct 4 00:15:11 PDT 2013


Evan Cheng wrote:
>
> On Oct 3, 2013, at 3:10 PM, Nick Lewycky <nlewycky at google.com
> <mailto:nlewycky at google.com>> wrote:
>
>> On 3 October 2013 15:00, Evan Cheng <evan.cheng at apple.com
>> <mailto:evan.cheng at apple.com>> wrote:
>>
>>
>>     On Oct 2, 2013, at 11:20 AM, Alexey Samsonov <samsonov at google.com
>>     <mailto:samsonov at google.com>> wrote:
>>
>>>
>>>     On Wed, Oct 2, 2013 at 9:52 PM, Bob Wilson <bob.wilson at apple.com
>>>     <mailto:bob.wilson at apple.com>> wrote:
>>>
>>>
>>>         On Oct 2, 2013, at 10:08 AM, Bob Wilson <bob.wilson at apple.com
>>>         <mailto:bob.wilson at apple.com>> wrote:
>>>
>>>>
>>>>         On Oct 2, 2013, at 8:33 AM, Alexey Samsonov
>>>>         <samsonov at google.com <mailto:samsonov at google.com>> wrote:
>>>>
>>>>>
>>>>>         On Tue, Oct 1, 2013 at 10:39 PM, Bob Wilson
>>>>>         <bob.wilson at apple.com <mailto:bob.wilson at apple.com>> wrote:
>>>>>
>>>>>
>>>>>             On Oct 1, 2013, at 10:20 AM, Nick Lewycky
>>>>>             <nlewycky at gmail.com <mailto:nlewycky at gmail.com>> wrote:
>>>>>
>>>>>             >
>>>>>             > The optimization was "important" for some SPEC test,
>>>>>             but I think we're past caring about that.
>>>>>
>>>>>             Not necessarily. Has anyone measured the impact on SPEC
>>>>>             and other benchmarks?
>>>>>
>>>>>
>>>>>         I'm going to submit this anyway.
>>>>
>>>>         That's not very friendly. You may not care about SPEC but
>>>>         that doesn't mean that no one else cares.
>>>>
>>>>>         See Reid's comment in
>>>>>         http://llvm-reviews.chandlerc.com/D1754 - looks like this
>>>>>         optimization has incorrect assumptions about C programs.
>>>>
>>>>         The only comment I see from Reid there is this: "I missed
>>>>         any earlier discussion about this. Don't we want to keep
>>>>         this optimization, even if it's a real corner case that
>>>>         fires once in a blue moon? isLeakCheckerRoot() doesn't check
>>>>         any flags to see if any of the sanitizers are on."
>>>
>>>         OK, I see the problem, and I can't think of a way to fix it.
>>>
>>>
>>>         But, if this hurts SPEC results, we may need to investigate
>>>         alternative ways to get the same results. Since you're the
>>>         one pushing this, it would have been nice if you contributed
>>>         to that effort by at least providing the performance data.
>>>         You could still do that now. Please do.
>>>
>>>
>>>     Yes, the comment didn't appear in Phabricator, only the review
>>>     thread:
>>>
>>>     "The C++ standard sez [basic.start.main]p6: "The function main
>>>     shall not be used (3.2) within a program." The C standard has no
>>>     such language, so this optimization isn't really valid for C.
>>>     IIRC this caused PNaCl some problems."
>>>
>>>     I will run the SPEC benchmarks and measure the effects of this
>>>     change. Sorry for not doing this as a first step.
>>
>>     It's more than just SPEC. We're seeing big regressions in
>>     Trimaran/enc-pc1, VersaBench/8b10b, among others.
>>
>>
>> For enc-pc1, you can see global variables "c1" and "count" are only
>> used in main(), while "ax", "bx", "cx", and "dx" are only used in
>> assemble(). I haven't looked at 8b10b.
>>
>>     And we need to be pragmatic about disabling optimizations. I feel
>>     like you are jumping to conclusion about this transformation being
>>     invalid.
>>
>>
>> The transformation *is* invalid, it is jumping to the conclusion that
>> it can move globals to allocas in main with no
>> evidence/proof/attempt-to-determine whether main is reentrant (or for
>> that matter, the entry point at all, though the latter is assumed all
>> over llvm).
>
> I agree the transformation needs to be fixed. But I don't agree with
> just ripping it out. It seems there hasn't been any attempt in finding a
> real solution.

Sorry!

I have thought about how to do it properly, but since I didn't think it 
would matter to any real user or any real code, the "proper" 
implementation would need to have been trivial in order to be worth 
implementing. If the optimization really is worth preserving, then 
obviously that isn't true.

What we need to optimize this case is an llvm attribute that represents 
"this function is only ever on the call stack once (in a given thread)", 
and let clang mark that on main, and let llvm deduce that where it can 
see the full set of static callees.

We can't just do that in functionattrs because globalopt runs first. 
(Mind you, we have many other missed optz'ns just because we don't do a 
second globalopt run at the end.)

Nick



More information about the llvm-commits mailing list