[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
Hans Wennborg via llvm-dev
llvm-dev at lists.llvm.org
Mon Sep 28 11:09:35 PDT 2015
On Fri, Sep 25, 2015 at 3:40 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>> From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org>
>> To: "Hans Wennborg" <hans at chromium.org>
>> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Dan Gohman" <dan.gohman at gmail.com>
>> Sent: Friday, September 25, 2015 3:26:01 PM
>> Subject: Re: [llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining
>> an unavailable optimization)
>>
>>
>> On 09/24/2015 06:04 PM, Hans Wennborg wrote:
>> > I tried your patch on a Clang build to see if it would fire. It
>> > reduced the size of a bootstrap with 8500 bytes. Not huge, but it
>> > seems like a nice improvement. And maybe it could be made more
>> > powerful: not just checking if the address is a param or alloca,
>> > but
>> > an address based on such values.
>> Yeah, I realized after posting that this should be integrated with
>> the
>> existing noalias handling in the pointer icmp routine. However,
>> let's
>> defer that until we decide if this is actually correct to implement.
>> :)
>
> Agreed (in your patch, you'd need to call stripPointerCasts() and GetUnderlyingObjects anyway). This should go in computePointerICmp in lib/Analysis/InstructionSimplify.cpp near where IsAllocDisjoint is defined. The optimization seems valid to be, in that we generally prohibit 'getting lucky' with address comparisons; if you couldn't know the address, then guessing doesn't count.
I do think we'll need to be a little bit careful here, though. I think
we can optimize this by acting as-if no one could know the address of
a stack variable, but then we need to be able to do so consistently.
For example, in this code
int f(int *p) {
int x;
bool a = (p == &x);
bool b = equals(p, &x);
}
a and b needs to end up with the same value.
>> > On Thu, Sep 24, 2015 at 2:55 PM, Philip Reames
>> > <listmail at philipreames.com> wrote:
>> >> I threw together a patch which implements this (attached.) If we
>> >> decide
>> >> that this is actually a legal transform, I'm happy to post this
>> >> for review.
>> >>
>> >> In addition to the version proposed here, I also implemented a
>> >> case where a
>> >> trivially escaped alloca's address is not equal to any other
>> >> value. I
>> >> believe both are valid, but we should confirm.
>> >>
>> >> Philip
>> >>
>> >>
>> >> On 09/24/2015 02:34 PM, Aaron Ballman via llvm-dev wrote:
>> >>> On Thu, Sep 24, 2015 at 5:15 PM, Hans Wennborg
>> >>> <hans at chromium.org> wrote:
>> >>>> On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman
>> >>>> <aaron at aaronballman.com>
>> >>>> wrote:
>> >>>>> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg
>> >>>>> <hans at chromium.org>
>> >>>>> wrote:
>> >>>>>> I was wondering why LLVM cannot optimize this code (which GCC
>> >>>>>> does
>> >>>>>> optimize):
>> >>>>>>
>> >>>>>> int f(int *p) { int x; return p == &x; }
>> >>>>>>
>> >>>>>> it would seem that this must always return 0. (This occurs as
>> >>>>>> a
>> >>>>>> self-assignment check in the code I was looking at; I was
>> >>>>>> hoping we
>> >>>>>> could fold that check away.)
>> >>>>> This is different than a self-assignment check, is it not?
>> >>>>>
>> >>>>> blah& operator=(const blah &b) {
>> >>>>> if (&b == this) {}
>> >>>>> // ...
>> >>>>> }
>> >>>>>
>> >>>>> (Because it gets the pointer from the parameter and compares
>> >>>>> against a
>> >>>>> "local" pointer?)
>> >>>>>
>> >>>>> I just want to make sure that you're not suggesting we should
>> >>>>> optimize
>> >>>>> away self-assignment checks in the general case.
>> >>>> Right, I'm not suggesting that :-)
>> >>>>
>> >>>> The code I looked at went something like this:
>> >>>>
>> >>>> struct S {
>> >>>> S& operator=(const S& other) {
>> >>>> if (&other != this)
>> >>>> val = other.val;
>> >>>> return *this;
>> >>>> }
>> >>>> void foo();
>> >>>> int val;
>> >>>> };
>> >>>> void S::foo() {
>> >>>> S tmp;
>> >>>> tmp.val = 42;
>> >>>> *this = tmp; // operator= gets inlined; we should know(?)
>> >>>> that &tmp
>> >>>> != this
>> >>>> }
>> >>>>
>> >>>> This is of course a silly example, but with GCC we get:
>> >>>>
>> >>>> movl $42, (%rdi)
>> >>>> ret
>> >>>>
>> >>>> whereas Clang generates:
>> >>>>
>> >>>> movl $42, -8(%rsp)
>> >>>> leaq -8(%rsp), %rax
>> >>>> cmpq %rdi, %rax
>> >>>> je .LBB0_2
>> >>>> movl $42, (%rdi)
>> >>>> .LBB0_2:
>> >>>> retq
>> >>>>
>> >>>> which made me sad.
>> >>> Ah, yes, this makes perfect sense to me. Thank you for the
>> >>> explanation!
>> >>>
>> >>> ~Aaron
>> >>>
>> >>>>>> I'd be interested to hear what those with a stronger
>> >>>>>> understanding of
>> >>>>>> the standard than myself think about this, and also if there
>> >>>>>> is any
>> >>>>>> example of something that could break because of this
>> >>>>>> optimization. If
>> >>>>>> not, I'd like us to optimize it :-)
>> >>>>>>
>> >>>>>>
>> >>>>>> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman
>> >>>>>> <dan433584 at gmail.com>
>> >>>>>> wrote:
>> >>>>>>> Author: djg
>> >>>>>>> Date: Thu Jan 31 18:49:06 2013
>> >>>>>>> New Revision: 174131
>> >>>>>>>
>> >>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev
>> >>>>>>> Log:
>> >>>>>>> Add a comment explaining an unavailable optimization.
>> >>>>>>>
>> >>>>>>> Modified:
>> >>>>>>> llvm/trunk/lib/Analysis/InstructionSimplify.cpp
>> >>>>>>>
>> >>>>>>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp
>> >>>>>>> URL:
>> >>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff
>> >>>>>>>
>> >>>>>>> ==============================================================================
>> >>>>>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp
>> >>>>>>> (original)
>> >>>>>>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan
>> >>>>>>> 31
>> >>>>>>> 18:49:06 2013
>> >>>>>>> @@ -1688,6 +1688,34 @@ static Value
>> >>>>>>> *ExtractEquivalentCondition
>> >>>>>>> return 0;
>> >>>>>>> }
>> >>>>>>>
>> >>>>>>> +// A significant optimization not implemented here is
>> >>>>>>> assuming that
>> >>>>>>> alloca
>> >>>>>>> +// addresses are not equal to incoming argument values. They
>> >>>>>>> don't
>> >>>>>>> *alias*,
>> >>>>>>> +// as we say, but that doesn't mean they aren't equal, so we
>> >>>>>>> take a
>> >>>>>>> +// conservative approach.
>> >>>>>>> +//
>> >>>>>>> +// This is inspired in part by C++11 5.10p1:
>> >>>>>>> +// "Two pointers of the same type compare equal if and
>> >>>>>>> only if they
>> >>>>>>> are both
>> >>>>>>> +// null, both point to the same function, or both
>> >>>>>>> represent the
>> >>>>>>> same
>> >>>>>>> +// address."
>> >>>>>>> +//
>> >>>>>>> +// This is pretty permissive.
>> >>>>>> Indeed :-/
>> >>>>>>
>> >>>>>>> +// It's also partly due to C11 6.5.9p6:
>> >>>>>>> +// "Two pointers compare equal if and only if both are
>> >>>>>>> null
>> >>>>>>> pointers, both are
>> >>>>>>> +// pointers to the same object (including a pointer to an
>> >>>>>>> object
>> >>>>>>> and a
>> >>>>>>> +// subobject at its beginning) or function, both are
>> >>>>>>> pointers to
>> >>>>>>> one past the
>> >>>>>>> +// last element of the same array object, or one is a
>> >>>>>>> pointer to
>> >>>>>>> one past the
>> >>>>>>> +// end of one array object and the other is a pointer to
>> >>>>>>> the start
>> >>>>>>> of a
>> >>>>>>> +// different array object that happens to immediately
>> >>>>>>> follow the
>> >>>>>>> ï¬ rst array
>> >>>>>>> +// object in the address space.)
>> >>>>>>> +//
>> >>>>>>> +// C11's version is more restrictive, however there's no
>> >>>>>>> reason why
>> >>>>>>> an argument
>> >>>>>>> +// couldn't be a one-past-the-end value for a stack object
>> >>>>>>> in the
>> >>>>>>> caller and be
>> >>>>>>> +// equal to the beginning of a stack object in the callee.
>> >>>>>> This is interesting.
>> >>>>>>
>> >>>>>> For the one-past-the-end pointer to point into the callee, the
>> >>>>>> stack
>> >>>>>> would have to be growing upwards. So this won't happen on X86.
>> >>>>>> Can we
>> >>>>>> turn this optimization on for downward-growing-stack targets?
>> >>>>>>
>> >>>>>> Second, if the stack grows upward, and the function argument
>> >>>>>> does
>> >>>>>> point into the callee stack frame, "p" and "&x" could have the
>> >>>>>> same
>> >>>>>> contents. So per the "represent the same address" part above,
>> >>>>>> they
>> >>>>>> should compare equal? But they're noalias? Are we allowed to
>> >>>>>> write
>> >>>>>> through p? It wasn't a pointer to a valid object when we made
>> >>>>>> the
>> >>>>>> call, but it became valid in the callee? This is all
>> >>>>>> terrifying.
>> >>>>>>
>> >>>>>> I suppose one could store the value of &x though, and then use
>> >>>>>> it
>> >>>>>> again later, i.e.:
>> >>>>>>
>> >>>>>> int *global;
>> >>>>>> int f(int *p) {
>> >>>>>> int x;
>> >>>>>> global = &x;
>> >>>>>> return p == &x;
>> >>>>>> }
>> >>>>>> int g() {
>> >>>>>> f(0);
>> >>>>>> return f(global);
>> >>>>>> }
>> >>>>>>
>> >>>>>> Is g() guaranteed to return 1 here? Maybe we could claim it's
>> >>>>>> implementation dependent? GCC does not seem fold p==&x to 0
>> >>>>>> here. I
>> >>>>>> suppose we could make sure to check whether &x escapes the
>> >>>>>> function?
>> >>>>>>
>> >>>>>> - Hans
>> >>> _______________________________________________
>> >>> LLVM Developers mailing list
>> >>> llvm-dev at lists.llvm.org
>> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
More information about the llvm-dev
mailing list