[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 25 15:40:36 PDT 2015


----- Original Message -----
> From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Hans Wennborg" <hans at chromium.org>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Dan Gohman" <dan.gohman at gmail.com>
> Sent: Friday, September 25, 2015 3:26:01 PM
> Subject: Re: [llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining
> an unavailable optimization)
> 
> 
> On 09/24/2015 06:04 PM, Hans Wennborg wrote:
> > I tried your patch on a Clang build to see if it would fire. It
> > reduced the size of a bootstrap with 8500 bytes. Not huge, but it
> > seems like a nice improvement. And maybe it could be made more
> > powerful: not just checking if the address is a param or alloca,
> > but
> > an address based on such values.
> Yeah, I realized after posting that this should be integrated with
> the
> existing noalias handling in the pointer icmp routine.  However,
> let's
> defer that until we decide if this is actually correct to implement.
>  :)

Agreed (in your patch, you'd need to call stripPointerCasts() and GetUnderlyingObjects anyway). This should go in computePointerICmp in lib/Analysis/InstructionSimplify.cpp near where IsAllocDisjoint is defined. The optimization seems valid to be, in that we generally prohibit 'getting lucky' with address comparisons; if you couldn't know the address, then guessing doesn't count.

 -Hal

> >
> > On Thu, Sep 24, 2015 at 2:55 PM, Philip Reames
> > <listmail at philipreames.com> wrote:
> >> I threw together a patch which implements this (attached.)  If we
> >> decide
> >> that this is actually a legal transform, I'm happy to post this
> >> for review.
> >>
> >> In addition to the version proposed here, I also implemented a
> >> case where a
> >> trivially escaped alloca's address is not equal to any other
> >> value.  I
> >> believe both are valid, but we should confirm.
> >>
> >> Philip
> >>
> >>
> >> On 09/24/2015 02:34 PM, Aaron Ballman via llvm-dev wrote:
> >>> On Thu, Sep 24, 2015 at 5:15 PM, Hans Wennborg
> >>> <hans at chromium.org> wrote:
> >>>> On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman
> >>>> <aaron at aaronballman.com>
> >>>> wrote:
> >>>>> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg
> >>>>> <hans at chromium.org>
> >>>>> wrote:
> >>>>>> I was wondering why LLVM cannot optimize this code (which GCC
> >>>>>> does
> >>>>>> optimize):
> >>>>>>
> >>>>>>     int f(int *p) { int x; return p == &x; }
> >>>>>>
> >>>>>> it would seem that this must always return 0. (This occurs as
> >>>>>> a
> >>>>>> self-assignment check in the code I was looking at; I was
> >>>>>> hoping we
> >>>>>> could fold that check away.)
> >>>>> This is different than a self-assignment check, is it not?
> >>>>>
> >>>>> blah& operator=(const blah &b) {
> >>>>>     if (&b == this) {}
> >>>>>     // ...
> >>>>> }
> >>>>>
> >>>>> (Because it gets the pointer from the parameter and compares
> >>>>> against a
> >>>>> "local" pointer?)
> >>>>>
> >>>>> I just want to make sure that you're not suggesting we should
> >>>>> optimize
> >>>>> away self-assignment checks in the general case.
> >>>> Right, I'm not suggesting that :-)
> >>>>
> >>>> The code I looked at went something like this:
> >>>>
> >>>>     struct S {
> >>>>       S& operator=(const S& other) {
> >>>>         if (&other != this)
> >>>>           val = other.val;
> >>>>         return *this;
> >>>>       }
> >>>>       void foo();
> >>>>       int val;
> >>>>     };
> >>>>     void S::foo() {
> >>>>       S tmp;
> >>>>       tmp.val = 42;
> >>>>       *this = tmp; // operator= gets inlined; we should know(?)
> >>>>       that &tmp
> >>>> != this
> >>>>     }
> >>>>
> >>>> This is of course a silly example, but with GCC we get:
> >>>>
> >>>>     movl $42, (%rdi)
> >>>>     ret
> >>>>
> >>>> whereas Clang generates:
> >>>>
> >>>>     movl $42, -8(%rsp)
> >>>>     leaq -8(%rsp), %rax
> >>>>     cmpq %rdi, %rax
> >>>>     je .LBB0_2
> >>>>     movl $42, (%rdi)
> >>>>     .LBB0_2:
> >>>>     retq
> >>>>
> >>>> which made me sad.
> >>> Ah, yes, this makes perfect sense to me. Thank you for the
> >>> explanation!
> >>>
> >>> ~Aaron
> >>>
> >>>>>> I'd be interested to hear what those with a stronger
> >>>>>> understanding of
> >>>>>> the standard than myself think about this, and also if there
> >>>>>> is any
> >>>>>> example of something that could break because of this
> >>>>>> optimization. If
> >>>>>> not, I'd like us to optimize it :-)
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman
> >>>>>> <dan433584 at gmail.com>
> >>>>>> wrote:
> >>>>>>> Author: djg
> >>>>>>> Date: Thu Jan 31 18:49:06 2013
> >>>>>>> New Revision: 174131
> >>>>>>>
> >>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev
> >>>>>>> Log:
> >>>>>>> Add a comment explaining an unavailable optimization.
> >>>>>>>
> >>>>>>> Modified:
> >>>>>>>       llvm/trunk/lib/Analysis/InstructionSimplify.cpp
> >>>>>>>
> >>>>>>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp
> >>>>>>> URL:
> >>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff
> >>>>>>>
> >>>>>>> ==============================================================================
> >>>>>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp
> >>>>>>> (original)
> >>>>>>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan
> >>>>>>> 31
> >>>>>>> 18:49:06 2013
> >>>>>>> @@ -1688,6 +1688,34 @@ static Value
> >>>>>>> *ExtractEquivalentCondition
> >>>>>>>      return 0;
> >>>>>>>    }
> >>>>>>>
> >>>>>>> +// A significant optimization not implemented here is
> >>>>>>> assuming that
> >>>>>>> alloca
> >>>>>>> +// addresses are not equal to incoming argument values. They
> >>>>>>> don't
> >>>>>>> *alias*,
> >>>>>>> +// as we say, but that doesn't mean they aren't equal, so we
> >>>>>>> take a
> >>>>>>> +// conservative approach.
> >>>>>>> +//
> >>>>>>> +// This is inspired in part by C++11 5.10p1:
> >>>>>>> +//   "Two pointers of the same type compare equal if and
> >>>>>>> only if they
> >>>>>>> are both
> >>>>>>> +//    null, both point to the same function, or both
> >>>>>>> represent the
> >>>>>>> same
> >>>>>>> +//    address."
> >>>>>>> +//
> >>>>>>> +// This is pretty permissive.
> >>>>>> Indeed :-/
> >>>>>>
> >>>>>>> +// It's also partly due to C11 6.5.9p6:
> >>>>>>> +//   "Two pointers compare equal if and only if both are
> >>>>>>> null
> >>>>>>> pointers, both are
> >>>>>>> +//    pointers to the same object (including a pointer to an
> >>>>>>> object
> >>>>>>> and a
> >>>>>>> +//    subobject at its beginning) or function, both are
> >>>>>>> pointers to
> >>>>>>> one past the
> >>>>>>> +//    last element of the same array object, or one is a
> >>>>>>> pointer to
> >>>>>>> one past the
> >>>>>>> +//    end of one array object and the other is a pointer to
> >>>>>>> the start
> >>>>>>> of a
> >>>>>>> +//    different array object that happens to immediately
> >>>>>>> follow the
> >>>>>>> ï¬ rst array
> >>>>>>> +//    object in the address space.)
> >>>>>>> +//
> >>>>>>> +// C11's version is more restrictive, however there's no
> >>>>>>> reason why
> >>>>>>> an argument
> >>>>>>> +// couldn't be a one-past-the-end value for a stack object
> >>>>>>> in the
> >>>>>>> caller and be
> >>>>>>> +// equal to the beginning of a stack object in the callee.
> >>>>>> This is interesting.
> >>>>>>
> >>>>>> For the one-past-the-end pointer to point into the callee, the
> >>>>>> stack
> >>>>>> would have to be growing upwards. So this won't happen on X86.
> >>>>>> Can we
> >>>>>> turn this optimization on for downward-growing-stack targets?
> >>>>>>
> >>>>>> Second, if the stack grows upward, and the function argument
> >>>>>> does
> >>>>>> point into the callee stack frame, "p" and "&x" could have the
> >>>>>> same
> >>>>>> contents. So per the "represent the same address" part above,
> >>>>>> they
> >>>>>> should compare equal? But they're noalias? Are we allowed to
> >>>>>> write
> >>>>>> through p? It wasn't a pointer to a valid object when we made
> >>>>>> the
> >>>>>> call, but it became valid in the callee? This is all
> >>>>>> terrifying.
> >>>>>>
> >>>>>> I suppose one could store the value of &x though, and then use
> >>>>>> it
> >>>>>> again later, i.e.:
> >>>>>>
> >>>>>>     int *global;
> >>>>>>     int f(int *p) {
> >>>>>>       int x;
> >>>>>>       global = &x;
> >>>>>>       return p == &x;
> >>>>>>     }
> >>>>>>     int g() {
> >>>>>>       f(0);
> >>>>>>       return f(global);
> >>>>>>     }
> >>>>>>
> >>>>>> Is g() guaranteed to return 1 here? Maybe we could claim it's
> >>>>>> implementation dependent? GCC does not seem fold p==&x to 0
> >>>>>> here. I
> >>>>>> suppose we could make sure to check whether &x escapes the
> >>>>>> function?
> >>>>>>
> >>>>>>    - Hans
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> llvm-dev at lists.llvm.org
> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


More information about the llvm-dev mailing list