[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)

David Majnemer via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 25 08:35:54 PDT 2015


On Fri, Sep 25, 2015 at 8:22 AM, David Majnemer <david.majnemer at gmail.com>
wrote:

> FWIW, consider the following:
> $ cat t.c
> extern "C"
> int printf(const char *, ...);
>
> struct S {
>     int x[];
> };
>
> void f(S a) {
>     S b;
>     printf("%p %p\n", &a, &b);
> }
>
> int main() {
>     S s;
>     f(s);
> }
>
> $ clang -O2 t.cpp -o t && ./t
> 0x7fff5e75a8e8 0x7fff5e75a8e8
>
> Perhaps Dan was concerned about zero sized objects?
>

This case is a red herring, we CSE'd the stack allocations for 'a' and 's'
which left us with 'call i32 (i8*, ...)* @printf(i8* getelementptr inbounds
([7 x i8]* @.str, i64 0, i64 0), %struct.S* %b.i, %struct.S* %b.i)'


>
>
> On Thu, Sep 24, 2015 at 6:04 PM, Hans Wennborg via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> I tried your patch on a Clang build to see if it would fire. It
>> reduced the size of a bootstrap with 8500 bytes. Not huge, but it
>> seems like a nice improvement. And maybe it could be made more
>> powerful: not just checking if the address is a param or alloca, but
>> an address based on such values.
>>
>> On Thu, Sep 24, 2015 at 2:55 PM, Philip Reames
>> <listmail at philipreames.com> wrote:
>> > I threw together a patch which implements this (attached.)  If we decide
>> > that this is actually a legal transform, I'm happy to post this for
>> review.
>> >
>> > In addition to the version proposed here, I also implemented a case
>> where a
>> > trivially escaped alloca's address is not equal to any other value.  I
>> > believe both are valid, but we should confirm.
>> >
>> > Philip
>> >
>> >
>> > On 09/24/2015 02:34 PM, Aaron Ballman via llvm-dev wrote:
>> >>
>> >> On Thu, Sep 24, 2015 at 5:15 PM, Hans Wennborg <hans at chromium.org>
>> wrote:
>> >>>
>> >>> On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman <
>> aaron at aaronballman.com>
>> >>> wrote:
>> >>>>
>> >>>> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg <hans at chromium.org>
>> >>>> wrote:
>> >>>>>
>> >>>>> I was wondering why LLVM cannot optimize this code (which GCC does
>> >>>>> optimize):
>> >>>>>
>> >>>>>    int f(int *p) { int x; return p == &x; }
>> >>>>>
>> >>>>> it would seem that this must always return 0. (This occurs as a
>> >>>>> self-assignment check in the code I was looking at; I was hoping we
>> >>>>> could fold that check away.)
>> >>>>
>> >>>> This is different than a self-assignment check, is it not?
>> >>>>
>> >>>> blah& operator=(const blah &b) {
>> >>>>    if (&b == this) {}
>> >>>>    // ...
>> >>>> }
>> >>>>
>> >>>> (Because it gets the pointer from the parameter and compares against
>> a
>> >>>> "local" pointer?)
>> >>>>
>> >>>> I just want to make sure that you're not suggesting we should
>> optimize
>> >>>> away self-assignment checks in the general case.
>> >>>
>> >>> Right, I'm not suggesting that :-)
>> >>>
>> >>> The code I looked at went something like this:
>> >>>
>> >>>    struct S {
>> >>>      S& operator=(const S& other) {
>> >>>        if (&other != this)
>> >>>          val = other.val;
>> >>>        return *this;
>> >>>      }
>> >>>      void foo();
>> >>>      int val;
>> >>>    };
>> >>>    void S::foo() {
>> >>>      S tmp;
>> >>>      tmp.val = 42;
>> >>>      *this = tmp; // operator= gets inlined; we should know(?) that
>> &tmp
>> >>> != this
>> >>>    }
>> >>>
>> >>> This is of course a silly example, but with GCC we get:
>> >>>
>> >>>    movl $42, (%rdi)
>> >>>    ret
>> >>>
>> >>> whereas Clang generates:
>> >>>
>> >>>    movl $42, -8(%rsp)
>> >>>    leaq -8(%rsp), %rax
>> >>>    cmpq %rdi, %rax
>> >>>    je .LBB0_2
>> >>>    movl $42, (%rdi)
>> >>>    .LBB0_2:
>> >>>    retq
>> >>>
>> >>> which made me sad.
>> >>
>> >> Ah, yes, this makes perfect sense to me. Thank you for the explanation!
>> >>
>> >> ~Aaron
>> >>
>> >>>
>> >>>>> I'd be interested to hear what those with a stronger understanding
>> of
>> >>>>> the standard than myself think about this, and also if there is any
>> >>>>> example of something that could break because of this optimization.
>> If
>> >>>>> not, I'd like us to optimize it :-)
>> >>>>>
>> >>>>>
>> >>>>> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman <dan433584 at gmail.com>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Author: djg
>> >>>>>> Date: Thu Jan 31 18:49:06 2013
>> >>>>>> New Revision: 174131
>> >>>>>>
>> >>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev
>> >>>>>> Log:
>> >>>>>> Add a comment explaining an unavailable optimization.
>> >>>>>>
>> >>>>>> Modified:
>> >>>>>>      llvm/trunk/lib/Analysis/InstructionSimplify.cpp
>> >>>>>>
>> >>>>>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp
>> >>>>>> URL:
>> >>>>>>
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff
>> >>>>>>
>> >>>>>>
>> ==============================================================================
>> >>>>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original)
>> >>>>>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan 31
>> >>>>>> 18:49:06 2013
>> >>>>>> @@ -1688,6 +1688,34 @@ static Value *ExtractEquivalentCondition
>> >>>>>>     return 0;
>> >>>>>>   }
>> >>>>>>
>> >>>>>> +// A significant optimization not implemented here is assuming
>> that
>> >>>>>> alloca
>> >>>>>> +// addresses are not equal to incoming argument values. They don't
>> >>>>>> *alias*,
>> >>>>>> +// as we say, but that doesn't mean they aren't equal, so we take
>> a
>> >>>>>> +// conservative approach.
>> >>>>>> +//
>> >>>>>> +// This is inspired in part by C++11 5.10p1:
>> >>>>>> +//   "Two pointers of the same type compare equal if and only if
>> they
>> >>>>>> are both
>> >>>>>> +//    null, both point to the same function, or both represent the
>> >>>>>> same
>> >>>>>> +//    address."
>> >>>>>> +//
>> >>>>>> +// This is pretty permissive.
>> >>>>>
>> >>>>> Indeed :-/
>> >>>>>
>> >>>>>> +// It's also partly due to C11 6.5.9p6:
>> >>>>>> +//   "Two pointers compare equal if and only if both are null
>> >>>>>> pointers, both are
>> >>>>>> +//    pointers to the same object (including a pointer to an
>> object
>> >>>>>> and a
>> >>>>>> +//    subobject at its beginning) or function, both are pointers
>> to
>> >>>>>> one past the
>> >>>>>> +//    last element of the same array object, or one is a pointer
>> to
>> >>>>>> one past the
>> >>>>>> +//    end of one array object and the other is a pointer to the
>> start
>> >>>>>> of a
>> >>>>>> +//    different array object that happens to immediately follow
>> the
>> >>>>>> ï¬ rst array
>> >>>>>> +//    object in the address space.)
>> >>>>>> +//
>> >>>>>> +// C11's version is more restrictive, however there's no reason
>> why
>> >>>>>> an argument
>> >>>>>> +// couldn't be a one-past-the-end value for a stack object in the
>> >>>>>> caller and be
>> >>>>>> +// equal to the beginning of a stack object in the callee.
>> >>>>>
>> >>>>> This is interesting.
>> >>>>>
>> >>>>> For the one-past-the-end pointer to point into the callee, the stack
>> >>>>> would have to be growing upwards. So this won't happen on X86. Can
>> we
>> >>>>> turn this optimization on for downward-growing-stack targets?
>> >>>>>
>> >>>>> Second, if the stack grows upward, and the function argument does
>> >>>>> point into the callee stack frame, "p" and "&x" could have the same
>> >>>>> contents. So per the "represent the same address" part above, they
>> >>>>> should compare equal? But they're noalias? Are we allowed to write
>> >>>>> through p? It wasn't a pointer to a valid object when we made the
>> >>>>> call, but it became valid in the callee? This is all terrifying.
>> >>>>>
>> >>>>> I suppose one could store the value of &x though, and then use it
>> >>>>> again later, i.e.:
>> >>>>>
>> >>>>>    int *global;
>> >>>>>    int f(int *p) {
>> >>>>>      int x;
>> >>>>>      global = &x;
>> >>>>>      return p == &x;
>> >>>>>    }
>> >>>>>    int g() {
>> >>>>>      f(0);
>> >>>>>      return f(global);
>> >>>>>    }
>> >>>>>
>> >>>>> Is g() guaranteed to return 1 here? Maybe we could claim it's
>> >>>>> implementation dependent? GCC does not seem fold p==&x to 0 here. I
>> >>>>> suppose we could make sure to check whether &x escapes the function?
>> >>>>>
>> >>>>>   - Hans
>> >>
>> >> _______________________________________________
>> >> LLVM Developers mailing list
>> >> llvm-dev at lists.llvm.org
>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >
>> >
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150925/b4803650/attachment.html>


More information about the llvm-dev mailing list