[llvm-dev] Potential issue with noalias @malloc and @realloc

Daniel Berlin via llvm-dev llvm-dev at lists.llvm.org
Tue Apr 11 18:22:33 PDT 2017


Note: This is a generic problem with any situation where noalias exists but
the pointers are proven equal :)
TBAA, for example, has the same generic issue, we just drop the tbaa
metadata and declare it okay, even though it would have been UB at the
source level.

On Tue, Apr 11, 2017 at 3:32 PM, Sanjoy Das via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi all,
>
> I think I've spotted a semantic issue with marking @malloc and
> @realloc as noalias.  Say we have the program:
>
> int f() {
>   int* p0 = malloc(sizeof(int));
>   free(p0);
>



>   int* p1 = malloc(sizeof(int));
>   if (!p1) return 20;
>   int value = 0;
>   for (int i = 0; i < 1; i++) {
>     *p1 = 20;
>     value = *p1;
>     if (false)  // "false" is obscured in a way the compiler can't fathom
>       if (p0 == p1)
>         a();
>       else
>         b();
>   }
>   return result;
> }
>
> The program above is well defined, and will always return 20.
>

no it won't, you never initialized result :) :) :)


> However, if we first unswitch loops:
>
> int f() {
>   int* p0 = malloc(size of(int));
>   free(p0);
>   int* p1 = malloc(sizeof(int));
>   if (!p1) return 20;
>   int value = 0;
>   if (p0 == p1) {
>     for (int i = 0; i < 1; i++) {
>       *p1 = 20;
>       value = *p1;
>       if (false)
>         a();
>     }
>   } else {
>     // Other copy of the loop that calls b() in dead code
>   }
>   return result;
> }
>
> and then run GVN::propgateEquality that replaces one use of p1 with p0
> but not the other (for some reason, say the other use was obscured
> behind a function call):
>
> int f() {
>   int* p0 = malloc(size of(int));
>   free(p0);
>   int* p1 = malloc(sizeof(int));
>   if (!p1) return 20;
>   int value = 0;
>   if (p0 == p1) {
>     for (int i = 0; i < 1; i++) {
>       *p0 = 20;     // S0
>       value = *p1;  // L0
>       if (false)
>         a();
>     }
>   } else {
>     // Other copy of the loop that calls b() in dead code
>   }
>   return result;
> }
>
>
> Now we have a problem -- since p0 NoAlias p1 (distinct @malloc()
> calls), we can reorder S0 and L0 (or LICM L0):
>
>
> int f() {
>   int* p0 = malloc(size of(int));
>   free(p0);
>   int* p1 = malloc(sizeof(int));
>   if (!p1) return 20;
>   int value = 0;
>   if (p0 == p1) {
>     for (int i = 0; i < 1; i++) {
>       value = *p1;  // L0
>       *p0 = 20;     // S0
>       if (false)
>         a();
>     }
>   } else {
>     // Other copy of the loop that calls b() in dead code
>   }
>   return result;
> }
>
> and we'll now return garbage if p0 == p1 (likely) and p1 is not null
> (also likely).
>

The free changes p0.

Otherwise, yes, your fundamental problem is that you are playing games with
two different language level objects whose pointers are equal, but it says
they do not alias.
I don't remember who i tried to explain to on irc that this can happen, but
they didn't believe me :)

We have no way of specifying the lifetime has ended, either.
You can make the problem as hard or as simple as you want above, but you
can't solve it in general.
Even if you wanted to, it is in fact, trivial to come up with cases where
basicaa will say no alias for the load/store, but not say noalias at the
propagation point.

Just put enough copies in between that it hits the various depth or
complexity limits and gives up and says mayalias :)

It would require semantic changes to llvm ir to fix this to properly
express object lifetimes that is compatible with all the random babble
standards have written down :)
For now, the only  sane solution IMHO, is to say that no alias implies
pointer inequality, regardless of the standards.  Because the above can
occur in any situation noalias exists but they are allowed to be pointer
equal (as mentioned, it's trivial to make this happen in TBAA).

I say sane because you can't actually do anything else and get it right all
the time, other than disable noalias in pretty much every case.

FWIW outside of not propagating when tbaa sets differ, gcc does the same as
we do .
It definitely believes p0 and p1 do not alias in the example above, and
would have no trouble propagating them.
I can't quite get it to optimize it. but it does set the equality p0 == p1,
and staring at the pass, it just doesn't bother to simplify the expression,
it would, in fact, otherwise propagate.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170411/bd594dbc/attachment.html>


More information about the llvm-dev mailing list