[llvm-dev] Potential issue with noalias @malloc and @realloc

Sanjoy Das via llvm-dev llvm-dev at lists.llvm.org
Tue Apr 11 18:49:56 PDT 2017


Hi Daniel,

On April 11, 2017 at 6:22:34 PM, Daniel Berlin (dberlin at dberlin.org) wrote:
> Note: This is a generic problem with any situation where noalias exists but
> the pointers are proven equal :)

Yes.

> TBAA, for example, has the same generic issue, we just drop the tbaa
> metadata and declare it okay, even though it would have been UB at the
> source level.

Yes.  I noticed this when talking to Kostya on a thread about TBAA.
However, I wanted to reduce the "dependencies" of this behavior so I
did not mention TBAA here.

> On Tue, Apr 11, 2017 at 3:32 PM, Sanjoy Das via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> > Hi all,
> >
> > I think I've spotted a semantic issue with marking @malloc and
> > @realloc as noalias. Say we have the program:
> >
> > int f() {
> > int* p0 = malloc(sizeof(int));
> > free(p0);
> >
>
>
>
> > int* p1 = malloc(sizeof(int));
> > if (!p1) return 20;
> > int value = 0;
> > for (int i = 0; i < 1; i++) {
> > *p1 = 20;
> > value = *p1;
> > if (false) // "false" is obscured in a way the compiler can't fathom
> > if (p0 == p1)
> > a();
> > else
> > b();
> > }
> > return result;
> > }
> >
> > The program above is well defined, and will always return 20.
> >
>
> no it won't, you never initialized result :) :) :)

Yes!

> > However, if we first unswitch loops:
> >
> > int f() {
> > int* p0 = malloc(size of(int));
> > free(p0);
> > int* p1 = malloc(sizeof(int));
> > if (!p1) return 20;
> > int value = 0;
> > if (p0 == p1) {
> > for (int i = 0; i < 1; i++) {
> > *p1 = 20;
> > value = *p1;
> > if (false)
> > a();
> > }
> > } else {
> > // Other copy of the loop that calls b() in dead code
> > }
> > return result;
> > }
> >
> > and then run GVN::propgateEquality that replaces one use of p1 with p0
> > but not the other (for some reason, say the other use was obscured
> > behind a function call):
> >
> > int f() {
> > int* p0 = malloc(size of(int));
> > free(p0);
> > int* p1 = malloc(sizeof(int));
> > if (!p1) return 20;
> > int value = 0;
> > if (p0 == p1) {
> > for (int i = 0; i < 1; i++) {
> > *p0 = 20; // S0
> > value = *p1; // L0
> > if (false)
> > a();
> > }
> > } else {
> > // Other copy of the loop that calls b() in dead code
> > }
> > return result;
> > }
> >
> >
> > Now we have a problem -- since p0 NoAlias p1 (distinct @malloc()
> > calls), we can reorder S0 and L0 (or LICM L0):
> >
> >
> > int f() {
> > int* p0 = malloc(size of(int));
> > free(p0);
> > int* p1 = malloc(sizeof(int));
> > if (!p1) return 20;
> > int value = 0;
> > if (p0 == p1) {
> > for (int i = 0; i < 1; i++) {
> > value = *p1; // L0
> > *p0 = 20; // S0
> > if (false)
> > a();
> > }
> > } else {
> > // Other copy of the loop that calls b() in dead code
> > }
> > return result;
> > }
> >
> > and we'll now return garbage if p0 == p1 (likely) and p1 is not null
> > (also likely).
> >
>
> The free changes p0.
>
> Otherwise, yes, your fundamental problem is that you are playing games with
> two different language level objects whose pointers are equal, but it says
> they do not alias.
> I don't remember who i tried to explain to on irc that this can happen, but
> they didn't believe me :)
>
> We have no way of specifying the lifetime has ended, either.
> You can make the problem as hard or as simple as you want above, but you
> can't solve it in general.
> Even if you wanted to, it is in fact, trivial to come up with cases where
> basicaa will say no alias for the load/store, but not say noalias at the
> propagation point.
>
> Just put enough copies in between that it hits the various depth or
> complexity limits and gives up and says mayalias :)

As I was telling David Majnemer on IRC, I'm not annoyed by the fact
that there is a bug, but by the fact that there is no good way of
fixing the bug that I can think of.

> It would require semantic changes to llvm ir to fix this to properly
> express object lifetimes that is compatible with all the random babble
> standards have written down :)
> For now, the only sane solution IMHO, is to say that no alias implies
> pointer inequality, regardless of the standards. Because the above can
> occur in any situation noalias exists but they are allowed to be pointer
> equal (as mentioned, it's trivial to make this happen in TBAA).

Just to be clear, you're suggesting that we no longer mark malloc's
return value as noalias?

> I say sane because you can't actually do anything else and get it right all
> the time, other than disable noalias in pretty much every case.
>
> FWIW outside of not propagating when tbaa sets differ, gcc does the same as
> we do .
> It definitely believes p0 and p1 do not alias in the example above, and
> would have no trouble propagating them.
> I can't quite get it to optimize it. but it does set the equality p0 == p1,
> and staring at the pass, it just doesn't bother to simplify the expression,
> it would, in fact, otherwise propagate.

I'll give writing an actual C++ program that demonstrates the problem
this weekend, but unfortunately these things tend to be time
consuming.

-- Sanjoy


More information about the llvm-dev mailing list