[PATCH] D31885: Remove TBAA information from LValues representing union members

Daniel Berlin via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 14 12:47:40 PDT 2017


dberlin added a comment.

In https://reviews.llvm.org/D31885#727519, @dberlin wrote:

> In https://reviews.llvm.org/D31885#727499, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D31885#727371, @efriedma wrote:
> >
> > > In https://reviews.llvm.org/D31885#727167, @hfinkel wrote:
> > >
> > > > I'm not sure this is the right way to do this; I suspect we're lumping together a bunch of different bugs:
> > > >
> > > > 1. vector types need to have tbaa which makes them alias with their element types [to be clear, as vector types are an implementation extension, this is our choice; I believe most users expect this to be true, but I'm certainly open to leaving this as-is (i.e. the vector types and scalar types as independent/non-aliasing)].
> > > > 2. tbaa can't be used for write <-> write queries (only read <-> write queries) because the writes can change the effective type
> > > > 3. our 'struct' path TBAA for unions is broken (and to fix this we need to invert the tree structure, etc. as discussed on the list)
> > >
> > >
> > > See https://bugs.llvm.org/show_bug.cgi?id=28189 for a testcase for (2) for this which doesn't involve unions.
> >
> >
> > Yes, this is what I had in mind. However, we may just want to not handle this at all. The demonstration you provide:
> >
> >   #include <stdio.h>
> >   #include <string.h>
> >   #include <stdlib.h>
> >   int f(int* x, int *y) {
> >     *x = 10;
> >     int z = *y;
> >     *(float*)x = 1.0;
> >     return z;
> >   }
> >   int (*ff)(int*,int*) = f;
> >   int main() {
> >     void* x = malloc(4);
> >     printf("%d\n", ff(x, x));
> >   }
> >   
> >
> > shows that the problem is more than I implied. To support this, we not only need to ignore the TBAA between the two writes (*x and *(float*)x), but also between the float write and the preceding int read. I wonder how much of TBAA we could keep at all and still support this. Thoughts?
>
>
> The standard is just kind of broken here.
>  It assumes that you can assign effective types at object creation points, and track them for all time.
>  But you can't.  f() could be in a different translation unit, but it still needs to be allowed to assume that the int * and the float * can't possibly conflict.  Otherwise, TBAA is useless.  This is even now codified for unions after many years.
>  What is being demonstrated is just another way to reach the same problem that was fixed by requiring union accesses to be explicit, and so I'd say it should have the same resolution:
>  Such an effective type change must be more explicit than "I allocated typeless memory, and so I can do what I want with it".
>  Because we can't *ever* make that work.


In particular, I could have:

foo.c:

  #include <stdio.h>
  #include <string.h>
  #include <stdlib.h>
  int f(int* x, int *y) {
    *x = 10;
    int z = *y;
    *(float*)x = 1.0;
    return z;
  }

bar.c:

  #include <stdio.h>
  #include <stdlib.h>

  extern int f(int *, int *);

  int (*ff)(int*,int*) = f;
  int main() {
    void* x = malloc(4);
    printf("%d\n", ff(x, x));
  }

In foo.c, there is no information you could ever use to tell you the *(float*) is legal or illegal.  That is just a normal function :)

Thus, GCC et al. have taken the position that if you change the effective type, the change must be completely visible in all cases.
Anything else would have rules that change when you inline, do LTO, etc.
This is where the "must make union accesses visible" rule came from.  I'm not sure what we would do for malloc+memcpy'd memory, but I assume something similar: it must be completely visible that it came from typeless memory, in all cases, and then we could tag it properly.
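
For the union case, a rough sketch of what "visible" could look like is below. The names int_or_float, f_union and demo are made up for illustration and are not part of this patch. The point is only that every access goes through the union type, so the function itself shows that the int and the float may overlap, regardless of inlining or LTO.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical union type; not part of the patch under review. */
union int_or_float {
  int i;
  float f;
};

/* Same shape as f() in foo.c, but each access names a union member,
   so the switch from int to float is visible in this translation unit. */
int f_union(union int_or_float *x, union int_or_float *y) {
  x->i = 10;
  int z = y->i;   /* reads 10 when x and y point to the same object */
  x->f = 1.0f;    /* the stored member changes here, through the union type */
  return z;
}

/* Hypothetical driver mirroring main() in bar.c. */
int demo(void) {
  union int_or_float *p = malloc(sizeof *p);
  assert(p != NULL);
  int z = f_union(p, p);
  free(p);
  return z;
}
```

Calling demo() passes the same pointer twice, as bar.c does, and should return 10, since the int read happens before the float store.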


Repository:
  rL LLVM

https://reviews.llvm.org/D31885




