[PATCH] D31885: Remove TBAA information from LValues representing union members

Fri Apr 14 09:46:39 PDT 2017

dberlin added a comment.

In https://reviews.llvm.org/D31885#727167, @hfinkel wrote:

> I'm not sure this is the right way to do this; I suspect we're lumping together a bunch of different bugs:
>
> 1. vector types need to have tbaa which makes them alias with their element types [to be clear, as vector types are an implementation extension, this is our choice; I believe most users expect this to be true, but I'm certainly open to leaving this as-is (i.e. the vector types and scalar types as independent/non-aliasing)].
> 2. tbaa can't be used for write <-> write queries (only read <-> write queries) because the writes can change the effective type
> 3. our 'struct' path TBAA for unions is broken (and to fix this we need to invert the tree structure, etc. as discussed on the list)
>
>   (1) and (3) are independent bugs. (1) if desired, should just be fixed (the vector type should be a child of the scalar element type in the current representation). (2) needs to be fixed too, is not strictly related to unions, and needs to be fixed in the backend. Doing this does not seem hard (we just need to record a Write Boolean in MemoryLocation and then check them in TypeBasedAAResult (TBAA can't be used if both locations are writes). (3) is what this should address. What we should do here is only generate the scalar TBAA for the union accesses. As I recall, the only reason that the scalar TBAA for union accesses is a problem is because of (2), but that's easy to fix in the backend (and we need to do that anyway).
>
>   @dberlin , do you agree?

I pretty strongly agree that these are different bugs, and at the least, deserving of different patches as they have different tradeoffs/solutions.
Ii think what should probably happen here is that we fix #1 and #3 in separate patches, and discuss #2 a bit until we come to consensus about what should be done.

I'm not sure about the solution to #2, because i thought there were very specific points in time at which the effective type could change.
It seems like it should be possible to generate a correct tree that represents this rather than try to hack up the backend, since the frontend knows when this has happened.
(IE if the effective type has changed from double to int, it should have double and int parents to make sure it conflicts with both.  Though i guess maybe that's impossible in the current rep until we fix it).
I generally think it should be the responsibility of the frontend to generate metadata that is correct for all types of queries, and if we have to extend that, i'd extend it, rather than try to special case it in the backend, *especially* by changing a core structure like MemoryLocation.

In fact, i mostly believe tbaa doesn't belong in memory location, and that it should be looked up separately by the TBAA AA impl since that's the thing that should care about it.  As we've repeatedly said, memory locations and pointers don't really have tbaa info, instructions do.  We're really just trying to pass that along.
This would require changing it from including the tbaa tag to including the instruction the location came from, which may actually be useful for other reasons anyway (we'd be able to look up instruction-specific aa info, in the aa impls that have that info, instead of having to shove it all in memory location as we do now, and the impl tries to recreate apples from applesauce).  The cost of doing this is that if we include the instruction in operator==, we'll end up with more queries in things like memoryssa, uses memorylocation as a discriminator.    IE it expects everything with the same memorylocation to have the same aliasing results, so it cuts down on queries by using it as a key.  This will still work, it just won't do anything useful if the instruction is the discriminator.    This seems solvable by saying two memorylocations are equal if the pointers and sizes are equal, and  instructions are the same kind (read/write/kill) and have equal metadata (IE not just tbaa).  Which is really likely what we want anyway, rather than "pointer, size, tbaa info implies alias equality".  Maybe i'm wrong.
We'd definitely have to benchmark.
(and yes, i realize this is more involved than we may want to do right now, i'd actually be willing to sign up to do it if we come to consensus and the numbers look good).

Repository:
  rL LLVM

https://reviews.llvm.org/D31885