[llvm-dev] RFC: Resolving TBAA issues

Sat Aug 19 06:29:31 PDT 2017

Daniel, Hal,

I'm trying to figure out what we would need to do, within the current 
approach and accepting its limitations, to support unions, member 
arrays, aggregate accesses and type identifiers. What I get looks 
suspiciously simple. Can you please check it out and let me know if I'm 
missing something?

For unions:

TBAA, regardless of the specific approach, cannot guarantee correct 
answers for pairs of accesses that involve pointers to union members.

This means that in terms of TBAA two access paths that come through 
unions may overlap only if the most-enclosing unions in these paths are 
the same union.

Furthermore, two accesses with the same base union type overlap if their 
offset ranges overlap; the access types do not matter. Otherwise, they 
are considered not to overlap, and with the current approach we cannot 
do better.

This suggests that while scanning through type trees in 
TypeBasedAAResult::Aliases() we should just stop at the first union type 
we run into and check whether:
   a) that union is the base type of the other access and
   b) the offset ranges overlap.

No need to traverse through any union members.
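
To make the rule concrete, here is a minimal sketch of the check. It is 
not the actual TypeBasedAAResult code: the UnionAccess struct, its 
fields and both function names are hypothetical, and it assumes the 
access size is known so that an offset range can be formed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified access descriptor; these are not LLVM's types.
struct UnionAccess {
  int UnionId;          // identity of the most-enclosing union on the path
  std::uint64_t Offset; // byte offset of the access within that union
  std::uint64_t Size;   // access size in bytes (assumed to be known)
};

// True if the half-open ranges [Offset, Offset + Size) intersect.
bool rangesOverlap(const UnionAccess &P, const UnionAccess &Q) {
  return P.Offset < Q.Offset + Q.Size && Q.Offset < P.Offset + P.Size;
}

// a) same most-enclosing union, b) overlapping offset ranges.
// Access types and union members are deliberately never inspected.
bool unionAccessesMayOverlap(const UnionAccess &P, const UnionAccess &Q) {
  return P.UnionId == Q.UnionId && rangesOverlap(P, Q);
}

// Small self-check of the three interesting cases.
void checkUnionRule() {
  assert(unionAccessesMayOverlap({1, 0, 4}, {1, 2, 2}));  // same union, overlap
  assert(!unionAccessesMayOverlap({1, 0, 4}, {1, 4, 4})); // same union, disjoint
  assert(!unionAccessesMayOverlap({1, 0, 4}, {2, 0, 4})); // different unions
}
```

The point of the sketch is that nothing below the union is ever 
consulted; only the union's identity and the two offset ranges matter.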

Similarly, in MDNode::getMostGenericTBAA() we do not consider nodes that 
represent union-enclosed types. This means the most generic type may be 
an aggregate or a union, in which case we return a null node unless 
aggregate accesses are supported.

To distinguish unions from other types we can have a special group for 
them. Alternatively, we can use type identification nodes, see below.

For aggregate accesses:

The comment in TypeBasedAliasAnalysis.cpp suggests:

   // TODO: We need to check if AccessType of TagA encloses AccessType of
   // TagB to support aggregate AccessType. If yes, return true.

Consider these two accesses:

     struct A { int i; };
     struct B { struct A a; } *b;

     struct X { int i; } *x;

     b->a
     x->i

Following the rule in the comment, the accesses are allowed to overlap, 
which is wrong.

Instead, we should check if the access type of one access encloses the 
base type of the other. And we should only check this if neither base 
type encloses the other.

This relies on traversing through types, but again, we should never 
follow union-enclosed type nodes: for them to potentially overlap, we 
require an explicit base specification.
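
As a sanity check, here is a small sketch that compares the rule from 
the TODO comment with the rule proposed above on the b->a / x->i 
example. The TypeNode and Access structs and all function names are 
hypothetical stand-ins, not LLVM's metadata classes, and offsets are 
left out for brevity. The proposed check is meant to run only after the 
existing base-type walk has found that neither base type encloses the 
other; that walk is not modelled here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type-tree node: a name plus the types of its members.
struct TypeNode {
  const char *Name;
  std::vector<const TypeNode *> Members;
};

// Hypothetical access descriptor: base type and access type only.
struct Access {
  const TypeNode *BaseType;
  const TypeNode *AccessType;
};

// True if Inner is Outer itself or is nested anywhere inside Outer.
bool encloses(const TypeNode *Outer, const TypeNode *Inner) {
  if (Outer == Inner)
    return true;
  for (const TypeNode *M : Outer->Members)
    if (encloses(M, Inner))
      return true;
  return false;
}

// Rule from the TODO comment: access type encloses the other access type.
bool commentRule(const Access &P, const Access &Q) {
  return encloses(P.AccessType, Q.AccessType) ||
         encloses(Q.AccessType, P.AccessType);
}

// Proposed rule: access type of one encloses the base type of the other.
bool proposedRule(const Access &P, const Access &Q) {
  return encloses(P.AccessType, Q.BaseType) ||
         encloses(Q.AccessType, P.BaseType);
}

// Types from the example above.
const TypeNode IntTy{"int", {}};
const TypeNode StructA{"A", {&IntTy}};
const TypeNode StructB{"B", {&StructA}};
const TypeNode StructX{"X", {&IntTy}};

void checkAggregateExample() {
  Access BA{&StructB, &StructA}; // b->a
  Access XI{&StructX, &IntTy};   // x->i
  assert(commentRule(BA, XI));   // A encloses int: reported as overlapping
  assert(!proposedRule(BA, XI)); // A does not enclose X: no overlap
}
```

With the comment's rule the two accesses are reported as overlapping 
because A encloses int; with the proposed rule they are not, since 
neither access type encloses the other access's base type.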

For member arrays:

There are no accesses to arrays, only accesses to array elements. In 
terms of TBAA, an access to an array element is no different than an 
access to an object of the element type. And an access to a member array 
element is no different than an access to a member of the element type. 
All of this holds provided the offsets remain the same.

The overall array size does not matter because structure members are 
known not to overlap and we never differentiate between union members.

For type identifiers:

We can add a language-specific node to type metadata tuples that 
guarantees uniqueness of type nodes that are not interchangeable in 
terms of TBAA.

This is supposed to resolve issues like this one:
https://bugs.llvm.org/show_bug.cgi?id=8572

Thanks,

--