[LLVMdev] tbaa metadata representation

Daniel Berlin dberlin at dberlin.org
Tue Mar 5 16:15:12 PST 2013


On Mon, Mar 4, 2013 at 11:12 PM, Tran Ma <tranma at cse.unsw.edu.au> wrote:
> Hi all,
>
> A while ago there was a discussion on changing the current "set of trees"
> representation of TBAA metadata to be more expressive, prompted by the need
> to support C structs. Dan Gohman also talked about the issue here:
> http://llvm.org/devmtg/2012-11/Gohman-AliasAnalysis.pdf. It was suggested
> that the trees be replaced by a type DAG then. While working on this
> compiler, I ended up using an undirected graph to represent aliasing
> instead, I believe it might be suitable for TBAA's purposes as well, for the
> following reasons.
>
> * Does the graph need to be acyclic? Consider these struct types:
>
>     struct a { type1 x; type2 y }
>
>     struct b { type2 y; type3 z }
>
>     struct c { type3 z; type1 x }
>
> They form the following alias graph (sorry about the crappy ascii art):
>
>     a --> b --> c -+
>
>     ^______________|
>
> Which won't be representable if we force our tbaa graph to be acyclic.

I don't see why your example forms the above graph, could you explain?

I believe it should form a forest with 3 roots (a, b, c), and the
following edges
a->x
a->y
b->y
b->z
c->z
c->x

I'm not sure how you got to a->b->c->a

>
> * Does it need to be directed? If we use a directed graph, then I suppose
> we'd be relying on "reachability" in both directions to find out if
> something aliases something else:
>
>     +-->  char  <--+
>
>     |              |
>
>    float          int
>
>     ^              ^
>
>     |___ MyClass __|
>
> To answer "does myclass alias char?", we'd need to check (myclass is
> reachable from char || char is reachable from myclass),

Yes, the graphs essentially represent a set/subset relationship, and
you are checking if either is a subset of the other.

>  similar to how two
> things alias if they are ascendant/descendant of each other in the current
> tree representation.

> Since we're going both ways,

There are actually cases you don't need to go both ways.

>  I think it makes sense to
> just reify the reachability edges like this (note the lack of direction, now
> to check if two things alias, we simply ask if there is an edge connecting
> them):
>
>     +---  char  ---+
>
>     |       |      |
>
>   float     |     int
>
>     |___ MyClass __|

But this is wrong, TBAA is in fact, directed.

     struct S { int i; double d; };


access to struct S can alias int or double
access to double cannot alias struct S
(access to int can because of the first member rule, but otherwise can't).

In your undirected graph, you will say that a store to double can
alias struct S.

>
> This can be represented with a (pretty dense) adjacency matrix and queried
> in constant time, so I thought it might be faster.
>
> From what I can gather without delving deep into the codebase, it's easiest
> right now to change from the "set of trees" representation to a directed
> graph, but I don't think changing to an undirected graph would be much
> harder (please correct me if I'm wrong). I'd like to try and tackle the
> implementation as well, if that's ok.
>
> Thoughts?
>
> Thanks!
>
> --
> Tran.
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>



More information about the llvm-dev mailing list