[LLVMdev] tbaa metadata representation
Daniel Berlin
dberlin at dberlin.org
Tue Mar 5 16:15:12 PST 2013
On Mon, Mar 4, 2013 at 11:12 PM, Tran Ma <tranma at cse.unsw.edu.au> wrote:
> Hi all,
>
> A while ago there was a discussion on changing the current "set of trees"
> representation of TBAA metadata to be more expressive, prompted by the need
> to support C structs. Dan Gohman also talked about the issue here:
> http://llvm.org/devmtg/2012-11/Gohman-AliasAnalysis.pdf. It was suggested
> that the trees be replaced by a type DAG then. While working on this
> compiler, I ended up using an undirected graph to represent aliasing
> instead, I believe it might be suitable for TBAA's purposes as well, for the
> following reasons.
>
> * Does the graph need to be acyclic? Consider these struct types:
>
> struct a { type1 x; type2 y }
>
> struct b { type2 y; type3 z }
>
> struct c { type3 z; type1 x }
>
> They form the following alias graph (sorry about the crappy ascii art):
>
> a --> b --> c -+
>
> ^______________|
>
> Which won't be representable if we force our tbaa graph to be acyclic.
I don't see why your example forms the above graph, could you explain?
I believe it should form a forest with 3 roots (a, b, c), and the
following edges
a->x
a->y
b->y
b->z
c->z
c->x
I'm not sure how you got to a->b->c->a
>
> * Does it need to be directed? If we use a directed graph, then I suppose
> we'd be relying on "reachability" in both directions to find out if
> something aliases something else:
>
> +--> char <--+
>
> | |
>
> float int
>
> ^ ^
>
> |___ MyClass __|
>
> To answer "does myclass alias char?", we'd need to check (myclass is
> reachable from char || char is reachable from myclass),
Yes, the graphs essentially represent a set/subset relationship, and
you are checking if either is a subset of the other.
> similar to how two
> things alias if they are ascendant/descendant of each other in the current
> tree representation.
> Since we're going both ways,
There are actually cases you don't need to go both ways.
> I think it makes sense to
> just reify the reachability edges like this (note the lack of direction, now
> to check if two things alias, we simply ask if there is an edge connecting
> them):
>
> +--- char ---+
>
> | | |
>
> float | int
>
> |___ MyClass __|
But this is wrong, TBAA is in fact, directed.
struct S { int i; double d; };
access to struct S can alias int or double
access to double cannot alias struct S
(access to int can because of the first member rule, but otherwise can't).
In your undirected graph, you will say that a store to double can
alias struct S.
>
> This can be represented with a (pretty dense) adjacency matrix and queried
> in constant time, so I thought it might be faster.
>
> From what I can gather without delving deep into the codebase, it's easiest
> right now to change from the "set of trees" representation to a directed
> graph, but I don't think changing to an undirected graph would be much
> harder (please correct me if I'm wrong). I'd like to try and tackle the
> implementation as well, if that's ok.
>
> Thoughts?
>
> Thanks!
>
> --
> Tran.
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
More information about the llvm-dev
mailing list