[llvm-commits] [llvm] r117048 - /llvm/trunk/lib/Analysis/TypeBasedAliasAnalysis.cpp

Thu Oct 21 13:26:50 PDT 2010

On Oct 21, 2010, at 12:40 PM, Duncan Sands wrote:

> Hi Dan, thanks for documenting this.
> 
>> +// The current metadata format is very simple. MDNodes have up to three
>> +// fields, e.g.:
>> +//   !0 = metadata !{ !"name", !1, 0 }
>> +// The first field is an identity field. It can be any MDString which
>> +// uniquely identifies the type. The second field identifies the type's
>> +// parent node in the tree, or is null or omitted for a root node.
> 
> Are you sure a tree is adequate?  A DAG would seem more appropriate.
> For example, if a struct type S has a field of type F, you would
> probably want an arrow from F to S (or alternatively from S to F,
> depending on how you like your arrows).  But that immediately means
> that you may not have a tree, since any type can occur in multiple
> struct types and a struct type usually contains fields of different
> types, though you do get a DAG.

It's an open question. Here's my current perspective.

First, aggregate loads and stores are pretty rare. Next, large
aggregate loads and stores are usually lowered with @llvm.memcpy.
TBAA for @llvm.memcpy will need new constructs, and nothing in the
current format prevents adding them in the future.

Small aggregate loads and stores are often lowered into
non-aggregate parts, and in this case, each of the parts can
get more precise TBAA nodes.
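
As a rough illustration of that lowering, here is a sketch using the
struct X from the TODO comment quoted below. The function name
copyFieldwise is hypothetical, and the comments only describe which TBAA
node a front-end could plausibly attach to each scalar access; nothing
here is taken from an actual front-end.

```cpp
// Same layout as struct X in the quoted TODO comment.
struct X {
  double d;
  int i;
};

// Hypothetical field-wise lowering of the aggregate assignment `*x = *y`.
// Each statement is a scalar load and store, so each one could be tagged
// with its own, more precise TBAA node instead of a conservative one.
static void copyFieldwise(X *x, const X *y) {
  x->d = y->d; // scalar double access: could carry the "double" node
  x->i = y->i; // scalar int access: could carry the "int" node
}
```

With the copy split up this way, only the access to the d field needs to
be treated as possibly aliasing a store through a double*.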

So there doesn't appear to be a major need for DAGs, and trees are
simpler.
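
To show why a tree keeps things simple, here is a minimal sketch of the
kind of query it allows: two accesses may alias only if one type node is
the other or an ancestor of the other. TypeNode, isAncestorOrSelf and
mayAlias are hypothetical names for illustration; this is an assumption
about the general shape of the check, not the code in
TypeBasedAliasAnalysis.cpp.

```cpp
#include <string>

// Hypothetical stand-in for a TBAA metadata node: a name and a parent link.
struct TypeNode {
  std::string Name;
  const TypeNode *Parent; // null for a root node
};

// Returns true if Ancestor is N itself or is reached by walking N's
// parent links toward the root.
static bool isAncestorOrSelf(const TypeNode *Ancestor, const TypeNode *N) {
  for (; N; N = N->Parent)
    if (N == Ancestor)
      return true;
  return false;
}

// Two accesses may alias if either type node is an ancestor of (or the
// same as) the other. Siblings and nodes in different trees do not.
static bool mayAlias(const TypeNode *A, const TypeNode *B) {
  return isAncestorOrSelf(A, B) || isAncestorOrSelf(B, A);
}
```

Each query is just a walk up a parent chain, which is the simplicity a
DAG would give up.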

> 
>> +// If the third field is present, it's an integer which if equal to 1
>> +// indicates that the type is "constant".
> 
> What does "constant" mean?

I added some more text for this.

> 
>> +//
>> +// TODO: The current metadata encoding scheme doesn't support struct
>> +// fields. For example:
>> +//   struct X {
>> +//     double d;
>> +//     int i;
>> +//   };
>> +//   void foo(struct X *x, struct X *y, double *p) {
>> +//     *x = *y;
>> +//     *p = 0.0;
>> +//   }
>> +// Struct X has a double member, so the store to *x can alias the store to *p.
>> +// Currently it's not possible to precisely describe all the things struct X
>> +// aliases, so struct assignments must use conservative TBAA nodes. There's
>> +// no scheme for attaching metadata to @llvm.memcpy yet either.
> 
> If you want to say things like: suppose S is a struct type with a field of type
> int, and T is a different struct type also with a field of type int; then an
> access of an S-int cannot alias a T-int.  Well, I guess that can also be
> represented as a DAG (eg: the DAG consisting of paths in the DAG I described
> above).  To avoid combinatorial explosion, presumably you would only generate
> the nodes (i.e. paths) that actually occur in the program.
> 
> By the way, how does the TBAA metadata behave when merging modules?  After all,
> metadata !17 in one module might represent the same type as metadata !22 in the
> other module (or be an ancestor, etc) - is this supposed to be sorted out using
> the "name" field somehow?

Yes, the name fields are what give nodes their identity, along with the
identities of their parents. In particular, the name of the root node
contributes to the identity of all nodes in its tree. If the two modules have
trees with different root node names, their trees will remain disjoint, even if
they both have leaf nodes named "int", for example.

And actually, there's nothing requiring the names to be strings, so
a front-end could choose to use a tuple including any number of strings,
version numbers, or anything else.
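
Here is a small sketch of that identity rule, using plain strings for the
names to keep it short. TypeNode, identityOf and sameType are hypothetical
names, and comparing whole name paths is just one way to model "name plus
the identities of the parents"; it is not how the metadata uniquing is
actually implemented.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a TBAA metadata node: a name and a parent link.
struct TypeNode {
  std::string Name;
  const TypeNode *Parent; // null for a root node
};

// A node's identity modeled as its own name followed by the names of all
// of its ancestors, ending with the root's name.
static std::vector<std::string> identityOf(const TypeNode *N) {
  std::vector<std::string> Path;
  for (; N; N = N->Parent)
    Path.push_back(N->Name);
  return Path;
}

// Nodes from two modules denote the same type only if the whole path
// matches, so equal leaf names under differently named roots stay distinct.
static bool sameType(const TypeNode *A, const TypeNode *B) {
  return identityOf(A) == identityOf(B);
}
```

Under this model, an "int" node under one root and an "int" node under a
differently named root compare as different types, while two "int" nodes
under identically named roots compare as the same.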

Dan