[LLVMdev] PROPOSAL: struct-access-path aware TBAA (new version)

Xinliang David Li xinliangli at gmail.com
Thu Mar 28 09:39:19 PDT 2013


Nice proposal.

A possibly simplified way of doing TBAA is:
1) decompose an access into a 3-tuples: <BaseType, AccessType, BitOffset>
-- there is no need to remember the full access path.
2) If the access types are incompatible, return noAlias (scalar TBAA rule)
3) if there is no containment relationship between two BaseTypes, return
noAlias
4) Otherwise, align the the enclosed base type into the enclosing type (for
each enclosed instance), and adjust the offset. If there is no overlap,
return noAlias, otherwise return mayAlias.

For instance, "C::b1.a.y"  AND "D::c.b2.a.x

C::b1.a.y --> <C, int, 32>
D::c.b2.a.x --> <D, int, 96>

Base Type C is enclosed in D, so the first access is normalized to <D, int,
32> which does not overlap with <D, int, 96>.

For more complicated cases, such as p->a[i].x.b[j].y, TBAA can choose to
normalize the access via projection before creating the tuple -->
p->a[0].x.b[0].y.    Make sure MustAlias is not wrongly returned for this
case.

Cheers,

David



On Wed, Mar 27, 2013 at 1:11 PM, Manman Ren <mren at apple.com> wrote:

>
> Hello,
>
> After discussions with Daniel, Dan and others, here is an updated proposal
> for struct-access-path aware TBAA.
>
> Given an example
> struct A {
>   int x;
>   int y;
> };
> struct B {
>   A a;
>   int z;
> };
> struct C {
>   B b1;
>   B b2;
>   int *p;
> };
> struct D {
>   C c;
> };
>
> The purpose of struct-path-aware TBAA is to say
> "C::b1.a" will alias with "B::a.x", "C::b1.a" will alias with "B",
> "C::b1.a" does not alias with "D::c.b2.a.x".
> "A::x" does not alias with "A::y"
> We can potentially disambiguate aggregate accesses and scalar accesses.
>
> The current scalar TBAA will say
> "C::b1.a.x" will alias with "D::c.b2.a.x" since both are accesses to int.
> "A::x" will alias with "A::y"
>
> This proposal is about the format of metadata and how to implement alias
> queries.
> --------- Option A
> A> metadata for the type DAG
>    1> the existing scalar nodes
>       !0 = metadata !{metadata !"any pointer", metadata !1}
>       !1 = metadata !{metadata !"omnipotent char", metadata !2}
>       !2 = metadata !{metadata !"Simple C/C++ TBAA"}
>       !4 = metadata !{metadata !"int", metadata !1}
>    2> the struct nodes
>       A struct node has a unique name plus a list of field types
>       For example, struct node for "C" should look like
>       !5 = metadata !{"C", metadata !3, metadata !3, metadata !0}
>       where !3 is the struct node for "B"
>
>    The type DAG for the example:
>    int <------------- A <- B <-- C <-- D
>    int <-------------------- B
>    any pointer <--------------- C
>
> B> We call metadata attached to scalar accesses scalar tags and metadata
>    attached to aggregate accesses aggregate tags.
>    Each tag can be a sequence of field accesses and nodes in the type DAG.
>    For example, the tag for "C::b2.a.x" is
>    !7 := [ "tbaa.path", !C, "C::b2", !B, "B::a", !A, "A::x", !int ]
> ;C::b2.a.x
>    where !C, !B, !A are metadata nodes in the type DAG, "C::b2" "B::a"
> "A::x" are strings.
> C> alias(x, y) = alias_rule(x, y) or alias_rule(y, x), where alias_rule(x,
> y) is
>    if the first element of 'x' does not enclose the first element of 'y'
>      return NoAlias
>    if the last element of 'x' encloses the first element of 'y' via type
> DAG,
>      return Alias
>    Check the first element of 'y', if it is not in 'x', return NoAlias
>    Repeat until we reach the end of either 'x' or 'y'
>      Check the next element of 'y' with the next element of 'x'
>      if not the same, return false.
>
> -------------- Option B
> Option B is to include offset of each field in the type DAG and let tag
> be offset based.
> A> metadata for the type DAG
>    1> the existing scalar nodes
>    2> the struct nodes
>       A struct node has a unique name plus a list of pairs (offset, field
> type)
>       For example, struct node for "C" should look like
>       !5 = metadata !{"C", 0, metadata !3, 12, metadata !3, 24, metadata
> !0}
>       where !3 is the struct node for "B"
>    The type DAG for the example:
>    int <- 0 <- A
>    int <- 4 <- A <- 0 <- B <- 0  <- C <- 0 <- D
>    int <-------------- 8 <- B <- 12 <- C
>    any pointer <--------------- 24 <- C
>
> B> The tag will be (outermost struct type node, offset, size, the last
> type node)
>    For example, the tag for C::b2.a.x will be
>    !7 = [ "tbaa.offset", !C, 12, 4, !int] ;C::b2.a.x
>    The last field of the tag is to handle alias(D::c, A::x), the tags will
> be
>    [!D, 0, 32, !C] and [!A, 0, 4, !int], we only need to check whether !C
> encloses
>    !A.
>
> C> Consider access pair "C::b1.a.x" and "D::c.b2.a.x", the tags are [!C,
> 0, 4, !int]
>    and [!D, 12, 4, !int].
>    To check if they alias, we first check if !D encloses !C. If no, we
> return
>    NoAlias. Since !D does enclose !C in our example, we start tracing from
> !D
>    and follow the edge with the correct offset by
>    adjusting the offset to be relative to the next node. We will reach C
> in the
>    type DAG, now we have a common type node and can compare the (offset,
> size)
>    directly. [!D, 12, 4] will be adjusted to [!C, 12, 4], since it does not
>    overlap with [!C, 0, 4], we say they do not alias.
>
>    alias_rule(x, y) is
>    if the outermost type of 'x' does not enclose the outermost type of 'y',
>      return NoAlias
>    if the last type of 'x' encloses the outermost type of 'y'
>      return Alias
>    Start from the outermost type of 'x'
>    Repeat until we reach the outermost type of 'y'
>      follow the edge with the correct offset in the type DAG
>      adjust the offset to be relative to the next node
>    Compare the adjusted offset, size pair for 'x' against 'y'
>    if they overlap, return Alias
>    Otherwise, return NoAlias
>
> ------------------ Comparison
> The complexity of alias queries: similar
> The space for metadata: option A should require more space because the tag
> is a
>   sequence of nodes and strings.
> I will work on Option B.
>
> ------------------ Holes
> We use BasicAA to perform sanity check. If BasicAA returns MayAlias, TBAA
> will
> kick off with full speed.
>
> Watch out for cases where the incoming arguments have different types but
> they
> do alias. Scalar TBAA has the same issue but we don't see bugs coming up.
> If
> we see bugs coming up with struct-path aware TBAA, we can do less
> aggressive
> TBAA or apply similar tricks as GCC.
>
> Scalar TBAA will say NoAlias for the placement new example under
> test/Analysis/TypeBasedAliasAnalysis/placement-tbaa.ll.
> If we optimize placement-tbaa.ll with -O1 -gvn, BaiscAA will start saying
> MustAlias. This may cause problem.
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130328/dcbf6eb6/attachment.html>


More information about the llvm-dev mailing list