[llvm-dev] Aliasing rules difference between GCC and Clang

Ivan Kosarev via llvm-dev llvm-dev at lists.llvm.org
Thu Jan 17 09:09:22 PST 2019


Hello, Jonas,

 > It seems that when the struct member is an array, the base type
 > becomes the element type. This is simply unhandled still in
 > CodeGenTBAA ("TODO"). My question then is if it would be
 > possible to instead create the DAG nodes !"A" and !"B" (as in
 > the first type DAG above) and use them as the base types and
 > then have the access type to be !"int", with the right offset
 > into the struct node?

Correct, the current default TBAA info is not capable of representing 
accesses to (ranges of) array elements, hence the decay to element-type 
access.

Still, for elements indexed with constants. such as a->i[0] in your 
example, it is indeed possible to generate proper TBAA tags. Such tags 
would be no different from tags generated for accesses to non-array 
fields that reside at the same offset within the same base type. If 
there are real life cases where adding the support for constant-indexed 
accesses would make a difference, then I think that would be an improvement.

 > struct S { int i[2]; long l[2]; };
 > ...
 > An access like s->i[0], would then get a node like
 > and this would give a NoAlias above, right? Did I understand
 > this correctly, or did I miss anything relevant?

Yes, TBAA assumes that accesses to int's and long's do not alias.

 > Would this improvement require the NewStructPathTBAA flag to be
 > on, and if so, what is the outlook on this?

It depends on whether handling of constant-indexed accesses would be 
enough for your use cases.

One of the goals behind the new TBAA is to be able to accurately 
represent various kinds accesses to subobjects. This includes 
non-constant-indexed accesses to array elements. More info on how such 
accesses are supposed to be represented can be found here:

https://reviews.llvm.org/D40975

If that's what you need, then the new TBAA info seems to be the most 
promising option. But please note that it's still a work in progress and 
is not mature enough for production use.

Hope this helps.

Regards,


On 14/01/2019 15:57, Jonas Paulsson wrote:
> CAUTION: This email originated from outside of the organization. Do 
> not click links or open attachments unless you recognize the sender 
> and know the content is safe.  If you suspect potential phishing or 
> spam email, report it to ReportSpam at accesssoftek.com
>
> Hi,
>
> It was a while now since I first asked about TBAA, and I am now getting
> back to this, trying to understand TBAA as it seems that the issue I
> described earlier originally in this thread is still one of many "TODO"s
> in CGExpr.cpp
> (http://lists.llvm.org/pipermail/llvm-dev/2018-September/126290.html).
>
> I would like to help on this to the best of my abilities, but I'm not
> quite sure if there is a current plan of incremental changes, or if it
> would be possible to handle the array member of a struct case
> specifically at this point?
>
> I am tempted to believe that the extension to handle my particular case
> is not too far-fetched, as Eli indicated earlier:
>
> This program:
>
> struct A { int i; };
> struct B { int i; };
>
> void h(struct A *a, struct B *b, int *prim_i) {
>   a->i = 0;
>   b->i = 0;
>   *prim_i = 0;
> }
>
> translates to
>
> %i = getelementptr inbounds %struct.A, %struct.A* %a, i32 0, i32 0
> store i32 0, i32* %i, align 4, !tbaa !2
> %i1 = getelementptr inbounds %struct.B, %struct.B* %b, i32 0, i32 0
> store i32 0, i32* %i1, align 4, !tbaa !7
> store i32 0, i32* %prim_i, align 4, !tbaa !9
>
> with this type DAG:
>
>             !2 = !{!3, !4, i64 0}
>          !3 = !{!"A", !4, i64 0}
>       !4 = !{!"int", !5, i64 0}
>    !5 = !{!"omnipotent char", !6, i64 0}
> !6 = !{!"Simple C/C++ TBAA"}
>             !7 = !{!8, !4, i64 0}
>          !8 = !{!"B", !4, i64 0}
>          !9 = !{!4, !4, i64 0}
>
> Just as one would hope for, TBAA produces a NoAlias result between the
> stores to %A and %B:
>
> NoAlias:    store i32 0, i32* %i1, align 4, !tbaa !7 <-> store i32 0,
> i32* %i, align 4, !tbaa !2
> MayAlias:   store i32 0, i32* %prim_i, align 4, !tbaa !9 <->   store i32
> 0, i32* %i, align 4, !tbaa !2
> MayAlias:   store i32 0, i32* %prim_i, align 4, !tbaa !9 <->   store i32
> 0, i32* %i1, align 4, !tbaa !7
>
> The above indicates that TBAA is capable of handling structs with scalar
> members, but if I change the program to look more like the benchmark, 
> like:
>
> struct A { int i[2]; };
> struct B { int i[2]; };
>
> void h(struct A *a, struct B *b, int *prim_i) {
>   a->i[0] = 0;
>   b->i[0] = 0;
>   *prim_i = 0;
> }
>
> , which translates to
>
> %i = getelementptr inbounds %struct.A, %struct.A* %a, i32 0, i32 0
> %arrayidx = getelementptr inbounds [2 x i32], [2 x i32]* %i, i64 0, i64 0
> store i32 0, i32* %arrayidx, align 4, !tbaa !2
> %i1 = getelementptr inbounds %struct.B, %struct.B* %b, i32 0, i32 0
> %arrayidx2 = getelementptr inbounds [2 x i32], [2 x i32]* %i1, i64 0, 
> i64 0
> store i32 0, i32* %arrayidx2, align 4, !tbaa !2
> store i32 0, i32* %prim_i, align 4, !tbaa !2
>
> , the type DAG is generated as
>
>          !2 = !{!3, !3, i64 0}
>       !3 = !{!"int", !4, i64 0}
>    !4 = !{!"omnipotent char", !5, i64 0}
> !5 = !{!"Simple C/C++ TBAA"}
>
> This makes all the stores alias, which is overly conservative, as
> %struct.A and %struct.B do not alias.
>
> It seems that when the struct member is an array, the base type becomes
> the element type. This is simply unhandled still in CodeGenTBAA
> ("TODO"). My question then is if it would be possible to instead create
> the DAG nodes !"A" and !"B" (as in the first type DAG above) and use
> them as the base types and then have the access type to be !"int", with
> the right offset into the struct node?
>
> A struct like
>
> struct S { int i[2]; long l[2]; };
>
> would then be represented with a base node, something like
>
>          !3 = !{!"S", !4, i64 0, !6, 8}
>       !4 = !{!"int", !5, i64 0}
>    !5 = !{!"omnipotent char", !6, i64 0}
>       !6 = !{!"long", !5, i64 0}
>
> , giving !"int" as base at offset 0 and !"long" as base at offset 8
> (given that i[2] is 8 bytes).
>
> An access like s->i[0], would then get a node like
>
> !2 = !{!3, %4, i64 0}
>
> , and this would give a NoAlias above, right? Did I understand this
> correctly, or did I miss anything relevant?
>
> What method(s) would need to be modified to handle this? Is it feasible
> to do this now, or are there other planned steps that somebody is
> working on?
>
> Would this improvement require the NewStructPathTBAA flag to be on, and
> if so, what is the outlook on this?
>
> @Ivan: For what I can see, you had a list of planned improvements. What
> are your thoughts on this?
>
> Thanks for any help,
>
> Jonas
>



More information about the llvm-dev mailing list