[cfe-dev] [PATCH] Fix for bug 21725: wrong results with union and strict-aliasing

Daniel Berlin dberlin at dberlin.org
Tue Mar 17 14:24:01 PDT 2015


On Tue, Mar 17, 2015 at 2:03 PM, Jeroen Dobbelaere <
jeroen.dobbelaere at gmail.com> wrote:

> On Tue, Mar 17, 2015 at 7:16 PM, hfinkel at anl.gov <hfinkel at anl.gov> wrote:
>
>> In http://reviews.llvm.org/D8056#141899, @jeroen.dobbelaere wrote:
>>
>> > Further investigation shows that
>> llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp is not ready to handle struct
>> types where multiple members have the same (0) offset. The clang fix only
>> makes sense once that issue is resolved.
>>
>>
>> Can you work on this? If not (or even if you can), can you file a bug
>> report with an example?
>>
>>
>> http://reviews.llvm.org/D8056
>>
>> EMAIL PREFERENCES
>>   http://reviews.llvm.org/settings/panel/emailpreferences/
>>
>>
>>
> Yes, I am looking into it. I have a version now where 'most things' work,
> but it seems that the tbaa information that clang produces is not
> expressive enough when the struct/union has array members. For example:
>
> union U03 {
>   short mS[4];
>   int mI;
> };
>
> results in:
>
> !5 = !{!"omnipotent char", !6, i64 0}
> !6 = !{!"Simple C/C++ TBAA"}
> !7 = !{!8, !8, i64 0}
> !8 = !{!"int", !5, i64 0}
> !9 = !{!10, !10, i64 0}
> !10 = !{!"short", !5, i64 0}
> !40 = !{!41, !8, i64 0}
> !41 = !{!"_ZTS3U03", !5, i64 0, !8, i64 0}
>
> Note that In !41, we loose the information about the 'short' member.
>
> I would start here :)

You should make it produce info about all the members.


> A function like:
>
> void test_u03b(union U03* a, union U03* b)
> {
>   a->mS[0]=1;
>   b->mI=2;
>
>   // CHECK: Function: _Z9test_u03b{{.*}}
>   //FIXME: CHECK: MayAlias: store i32 2, i32* %{{.*}}, align 4, !tbaa
> !{{.*}} <->   store i16 1, i16* %{{.*}}, align 2, !tbaa !{{.*}}
> }
>
> results in wrong information: an access to mS[0] results in a 'short'
> (!10) access. The access to mI results in '!40'.
> Note that for the short '!10' access, we loose the information about the
> associated union.
>


> Because !41 refers to char '!5' and int '8', but not to short '!10', we
> cannot detect that there might
> be overlap between '!40' and '!10'.
>

RIght, but this is clearly a bug in the info above.


> When we have two arrays of different type, the problem becomes even worse
> as we now have no
> connection to the union any more from the two accesses...
>

I went through this on the whiteboard with folks when they designed this,
and pointed this out - it's pretty easy to get into situations where you
have two random pieces of memory, and they were part of a union, and you
don't know this.

You can make them as easy or as hard as you like.

The answer I got was "we're fine with people having to make this explicit".


>
> One issue with getting array accesses right that I see, is that the
> 'offset' field can suddenly be variable.
>
>
Not really. You can't say where it points, so the range would be
"everything" anyway.
You should just say it accesses the entire array (but not anything else).



> Maybe a good enough (and correct) fallback that we should consider is to
> make such _union_ accesses alias with everything... (omnipotent char) in
> stead of mapping it to the basic type.
>

It is possible, with the sets we have, to get the aliasing right :)
I know this because GCC does it.

This design is nearly identical to GCC's TBAA and subvariable design :)


> Any thoughts ?
>
> Jeroen Dobbelaere
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150317/ce6d2e69/attachment.html>


More information about the cfe-dev mailing list