[cfe-dev] [PATCH] Fix for bug 21725: wrong results with union and strict-aliasing

Daniel Berlin dberlin at dberlin.org
Tue Mar 17 15:15:28 PDT 2015


On Tue, Mar 17, 2015 at 2:34 PM, Hal Finkel <hfinkel at anl.gov> wrote:

>
> ------------------------------
>
> *From: *"Daniel Berlin" <dberlin at dberlin.org>
> *To: *"Jeroen Dobbelaere" <jeroen.dobbelaere at gmail.com>
> *Cc: *reviews+D8056+public+7b63cedb34d51bb1 at reviews.llvm.org, "Hal
> Finkel" <hfinkel at anl.gov>, cfe-commits at cs.uiuc.edu, "cfe-dev at cs.uiuc.edu
> Developers" <cfe-dev at cs.uiuc.edu>
> *Sent: *Tuesday, March 17, 2015 4:24:01 PM
> *Subject: *Re: [PATCH] Fix for bug 21725: wrong results with union and
> strict-aliasing
>
>
>
> On Tue, Mar 17, 2015 at 2:03 PM, Jeroen Dobbelaere <
> jeroen.dobbelaere at gmail.com> wrote:
>
>> On Tue, Mar 17, 2015 at 7:16 PM, hfinkel at anl.gov <hfinkel at anl.gov> wrote:
>>
>>> In http://reviews.llvm.org/D8056#141899, @jeroen.dobbelaere wrote:
>>>
>>> > Further investigation shows that
>>> llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp is not ready to handle struct
>>> types where multiple members have the same (0) offset. The clang fix only
>>> makes sense once that issue is resolved.
>>>
>>>
>>> Can you work on this? If not (or even if you can), can you file a bug
>>> report with an example?
>>>
>>>
>>> http://reviews.llvm.org/D8056
>>>
>>> EMAIL PREFERENCES
>>>   http://reviews.llvm.org/settings/panel/emailpreferences/
>>>
>>>
>>>
>> Yes, I am looking into it. I have a version now where 'most things' work,
>> but it seems that the tbaa information that clang produces is not
>> expressive enough when the struct/union has array members. For example:
>>
>> union U03 {
>>   short mS[4];
>>   int mI;
>> };
>>
>> results in:
>>
>> !5 = !{!"omnipotent char", !6, i64 0}
>> !6 = !{!"Simple C/C++ TBAA"}
>> !7 = !{!8, !8, i64 0}
>> !8 = !{!"int", !5, i64 0}
>> !9 = !{!10, !10, i64 0}
>> !10 = !{!"short", !5, i64 0}
>> !40 = !{!41, !8, i64 0}
>> !41 = !{!"_ZTS3U03", !5, i64 0, !8, i64 0}
>>
>> Note that In !41, we loose the information about the 'short' member.
>>
>> I would start here :)
>
> You should make it produce info about all the members.
>
>
> I agree.
>
>
>
>> A function like:
>>
>> void test_u03b(union U03* a, union U03* b)
>> {
>>   a->mS[0]=1;
>>   b->mI=2;
>>
>>   // CHECK: Function: _Z9test_u03b{{.*}}
>>   //FIXME: CHECK: MayAlias: store i32 2, i32* %{{.*}}, align 4, !tbaa
>> !{{.*}} <->   store i16 1, i16* %{{.*}}, align 2, !tbaa !{{.*}}
>> }
>>
>> results in wrong information: an access to mS[0] results in a 'short'
>> (!10) access. The access to mI results in '!40'.
>> Note that for the short '!10' access, we loose the information about the
>> associated union.
>>
>
>
>> Because !41 refers to char '!5' and int '8', but not to short '!10', we
>> cannot detect that there might
>> be overlap between '!40' and '!10'.
>>
>
> RIght, but this is clearly a bug in the info above.
>
>
>> When we have two arrays of different type, the problem becomes even worse
>> as we now have no
>> connection to the union any more from the two accesses...
>>
>
> I went through this on the whiteboard with folks when they designed this,
> and pointed this out - it's pretty easy to get into situations where you
> have two random pieces of memory, and they were part of a union, and you
> don't know this.
>
> You can make them as easy or as hard as you like.
>
> The answer I got was "we're fine with people having to make this explicit".
>
>
> I don't understand what this means. How should they do that?
>

So, his point was the arrays have no connections to the union. This is not
the only case this occurs in.

Let's take the canonical example:


For example

union foo {
int a;
float b;
};

int ihateunions(int *a, float *b)
{
<do a thing with a and b>
}

int passtounion()
{
union foo bar;
ihateunions(&bar.a, &bar.b);

}

Inside ihateunions, you will have no idea that a and b are connected to
unions.

Let's say this example is easy, i mean, they are both at offset 0, so they
can alias, right?

Add two different structs in the union, and pass pointers, we'll still
generate wrong struct-path tbaa for this, because it has no idea that it is
part of a union.
:)
You can make this arbitrarily as complex or as simple as you want (put them
in different translation units, etc).


> The language has a defined set of semantics, and we need to follow them.
>

What do you think should happen in the above? :)
(Last time we went around this on the GCC mailing list, this spawned a very
long 30 message thread on the gcc mailing list, with the TL;DR being
 "nobody really knows, but we're fine with telling people if they want
something sane here, they should make the union access in ihateunions
visible and explicit".  DR reports were filed, no good answer was given :P)




> At the very least, if we don't want to fully represent the TBAA hierarchy
> for some types (I'm not sure why we'd not do this, but...) then we should
> omit the metadata all together.
>


This part isn't about not fully representing, but not knowing what part of
the hierarchy you are really in.

Remember that TBAA struct path is not really TBAA info.
It's access path info.


>  -Hal
>
>
>
>>
>> One issue with getting array accesses right that I see, is that the
>> 'offset' field can suddenly be variable.
>>
>>
> Not really. You can't say where it points, so the range would be
> "everything" anyway.
> You should just say it accesses the entire array (but not anything else).
>
>
>
>> Maybe a good enough (and correct) fallback that we should consider is to
>> make such _union_ accesses alias with everything... (omnipotent char) in
>> stead of mapping it to the basic type.
>>
>
> It is possible, with the sets we have, to get the aliasing right :)
> I know this because GCC does it.
>
> This design is nearly identical to GCC's TBAA and subvariable design :)
>
>
>> Any thoughts ?
>>
>> Jeroen Dobbelaere
>>
>>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150317/b91d2ec4/attachment.html>


More information about the cfe-dev mailing list