[llvm-dev] RFC: Representing unions in TBAA

Hubert Tong via llvm-dev llvm-dev at lists.llvm.org
Tue Feb 14 05:51:14 PST 2017


On Mon, Feb 13, 2017 at 10:39 PM, Daniel Berlin <dberlin at dberlin.org> wrote:

>
>
> On Mon, Feb 13, 2017 at 10:07 AM, Hubert Tong <
> hubert.reinterpretcast at gmail.com> wrote:
>
>> On Mon, Feb 13, 2017 at 2:23 AM, Daniel Berlin via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>>> I don't think this fully solves the problem -- you'll also need to fix
>>>> getMostGenericTBAA.  That is, even if you implement the above scheme,
>>>> say you started out with:
>>>>
>>>> union U {
>>>>   int i;
>>>>   float f;
>>>> };
>>>>
>>>> float f(union U *u, int *ii, float *ff, bool c) {
>>>>   if (c) {
>>>>     *ii = 10;
>>>>     *ff = 10.0;
>>>>   } else {
>>>>     u->i = 10;    // S0
>>>>     u->f = 10.0;  // S1
>>>>   }
>>>>   return u->f;
>>>> }
>>>>
>>>> (I presume you're trying to avoid reordering S0 and S1?)
>>>>
>>>> SimplifyCFG or some other such pass may transform f to:
>>>>
>>>> float f(union U *u, int *ii, float *ff, bool c) {
>>>>   int *iptr = c ? ii : &(u->i);
>>>>   float *fptr = c ? ff : &(u->f);
>>>>   *iptr = 10;     // S2
>>>>   *fptr = 10.0;   // S3
>>>>   return u->f;
>>>> }
>>>>
>>>> then getMostGenericTBAA will infer scalar "int" TBAA for S2 and scalar
>>>> "float" TBAA for S3, which will be NoAlias and allow the reordering
>>>> you were trying to avoid.
>>>>
>>>
>>> FWIW, I have to read this in detail, but a few things pop out at me.
>>>
>>> 1. We would like to live in a world where we don't depend on TBAA
>>> overriding BasicAA to get correct answers.  We do now, but don't want to.
>>> Hopefully this proposal does not make that impossible.
>>>
>>> 2.  Literally the only way that GCC ends up getting this right is
>>> twofold:
>>> It only guarantees things about direct access through union.
>>> If you take the address of the union member (like the transform above),
>>> it knows it will get a wrong answer.
>>> So what it does is it finds the type it has to stop at (here, the union)
>>> to keep the TBAA set the same, and makes the transform end there.
>>> So the above would not occur.
>>>
>>>
>>> 3. A suggestion that TBAA follow all possible paths seems .. very slow.
>>>
>>> 4. "The main motivation for this is functional correctness of code
>>> using unions".  I believe you mean "with tbaa and strict-aliasing on".
>>> If not, functional correctness for unions should not be in any way
>>> related to requiring TBAA.
>>>
>>> 5. Unions are among the worst areas of the standard in terms of "nobody
>>> has really thought super-hard about the interaction of aliasing and unions
>>> in a way that is coherent".
>>> So when you say things like 'necessary for functional correctness of
>>> unions', just note that this is pretty much wrong.  You probably mean
>>> "necessary for a reasonable interpretation" or something.
>>>
>>> Because we would be *functionally correct* by the standard by destroying
>>> the program if you ever read the member you didn't set :)
>>>
>> C11 subclause 6.5.2.3 paragraph 3, has in footnote 95:
>> If the member used to read the contents of a union object is not the same
>> as the member last used to store a value in the object, the appropriate
>> part of the object representation of the value is reinterpreted as an
>> object representation in the new type as described in 6.2.6 (a process
>> sometimes called "type punning"). This might be a trap representation.
>>
>> So, the intent is at least that the use of the . operator or the ->
>> operator to access a member of a union would "safely" perform type punning.
>>
>>
> Certainly, if you can quote this, you know this is new to C11 (and newer
> versions of C++).
> :)
>
Yes, this is new to C11; however, the question for us here is whether, and
under what options, we would support such intended use of the language.
As for C++, the safe type punning aspect does not appear to hold, and I am
not aware of committee consensus to introduce it.
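
To make the footnote concrete, here is a minimal sketch of the kind of
access it describes. The union name and both helper functions are
hypothetical, made up for illustration only, and the sketch assumes that
float is a 32-bit IEEE 754 type with the same size as uint32_t. The first
helper stores through one member and reads through the other using the .
operator directly on the union object; the second uses memcpy, which I
believe is uncontroversial in both C and C++, as a point of comparison.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical union for illustration; assumes float is 32 bits wide. */
union Pun {
  uint32_t bits;
  float f;
};

/* Store through one member, then read through the other, applying the
   . operator directly to the union object. This is the case that the
   C11 footnote describes as reinterpreting the object representation. */
static uint32_t float_bits_via_union(float value) {
  union Pun p;
  p.f = value;
  return p.bits;
}

/* Reference version: copy the object representation with memcpy. */
static uint32_t float_bits_via_memcpy(float value) {
  uint32_t bits;
  memcpy(&bits, &value, sizeof bits);
  return bits;
}
```

Under the reading above, both helpers should return the same bits for any
input when compiled as C. Note that this covers only direct member access;
it says nothing about the case raised earlier in the thread, where the
address of a member is taken and the store happens through a plain int * or
float *.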


>
> It was explicitly *not* true in earlier versions.
>
> They've also slowly cleaned up the aliasing rules, but, honestly, still a
> mess.
>
Yes, I know of at least one instance where colourful metaphors ended up in
the C committee meeting minutes in relation to aliasing-related discussion.