[PATCH] [PR16236] Be conservative when doing TypeBasedAlias Analysis of members of Struct

Daniel Berlin dberlin at dberlin.org
Tue Apr 15 22:26:02 PDT 2014


On Tue, Apr 8, 2014 at 9:14 PM, Karthik Bhat <blitz.opensource at gmail.com> wrote:

> Thanks, Dan, for the review. I have a few questions, though, if you could
> clarify them.
>
>
> On Wed, Apr 9, 2014 at 12:17 AM, Dan Gohman <dan433584 at gmail.com> wrote:
>
>>
>>
>>
>> On Tue, Apr 8, 2014 at 5:04 AM, Karthik Bhat <kv.bhat at samsung.com> wrote:
>>
>>> Hi majnemer, sunfish, rengolin,
>>>
>>> Hi All,
>>> This patch fixes PR16236, a problem with TypeBasedAlias Analysis. Below
>>> is my understanding; please help me review this patch and tell me if I'm
>>> thinking in the right direction.
>>> The problem in this case seems to be that when we have a union like:
>>>   union
>>>   {
>>>     short f0;
>>>     int f1;
>>>   } u, *up = &u;
>>>
>>> f0 and f1 may alias each other. But while constructing the AliasSet for
>>> these in TypeBasedAlias Analysis, during PathAliases we see that the
>>> MDNodes corresponding to f0 and f1 are not ancestors of each other and
>>> have a common root, and thus conclude that they will not alias each other.
>>> But in the case of a union this may not be correct: f1 and f0 may alias
>>> each other, as they operate on the same underlying memory location.
>>>
>>
>> LLVM's approach for solving this problem is to do a BasicAA query first,
>> and if BasicAA says MustAlias or PartialAlias, it doesn't do a TBAA query
>> at all. This represents a kind of "best effort": the optimizer will make
>> its best effort to discover whether the type information is unreliable, but
>> if it doesn't discover a problem, then it assumes that the type information
>> is reliable. As such, it's not safe in all cases, but the C standard's own
>> definition of TBAA is not safe in all cases, and this is an attempt to
>> implement the spirit of what the C standard desired while finding a
>> practical balance with correctness.
>>
>> Are you seeing a case where this approach is not sufficient?
>>
>
> Yes Dan, in PR16236 (http://llvm.org/bugs/show_bug.cgi?id=16236) we have an
> example where the behavior with and without optimization is different.
> Ideally, optimization should not change program behavior. Given the below
> program from the bug report:
>
> int printf(const char *, ...);
>
> union
> {
>   short f0;
>   int f1;
> } u, *up = &u;
>
> int a;
>
> int main ()
> {
>   for (; a <= 0; a++)
>     if ((u.f0 = 1))
>       up->f1 = 0;
>
>   printf ("%d\n", u.f1);
>   return 0;
> }
>
> TypeBasedAlias analysis incorrectly returns NoAlias for u.f0 and up->f1
>


Actually, at least by the C99 rules, this is not incorrect.

Annex J.1 says "The following are unspecified:
...
— The value of a union member other than the last one stored into (6.2.6.1).
"
6.2.6.1 says:
"When a value is stored in a member of an object of union type, the bytes of
the object representation that do not correspond to that member but do
correspond to other members take unspecified values."

When you store f0, the value of f1 also becomes unspecified.

Now, as Dan says, compilers essentially do the best they can to make this
"not crazy".
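
The best-effort ordering Dan describes above can be sketched roughly as
follows. This is only a toy model, not LLVM's code: the enum, the function
name combined_alias, and the boolean tbaa_says_disjoint parameter are all
made up for illustration. It assumes the pointer-based analysis has already
produced an answer and that type information is consulted only when that
answer is inconclusive.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result values for this sketch (not LLVM's definitions). */
enum alias_result { NO_ALIAS, MAY_ALIAS, PARTIAL_ALIAS, MUST_ALIAS };

/*
 * basic:              what the pointer-based analysis concluded.
 * tbaa_says_disjoint: whether type information alone would claim that the
 *                     two accesses cannot overlap.
 */
enum alias_result combined_alias(enum alias_result basic,
                                 bool tbaa_says_disjoint)
{
    assert(basic <= MUST_ALIAS);

    /* The pointer analysis proved an overlap, so the type information is
     * treated as unreliable and is never consulted. */
    if (basic == MUST_ALIAS || basic == PARTIAL_ALIAS)
        return basic;

    /* Already proven disjoint; nothing more to learn. */
    if (basic == NO_ALIAS)
        return NO_ALIAS;

    /* Inconclusive: trust the type information. */
    return tbaa_says_disjoint ? NO_ALIAS : MAY_ALIAS;
}
```

The interesting case is the last line: whenever the pointer analysis cannot
prove anything, the type-based answer wins, and that is where the examples
below go wrong.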

But you can come up with arbitrarily complex examples where most TBAA
implementations are going to fall down and claim NoAlias, even though BasicAA
would say MustAlias.  A trivial variation on your example:

union
{
  short f0;
  int f1;
} u, *up = &u;


int foo(short *a, int *b) {
  // Within this function, *a and *b will be considered disjoint by most
  // TBAA implementations that implement "C rules", or things that produce
  // TBAA sets that implement them.
  // This is not a great example, for various reasons, but you can see where
  // it is going.

}

int main(void)
{
  foo(&u.f0, &u.f1);
}


Now place foo in a different translation unit.
Now there is no way the compiler could know they alias.
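
To make the "considered disjoint" claim concrete, here is a toy model of a
scalar TBAA tree. It is only a sketch: the node layout and the names
tbaa_node, tbaa_is_ancestor, and tbaa_may_alias are made up for illustration
and are not LLVM's data structures. It assumes the simple rule described
earlier in the thread, where two accesses can alias only if one type's node
is an ancestor of the other's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy type node: each type points at its parent in the tree.
 * Illustrative only; this is not how LLVM represents TBAA metadata. */
struct tbaa_node {
    const char *name;
    const struct tbaa_node *parent; /* NULL for the root */
};

const struct tbaa_node tbaa_root  = { "root",  NULL };
const struct tbaa_node tbaa_char  = { "char",  &tbaa_root };
const struct tbaa_node tbaa_short = { "short", &tbaa_char };
const struct tbaa_node tbaa_int   = { "int",   &tbaa_char };

/* True if `anc` is `n` itself or lies on n's path up to the root. */
bool tbaa_is_ancestor(const struct tbaa_node *anc, const struct tbaa_node *n)
{
    assert(anc != NULL);
    for (; n != NULL; n = n->parent)
        if (n == anc)
            return true;
    return false;
}

/* Two accesses may alias only if one type is an ancestor of the other.
 * Siblings such as short and int share a root but are declared disjoint. */
bool tbaa_may_alias(const struct tbaa_node *a, const struct tbaa_node *b)
{
    return tbaa_is_ancestor(a, b) || tbaa_is_ancestor(b, a);
}
```

With this model, tbaa_may_alias(&tbaa_short, &tbaa_int) is false, which is
the answer the body of foo would be optimized under, while either type
queried against tbaa_char comes back true. Nothing in the tree records that
short and int were ever members of the same union.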

GCC is a little more conservative here, and says "if we see you place the
two types in a union, we assume pointers to them could alias".

But if you place it in a different TU, it also gives up.

As Dan says, the C standard's "definition" of aliasing is knowingly
deficient in a large number of ways that make it impossible for
compilers to use safely in all cases.
They all do the best they can.