[llvm-commits] [pr12979][patch/rfc] Clear nsw/nuw in gvn

Daniel Berlin dberlin at dberlin.org
Fri Jun 1 18:00:08 PDT 2012


On Fri, Jun 1, 2012 at 8:16 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote:
>> On Fri, Jun 1, 2012 at 6:39 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote:
>>>
>>> Hi,
>>>
>>>> On 30 May 2012 12:16, Rafael Espíndola <rafael.espindola at gmail.com>
>>>> wrote:
>>>>>>
>>>>>> This seems right.  For range metadata you should form the union of the
>>>>>> ranges I guess.
>>>>>
>>>>>
>>>>> Implemented.
>>>>
>>>>
>>>> I have rebased it now that the verifier enforces that the range is in
>>>> a canonical form. I have also fixed corner cases like a merge forming
>>>> the full set.
>>>
>>>
>>>
>>> Although I'm arriving a bit late to the discussion, let me just write
>>> my thoughts on this.
>>> I think that with range metadata we can be more aggressive. For
>>> example, if you have the following:
>>> %a = load i32* %p, !range !1
>>> %b = load i32* %p, !range !2
>>>
>>> I think the resulting range should be the *intersection*, and not the
>>> union of the ranges.  Someone says: "I'm sure this value is between 0
>>> and 5", and someone else says "I'm sure that value is between 3 and
>>> 6".  Then we know that the value must be between 3 and 5;  we don't
>>> need to expand our beliefs to be between 0 and 6.
>>> (of course this reasoning assumes that ranges are always conservative,
>>> which must be the case, anyway)
>>>
>> This is right.
>>
>>> For TBAA info, I think the same reasoning applies. I think we can pick
>>> the strongest aliasing information.
>>
>>
>> This is wrong :)
>> First, there is no such thing as "strongest". TBAA is a tree
>> structure, and all your ancestors and descendants may-alias.
>> At least in C, the accesses should be through things that are in the
>> same tree anyway.
>> Choosing different members of that same TBAA subtree to represent the
>> sets would not change anything.
>>
>> Setting that aside, neither LLVM nor C++, for that matter, guarantees
>> that the same *place in memory* is never referred to with disjoint
>> TBAA types.  C++, for example, has placement new, which allows the
>> type to change.
>>
>> So it's perfectly legal (at least by the docs) to have a TBAA tree like
>> this:
>>
>>
>>     0
>>    / \
>>   1   2
>>   |
>>   3
>> and
>>
>> %a = load i32* %p, !tbaa !1
>> %b = load i32* %p, !tbaa !2
>>
>> While we can eliminate the second load, it would be quite wrong to say
>> that suddenly this !1 load over here is, for alias analysis purposes,
>> a !2 load because !2 is "stronger" since its subtree is smaller.  They
>> aren't in the same TBAA subtree, and you would suddenly be saying that
>> a whole class of dependents you don't know about doesn't conflict.
>>
>> In short: TBAA is not may-alias info, it's type info. If it was
>> may-alias info, yes, you could have "anti-alias sets" and say that
>> something didn't conflict with something else.  But without knowing
>> the original program language semantics, you can't go willy-nilly
>> changing the tbaa info :)
>
>
> You're very right!  I was confusing this TBAA business.
> So as far as I understand, the optimal solution is to find the lowest common
> ancestor, e.g.:
>
>     0
>    /
>   1
>  / \
> 2   3
>
> merging !2 and !3 should yield !1  (meaning it may alias the types in
> both the !2 and !3 sub-trees).
>
>
> In the (unlikely) case they don't have the same root, then something is
> really wrong, and probably it's some sort of undefined behavior.

So, that would be nice behavior, but sadly, the language guide says:
"
The second field identifies the type's parent node in the tree, or is
null or omitted for a root node. A type is considered to alias all of
its descendants and all of its ancestors in the tree. Also, a type is
considered to alias all types in other trees, so that bitcode produced
from multiple front-ends is handled conservatively."

Combine this with inlining across modules, and it seems two loads with
TBAA nodes in different trees can reasonably happen.
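
As a rough illustration of the rule quoted above, here is a minimal
Python sketch. It is not LLVM code: the `parent` map and `may_alias`
are hypothetical names. Nodes 0-3 follow the first tree diagram in this
thread, and nodes 10 and 11 are an invented second tree standing in for
bitcode from another front-end.

```python
# Hypothetical model of TBAA type trees: each node maps to its parent,
# and a root maps to None. Nodes 0-3 mirror the first diagram in the
# thread (1 and 2 under 0, 3 under 1); nodes 10 and 11 are an invented
# second tree.
parent = {0: None, 1: 0, 2: 0, 3: 1, 10: None, 11: 10}


def ancestors(node):
    """Return the chain from node up to its root, inclusive."""
    chain = []
    while node is not None:
        chain.append(node)
        node = parent[node]
    return chain


def may_alias(a, b):
    """Apply the quoted rule: ancestors, descendants, and other trees alias."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    if chain_a[-1] != chain_b[-1]:
        return True  # different trees: handled conservatively
    return a in chain_b or b in chain_a
```

Under this model !1 and !2 do not alias, !1 and !3 do, and anything in
the first tree aliases anything in the second.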

Realistically, there needs to be some way to merge TBAA trees anyway,
at least for bitcode files from the same language; otherwise, you lose
a lot of the usefulness.
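
For the two loads that GVN combines, one conservative way to pick a
merged tag could look like the sketch below. It is only an illustration
in Python, not a proposed patch: `merge_tbaa` and the `parent` map are
hypothetical, and returning `None` (that is, dropping the tag) when the
nodes share no ancestor is an assumption about what "conservative"
would mean here. Nodes 0-3 follow Nuno's diagram; 10 and 11 are an
invented second tree.

```python
# Hypothetical parent map: 2 and 3 sit under 1, and 1 sits under root 0,
# as in Nuno's diagram. Nodes 10 and 11 are an invented second tree.
parent = {0: None, 1: 0, 2: 1, 3: 1, 10: None, 11: 10}


def merge_tbaa(a, b):
    """Return the lowest common ancestor of a and b, or None if none exists."""
    seen = set()
    node = a
    while node is not None:  # record a and all of its ancestors
        seen.add(node)
        node = parent[node]
    node = b
    while node is not None:  # walk up from b until we hit a's chain
        if node in seen:
            return node
        node = parent[node]
    return None  # different trees: assume the merged load keeps no tag
```

With this map, merging !2 and !3 gives !1, while merging !2 with a node
from the other tree gives no tag at all.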



