[llvm-dev] PartialAlias: different start addresses

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Mon Jul 17 03:21:21 PDT 2017


On 07/17/2017 03:38 AM, Nuno Lopes wrote:
> Quoting Daniel Berlin <dberlin at dberlin.org>:
>
>> On Sun, Jul 16, 2017 at 2:58 PM, Daniel Berlin <dberlin at dberlin.org> 
>> wrote:
>>
>>>
>>>
>>> On Sun, Jul 16, 2017 at 2:34 PM, Nuno Lopes <nunoplopes at sapo.pt> wrote:
>>>
>>>> On Sun, Jul 16, 2017, 12:45 PM Nuno Lopes wrote:
>>>>>>
>>>>>>> On 07/15/2017 04:51 AM, Nuno Lopes wrote:
>>>>>>>
>>>>>>>> On 07/14/2017 04:37 PM, Nuno Lopes wrote:
>>>>>>>>>
>>>>>>>>>> Thank you all for your replies.
>>>>>>>>>> So here seems to be an agreement that the documentation for
>>>>>>>>>> PartialAlias is incorrect.
>>>>>>>>>>
>>>>>>>>>> Daniel: now you got me wondering about MustAlias. This is 
>>>>>>>>>> what the
>>>>>>>>>> docs say:
>>>>>>>>>> "The MustAlias response may only be returned if the two memory
>>>>>>>>>> objects are *guaranteed to always start at exactly the same
>>>>>>>>>> location*"
>>>>>>>>>>
>>>>>>>>>> This statement is regardless of the access sizes.  For 
>>>>>>>>>> example, in
>>>>>>>>>> SCEV AA:
>>>>>>>>>> // If they evaluate to the same expression, it's a MustAlias.
>>>>>>>>>> if (AS == BS)
>>>>>>>>>>  return MustAlias;
>>>>>>>>>>
>>>>>>>>>> AS/BS are scev expressions for the pointers. So no check for the
>>>>>>>>>> access size.
>>>>>>>>>>
>>>>>>>>>> So, does must needs to check for access sizes? If so, SCEV AA is
>>>>>>>>>> buggy and the documentation needs tweaking.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm under the impression that there is code that depends on 
>>>>>>>>> the size
>>>>>>>>> check, but I don't trust my recollection in this regard. SCEV 
>>>>>>>>> AA is
>>>>>>>>> known to cause miscompiles, IIRC, maybe you just found out why ;)
>>>>>>>>>
>>>>>>>>
>>>>>>>> It's true that the CFL AAs have this code:
>>>>>>>> if (LocA.Ptr == LocB.Ptr)
>>>>>>>>  return LocA.Size == LocB.Size ? MustAlias : PartialAlias;
>>>>>>>>
>>>>>>>>
>>>>>>>> I grepped for clients of MustAlias:
>>>>>>>> ~/llvm/lib/Transforms $ grep -Rl MustAlias .
>>>>>>>> ./ObjCARC/ObjCARCOpts.cpp
>>>>>>>> ./ObjCARC/ProvenanceAnalysis.cpp
>>>>>>>> ./Scalar/DeadStoreElimination.cpp
>>>>>>>> ./Scalar/GVN.cpp
>>>>>>>> ./Scalar/LICM.cpp
>>>>>>>> ./Scalar/LoopVersioningLICM.cpp
>>>>>>>> ./Scalar/MemCpyOptimizer.cpp
>>>>>>>> ./Scalar/MergedLoadStoreMotion.cpp
>>>>>>>> ./Scalar/NewGVN.cpp
>>>>>>>> ./Utils/VNCoercion.cpp
>>>>>>>>
>>>>>>>> I glanced over all the uses in these files and I couldn't find any
>>>>>>>> usage that requires sizes to match.  Actually most clients check
>>>>>>>> access sizes themselves. Most don't need equal sizes, just need 
>>>>>>>> one to
>>>>>>>> be smaller than the other.
>>>>>>>>
>>>>>>>
>>>>>>> Does aliasing actually check both ways?
>>>>>>> Otherwise, alias (A, B) will give different results than alias 
>>>>>>> (B, A).
>>>>>>>
>>>>>>
>>>>>> Sorry for the delay.
>>>>>> I'm not sure I understood what you wrote, sorry.  What you wrote is
>>>>>> true in
>>>>>> general, but I don't see how MustAlias in particular is worse 
>>>>>> than the
>>>>>> other
>>>>>> AA results.
>>>>>>
>>>>>
>>>>> Historically, in llvm, we have guaranteed that alias(a, b) 
>>>>> ==alias(b, a)
>>>>>
>>>>> If it does:
>>>>>
>>>>> If start (a) == start (b)
>>>>>   If size(b) < size(a)
>>>>>       Return mustalias
>>>>>   Return may or partial
>>>>>
>>>>> It will give different answers for alias(a, b) and alias (b, a)
>>>>>
>>>>
>>>> Wait, but the size check is done in the clients, not in AA.
>>>> AA only does (according to my understanding):
>>>>
>>>> If start (a) == start (b)
>>>>     Return mustalias
>>>> Return may or partial
>>>>
>>>>
>>> That seems ... wrong to me, but i haven't thought very hard about it.
>>>
>>
>> In particular, this would mean the documentation for partialalias *is*
>> correct :P.
>> IE it will only return partial if the starting addresses are not the 
>> same.
>>
>> I actually would have expected something like:
>>
>> if start(a) == start (b):
>>   if sizes equal:
>>     return mustalias
>>   else:
>>     return partialalias
>> if otherwise overlap
>>    return partial
>> return may
>>
>> That said, i feel like both our existing code is trying to encode too 
>> much
>> info into these simple enum answers.
>>
>> Maybe we should really be having something like:
>>
>> enum aliaskind {
>> MustAlias,  // Exactly the same start and size
>> PartialAliasSameStart,
>> PartialAlias,
>> Mayalias
>> }
>> enum overlapkind {
>> FullOverlap, // Either the accesses are the same size, or one fully 
>> covers
>> the other
>> PartialOverlap, // They partially cover each other
>> Unknown // No idea
>> }
>>
>> std::pair<aliaskind, overlapkind> alias(A, B) {
>> if (start(A) == start(B))
>>   if sizes unknown:
>>      return {PartialAliasSameStart, Unknown}
>>   if sizes equal:
>>      return {MustAlias, FullOverlap}
>>   if one smaller than the other
>>      return {PartialAliasSameStart, FullOverlap}
>> etc
>>
>> Then we don't have to muck around in the definitions of must/partial/may
>> too much, and the clients don't have to do too much when, for 
>> example, the
>> CSE/DSE like ones really want to know either "are these accessing 
>> exactly
>> the same memory location" *or* "can i eliminate this load/store in 
>> favor of
>> of an earlier/later one, maybe with some masking".
>>
>> But maybe this doesn't need cleaning up.
>
> You are the AA expert, not me :)  I was just trying to formalize 
> LLVM's AA stuff and noticed that some things were not very consistent.
> BTW, we need to take into account over-approximation of the analyses. 
> So we cannot/shouldn't say that PartialAlias only happens when 
> addresses are different because this imposes extra burden on the AA: 
> it would have to prove that.

Our general past position is that AA may return MustAlias and 
PartialAlias only when it can prove the preconditions. Otherwise, it 
should return MayAlias. We've certainly had bugs in the past in this 
area (i.e. where AA returns MustAlias or PartialAlias when it didn't 
properly prove the preconditions), but if clients are to be able to use 
these return values meaningfully, I think we need to keep it this way 
(i.e. if we don't require AA to prove the preconditions, what's the point?).

  -Hal

>   I believe it's better if PartialAlias doesn't carry any information 
> about the addresses. That way MustAlias implies PartialAlias.
> Anyway, this depends on what clients actually need. I don't have 
> enough experience in this area to make any judgement about that; I'll 
> leave that with you :)
>
> Nuno
>

-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



More information about the llvm-dev mailing list