[PATCH] D21246: [InstCombine] Add rule to fold away trunc of partial load

Anna Thomas via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 22 12:17:52 PDT 2016


anna added inline comments.

================
Comment at: include/llvm/Analysis/Loads.h:82
@@ +81,3 @@
+/// This function returns value available at memory address *Ptr of type AccessTy.
+/// All sanity checks to confirm if this function should be called and whether
+/// the value returned can be used should be done at the caller.
----------------
sanjoy wrote:
> sanjoy wrote:
> > anna wrote:
> > > sanjoy wrote:
> > > > I think "sanity checks" is too vague to be useful.  I think it is sufficient to state that we've assumed that we're folding a non-volatile unordered (but potentially atomic) load -- it is understood that people calling this function know what they want. :)
> > > I think we will miss some subtle points in that case, and will only be considering one usage of this function. 
> > > Basically, we do not make any assumptions on the memory operation, i.e. it need not be unordered non-volatile. This constraint is only required for the call from the `FindAvailableLoadedValue(LoadInst *Load,...)`. We do not need this constraint for the call from `InstCombiner::visitTrunc`, since we do not fold away the load.
> > > Also, the transformation using the value returned, can be a fold of the load as done in most cases, or fold of the trunc and nothing done to the load.
> > > 
> > > The idea is that it is up to the caller to decide what sort of memory operations they are interested in (may or may not be an unordered load) and what transformation needs to be done based on the value (may or may not be folding of load).
> > > 
> > > I thought this would be a very detailed explanation, and decided to state it as 'sanity checks' :)
> > > We do not need this constraint for the call from
> > > InstCombiner::visitTrunc, since we do not fold away the load.
> > 
> > I'm not sure if we are okay folding volatile loads, but I realized
> > (just now) that we're not okay folding atomic loads in
> > `visitTrunc`.  Our program could have been:
> > 
> > ```
> > %ptr is initially 0
> > 
> > ThreadA:
> >   store atomic i32 1, i32* %ptr
> > 
> > ThreadB:
> >   %prevva = load atomic i16, (bitcast %ptr)
> >   %val = load atomic i32, i32* %ptr
> >   print(%val)
> >   print(trunc(%val))
> > ```
> > 
> > Right now the program above can print `0, 0` or `1, 1`, but after your
> > change, `ThreadB` will be:
> > 
> > 
> > ```
> > ThreadB:
> >   %prevva = load atomic i16, (bitcast %ptr)
> >   %val = load atomic i32, i32* %ptr
> >   print(%val)
> >   print(%prevva)
> > ```
> > 
> > and will alllow the execution `1, 0` and `0, 1`.  I think we can fold
> > atomic loads only if the load has `trunc` as the only use.
> That example was not correct, to be precise, this is what I'm talking
> about:
> 
> ```
> %ptr is 0 initially
> 
> ThreadA:
>   store atomic i32 -1, i32* %ptr
> 
> ThreadB:
>   %val0 = load atomic i16, i16* (bitcast %ptr)
>   %val1 = load atomic i32, i32* %ptr
>   %val2 = trunc i32 %val1 to i16
>   print(%val1, %val2)
> ```
> 
> Now `ThreadB` can print only "0, 0" or "-1, -1", but after you replace
> %val2 with %val0, it will be possible for it to print "0, -1" or "-1,
> 0".
Good catch! I initially thought `FindAvailableLoadedValue` atomicity check may handle the case you've written: ` if (LI->isAtomic() < isAtomicMemOp) return nullptr`, but here both loads are atomic, and this would cause incorrect results only in `visitTrunc`. 

I'll have this check before deciding if `visitTrunc` can call this function (include volatile check for safety). 

```
if (auto *LI = dyn_cast<LoadInst>(Src)) { // source operand of trunc is load
if(!LI->isVolatile() && (!LI->isAtomic() || LI->hasOneUse()) )
   call FindAvailableLoadedValue
}
```
and add above example within `theadB` as testcase to see trunc not removed.



http://reviews.llvm.org/D21246





More information about the llvm-commits mailing list