[PATCH] D21246: [InstCombine] Add rule to fold away trunc of partial load

Sanjoy Das via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 22 11:25:35 PDT 2016


sanjoy added inline comments.

================
Comment at: include/llvm/Analysis/Loads.h:82
@@ +81,3 @@
+/// This function returns value available at memory address *Ptr of type AccessTy.
+/// All sanity checks to confirm if this function should be called and whether
+/// the value returned can be used should be done at the caller.
----------------
sanjoy wrote:
> anna wrote:
> > sanjoy wrote:
> > > I think "sanity checks" is too vague to be useful.  I think it is sufficient to state that we've assumed that we're folding a non-volatile unordered (but potentially atomic) load -- it is understood that people calling this function know what they want. :)
> > I think we will miss some subtle points in that case, and will only be considering one usage of this function. 
> > Basically, we do not make any assumptions on the memory operation, i.e. it need not be unordered non-volatile. This constraint is only required for the call from the `FindAvailableLoadedValue(LoadInst *Load,...)`. We do not need this constraint for the call from `InstCombiner::visitTrunc`, since we do not fold away the load.
> > Also, the transformation using the value returned, can be a fold of the load as done in most cases, or fold of the trunc and nothing done to the load.
> > 
> > The idea is that it is up to the caller to decide what sort of memory operations they are interested in (may or may not be an unordered load) and what transformation needs to be done based on the value (may or may not be folding of load).
> > 
> > I thought this would be a very detailed explanation, and decided to state it as 'sanity checks' :)
> > We do not need this constraint for the call from
> > InstCombiner::visitTrunc, since we do not fold away the load.
> 
> I'm not sure if we are okay folding volatile loads, but I realized
> (just now) that we're not okay folding atomic loads in
> `visitTrunc`.  Our program could have been:
> 
> ```
> %ptr is initially 0
> 
> ThreadA:
>   store atomic i32 1, i32* %ptr
> 
> ThreadB:
>   %prevva = load atomic i16, (bitcast %ptr)
>   %val = load atomic i32, i32* %ptr
>   print(%val)
>   print(trunc(%val))
> ```
> 
> Right now the program above can print `0, 0` or `1, 1`, but after your
> change, `ThreadB` will be:
> 
> 
> ```
> ThreadB:
>   %prevva = load atomic i16, (bitcast %ptr)
>   %val = load atomic i32, i32* %ptr
>   print(%val)
>   print(%prevva)
> ```
> 
> and will alllow the execution `1, 0` and `0, 1`.  I think we can fold
> atomic loads only if the load has `trunc` as the only use.
That example was not correct, to be precise, this is what I'm talking
about:

```
%ptr is 0 initially

ThreadA:
  store atomic i32 -1, i32* %ptr

ThreadB:
  %val0 = load atomic i16, i16* (bitcast %ptr)
  %val1 = load atomic i32, i32* %ptr
  %val2 = trunc i32 %val1 to i16
  print(%val1, %val2)
```

Now `ThreadB` can print only "0, 0" or "-1, -1", but after you replace
%val2 with %val0, it will be possible for it to print "0, -1" or "-1,
0".


http://reviews.llvm.org/D21246





More information about the llvm-commits mailing list