[PATCH] D64432: [InstCombine] try to narrow a truncated load

Fri Jul 12 10:31:16 PDT 2019

spatel added a reviewer: jdoerfert.
spatel added a comment.

In D64432#1582980 <https://reviews.llvm.org/D64432#1582980>, @reames wrote:

> I'm not sure that doing this at the IR level is the best idea.  The problem is that when we narrow, we lose the dereferenceable fact about part of the memory access.  This can in turn limit other transforms which would have been profitable.  As an example:
>  a = load <2 x i8>* p
>  b = load <2 x i8>* (p+1)
>  sum = a[0] + a[1] + b[1]
>
> Narrowing the b load to i8 loses the fact that the memory location corresponding to b[0] is dereferenceable, which would prevent transforms such as:
>  a = load <4 x i8>* p
>  a[2] = 0;
>  sum = horizontal_sum(a);
>
> (Note: I'm not saying this alternate transform is always profitable.  I'm just making a point about lost opportunity.)

Yes, I agree that we can lose information by narrowing. I was hoping to avoid that conflict with D64258 <https://reviews.llvm.org/D64258>, but we need to refine our definition of 'dereferenceable'. Potentially, we could have instcombine preserve the dereferenceable range in this transform?

I don't know if it makes a difference, but my intent in this patch is to disallow narrowing for vectors by relying on the data layout legality check. (We could make that vector bailout explicit.) So I don't think the given example with a vector type is at risk.

  define i8 @narrowload(<2 x i8>* %p) {
    %a = load <2 x i8>, <2 x i8>* %p
    %p1 = getelementptr <2 x i8>, <2 x i8>* %p, i64 1
    %b = load <2 x i8>, <2 x i8>* %p1
    %a0 = extractelement <2 x i8> %a, i64 0
    %a1 = extractelement <2 x i8> %a, i64 1
    %b1 = extractelement <2 x i8> %b, i64 1
    %add1 = add i8 %a0, %a1
    %add2 = add i8 %add1, %b1
    ret i8 %add2
  }
  ; sum = a[0] + a[1] + b[1]
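
For contrast, here is a rough sketch of the scalar case that this kind of narrowing is aimed at. It assumes a little-endian target where i8 is a legal integer type in the data layout; the function names are made up for illustration, and the exact output of the patch may differ:

  define i8 @trunc_load(i32* %p) {
    %wide = load i32, i32* %p
    %t = trunc i32 %wide to i8
    ret i8 %t
  }

  ; possible form after narrowing (little-endian, so the low byte is at offset 0)
  define i8 @trunc_load_narrowed(i32* %p) {
    %p8 = bitcast i32* %p to i8*
    %t = load i8, i8* %p8
    ret i8 %t
  }

The i32 load implies that 4 bytes at %p are dereferenceable; the i8 load only implies 1, which is the information loss discussed above.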

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64432/new/

https://reviews.llvm.org/D64432