[PATCH] D33866: [DAGCombiner] loosen restriction for creating narrow vector load from extract(wide load)

Mon Jun 5 07:05:22 PDT 2017

spatel added a comment.

In https://reviews.llvm.org/D33866#772755, @niravd wrote:

> It looks like most of the AMDGPU cases fail because:
>
> - TLI.isExtractSubvectorCheap(VT, ExtIdxValue) is not defined for AMDGPU.
> - Legalization breaks sign-/zero-extended vectors into a concat of smaller subvectors.
>
>   The former seems easy for someone who knows AMDGPU to correct.

Actually, I see another way out. I missed this TLI hook:

  // Return true if it is profitable to reduce the given load node to a smaller
  // type.
  //
  // e.g. (i16 (trunc (i32 (load x))) -> i16 load x should be performed
  virtual bool shouldReduceLoadWidth(SDNode *Load,
                                     ISD::LoadExtType ExtTy,
                                     EVT NewVT) const {
    return true;
  }

This hook was originally added for AMDGPU (https://reviews.llvm.org/rL224084), so checking it before narrowing the load should prevent the regressions.
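
To illustrate how a combine could consult a hook like this, here is a minimal standalone sketch. It does not use LLVM's real classes: TargetHooks, PickyTarget, LoadInfo, and canNarrowExtractOfLoad are hypothetical stand-ins, and the 32-bit threshold is an invented policy for the example, not AMDGPU's actual one.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are NOT LLVM's types.
enum class LoadExtType { NonExt, SExt, ZExt };

struct LoadInfo {
  unsigned WidthBits; // width of the original (wide) load
  bool IsSimple;      // assumed: not volatile/atomic
};

// Models a target-lowering interface whose default answer is "yes",
// mirroring the default body of the hook quoted above.
struct TargetHooks {
  virtual ~TargetHooks() = default;
  virtual bool shouldReduceLoadWidth(const LoadInfo &Load, LoadExtType ExtTy,
                                     unsigned NewWidthBits) const {
    (void)Load; (void)ExtTy; (void)NewWidthBits;
    return true;
  }
};

// A made-up target that refuses loads narrower than 32 bits.
struct PickyTarget : TargetHooks {
  bool shouldReduceLoadWidth(const LoadInfo &Load, LoadExtType ExtTy,
                             unsigned NewWidthBits) const override {
    (void)Load; (void)ExtTy;
    return NewWidthBits >= 32;
  }
};

// Sketch of the guard a combine might run before replacing
// extract(wide load) with a narrow load.
bool canNarrowExtractOfLoad(const TargetHooks &TLI, const LoadInfo &Load,
                            unsigned NewWidthBits) {
  if (!Load.IsSimple)
    return false;
  // Only a strictly narrower, non-empty result counts as narrowing.
  if (NewWidthBits == 0 || NewWidthBits >= Load.WidthBits)
    return false;
  // Let the target veto the transform.
  return TLI.shouldReduceLoadWidth(Load, LoadExtType::NonExt, NewWidthBits);
}
```

The point of the sketch is only that the target gets the final say: the generic code checks what it can, then defers to the hook, so a target that dislikes the narrower load can opt out without the combine needing target-specific knowledge.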

https://reviews.llvm.org/D33866