[PATCH] D136922: [AMDGPU][GISel] Widen s16 SHUFFLE_VECTOR where there are no scalar pack insts

Wed Nov 2 02:10:38 PDT 2022

Pierre-vh added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:1555
+  if (!ST.hasScalarPackInsts())
+    ShuffleVector.minScalarOrElt(0, S32);
+  ShuffleVector.lower();
----------------
arsenm wrote:
> I thought you were moving towards not using shuffle_vector for packed cases too.
> 
> Plus the packed cases still need handling for non-16 bit elements 
I'm not sure I understand. 

With D135145, the goal is to have a global combine replace insert_vector_elt with shuffle_vector in most cases, including packed ones.
This patch exists because D135145 alone would cause a lot of regressions across the codebase for the no-pack-insts case, due to differences in how insert_vector_elt and shuffle_vector are lowered (the shuffle lowering involves a lot more bit manipulation). If we widen the shuffle vectors before lowering them for those targets, codegen seems much better.
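
To illustrate why widening is safe, here is a rough standalone model. It is not the legalizer code and uses no LLVM API. It assumes the widening can be modeled as "extend each 16-bit element to 32 bits, shuffle, then truncate". All names (`shuffle`, `widen`, `narrow`, `shuffleViaWidening`) are hypothetical, and undef mask entries are modeled as 0 purely for the sake of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of a vector shuffle: out[i] = concat(a, b)[mask[i]].
// A negative mask entry stands for "undef"; this sketch models it as 0.
template <typename T>
std::vector<T> shuffle(const std::vector<T> &a, const std::vector<T> &b,
                       const std::vector<int> &mask) {
  std::vector<T> out;
  out.reserve(mask.size());
  for (int idx : mask) {
    if (idx < 0) {
      out.push_back(T{0});
      continue;
    }
    const std::size_t i = static_cast<std::size_t>(idx);
    assert(i < a.size() + b.size() && "mask index out of range");
    out.push_back(i < a.size() ? a[i] : b[i - a.size()]);
  }
  return out;
}

// Extend every 16-bit element to 32 bits (zero-extension is an assumption;
// any extension works because the high bits are dropped again below).
std::vector<std::uint32_t> widen(const std::vector<std::uint16_t> &v) {
  return std::vector<std::uint32_t>(v.begin(), v.end());
}

// Truncate every 32-bit element back to 16 bits.
std::vector<std::uint16_t> narrow(const std::vector<std::uint32_t> &v) {
  std::vector<std::uint16_t> out;
  out.reserve(v.size());
  for (std::uint32_t x : v)
    out.push_back(static_cast<std::uint16_t>(x));
  return out;
}

// Shuffle on the widened element type, then narrow the result.
std::vector<std::uint16_t> shuffleViaWidening(const std::vector<std::uint16_t> &a,
                                              const std::vector<std::uint16_t> &b,
                                              const std::vector<int> &mask) {
  return narrow(shuffle(widen(a), widen(b), mask));
}
```

In this model, `shuffleViaWidening(a, b, mask)` returns the same elements as `shuffle(a, b, mask)` for any in-range mask. The difference is that each element occupies its own 32-bit slot during the shuffle, so selecting an element is a plain move rather than a shift-and-mask on a packed pair.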

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136922/new/

https://reviews.llvm.org/D136922