[PATCH] D114198: [GlobalISel] Rework more/fewer elements for vectors

Mon Dec 20 14:31:21 PST 2021

arsenm added a comment.

Thanks, this mostly looks less ugly than before

================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:995
+  case TargetOpcode::G_FREEZE: {
+    if (TypeIdx != 0)
+      return UnableToLegalize;
----------------
Why this change? I thought this previously handled arbitrary cases, and this doesn't?

================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:3634

-LegalizerHelper::LegalizeResult LegalizerHelper::fewerElementsVectorImplicitDef(
-    MachineInstr &MI, unsigned TypeIdx, LLT NarrowTy) {
-  Register DstReg = MI.getOperand(0).getReg();
-  LLT DstTy = MRI.getType(DstReg);
-  LLT LCMTy = getLCMType(DstTy, NarrowTy);
+namespace {

----------------
I think the prevailing style is to use static functions instead of anonymous namespaces

================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:3647
+    std::initializer_list<unsigned> NonVecOpIndices) {
+  if (MI.getNumMemOperands() != 0)
+    return false;
----------------
Not sure what the point of this check is

================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:3825
+  for (unsigned i = 0; i < OrigNumElts / NumElts + NumLeftovers; ++i) {
+    auto Phi = MIRBuilder.buildInstr(MI.getOpcode());
+    Phi.addDef(
----------------
Might as well hardcode G_PHI

================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:6600
+  if (mi_match(Idx, MRI, m_ICst(IdxVal)) && IdxVal <= NumElts) {
+    SmallVector<Register, 2> SrcRegs;
+    extractParts(SrcVec, EltTy, NumElts, SrcRegs);
----------------
Use the default small size?

================
Comment at: llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:6870
-      auto IdxK = MIRBuilder.buildConstant(IdxTy, ExtractIdx);
-      auto Extract = MIRBuilder.buildExtractVectorElement(EltTy, SrcVec, IdxK);
-      BuildVec.push_back(Extract.getReg(0));
----------------
I think it's more appropriate to lower the vector operation in terms of G_EXTRACT_VECTOR_ELT and not direct to the register extracts, G_SHUFFLE_VECTOR is not a legalization artifact

================
Comment at: llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp:231
+  auto Unmerge = buildUnmerge(Op0Ty.getElementType(), Op0);
+  SmallVector<Register, 4> Regs;
+  for (auto Op : Unmerge.getInstr()->defs())
----------------
4 seems too small, use the default or 8?

================
Comment at: llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp:253
+
+  SmallVector<Register, 4> Regs;
+  auto Unmerge = buildUnmerge(Op0Ty.getElementType(), Op0);
----------------
Ditto

================
Comment at: llvm/lib/CodeGen/GlobalISel/Utils.cpp:938
+  LLT EltTy = OrigTy.getElementType();
+  return (NumElts == 1) ? EltTy : LLT::fixed_vector(NumElts, EltTy);
+}
----------------
I'm not sure what to do about scalable vectors, but no legalization has been added for those as far as I know

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:2248-2251
+  if (IdxVal < VecTy.getNumElements()) {
+    auto Unmerge = B.buildUnmerge(EltTy, Vec);
+    B.buildCopy(Dst, Unmerge.getReg(IdxVal));
+  } else
----------------
Can you split this into a separate patch? I know I have one that does the same thing somewhere

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:2282

-  if (IdxVal < VecTy.getNumElements())
-    B.buildInsert(Dst, Vec, Ins, IdxVal * EltTy.getSizeInBits());
-  else
-    B.buildUndef(Dst);
+  unsigned NumElts = VecTy.getNumElements();
+  if (IdxVal < NumElts) {
----------------
Can you also split this, as it's not dependent on the larger legalizer changes or new builder functions

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:2284
+  if (IdxVal < NumElts) {
+    SmallVector<Register, 2> SrcRegs;
+    for (unsigned i = 0; i < NumElts; ++i)
----------------
2 seems too small, most vectors are probably 4 or 8

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4696-4712
+    if (IsTFE) {
+      if (ResultRegs.size() == 1) {
+        NewResultReg = ResultRegs[0];
+      } else if (ResultRegs.size() == 2) {
+        LLT V4S16 = LLT::fixed_vector(4, 16);
+        NewResultReg = B.buildConcatVectors(V4S16, ResultRegs).getReg(0);
+      } else {
----------------
This is a lot longer, is there a simpler way?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114198/new/

https://reviews.llvm.org/D114198