[PATCH] D149903: [VPlan] Replace IR based truncateToMinimalBitwidths with VPlan version.

Mon May 8 08:04:55 PDT 2023

Ayal added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:760
+    VPlan &Plan, const MapVector<Instruction *, uint64_t> &MinBWs) {
+  auto GetType = [](VPValue *Op) {
+    auto *UV = Op->getUnderlyingValue();
----------------
nit: can return the type size in bits, as that is what is needed here. Op >> VPV?

Thought: worth introducing as a member of VPValue, to be overridden by VPWidenCastRecipe? Note that this is Element/Scalar Type. 

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:785
+
+      if (!isa<VPWidenRecipe, VPWidenSelectRecipe, VPWidenCastRecipe>(&R))
+        continue;
----------------
nit: this can be checked first, instead of checking for single defined value.

Thought: could/should each MinBW be attached to its recipe asap - when the latter is created, considering it depends on associated underlying instruction?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:788
+
+      LLVMContext &Ctx = ResTy->getContext();
+
----------------
nit: `auto ResNewTyInBits = I->second;`
nit: `auto ResNewTy = IntegerType::get(ResTy->getContext(), ResNewTyInBits);` ?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:791
+      // Try to replace wider SExt/ZExts with narrower ones if possible.
+      if (auto *VPW = dyn_cast<VPWidenCastRecipe>(&R)) {
+        Instruction *UI = VPW->getUnderlyingInstr();
----------------
nit: suffice to check isa<> and continue to work with R instead of VPW?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:792
+      if (auto *VPW = dyn_cast<VPWidenCastRecipe>(&R)) {
+        Instruction *UI = VPW->getUnderlyingInstr();
+        switch (UI->getOpcode()) {
----------------
UI is aka UV. Better call it UI from the start, as it's an Instruction* rather than Value*.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:798
+        case Instruction::ZExt: {
+          if (UI->getType()->getScalarSizeInBits() > I->second) {
+            if (GetType(VPW->getOperand(0))->getScalarSizeInBits() >= I->second)
----------------
UI->getType() is aka ResTy. Already early-continued if it was equal in size to I->second. Can it be smaller in size than I->second? If so worth early-continuing above, if not worth asserting?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:799
+          if (UI->getType()->getScalarSizeInBits() > I->second) {
+            if (GetType(VPW->getOperand(0))->getScalarSizeInBits() >= I->second)
+              break;
----------------
Operand of SExt/ZExt must be smaller in size than its result, so if result is at most I->second so must its operand be?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:818
+        if (auto *VPW = dyn_cast<VPWidenRecipe>(&R))
+          VPW->dropPoisonGeneratingFlags();
+
----------------
nit: first take care of creating and inserting Shrunk, then take care of R's flags drop and operand set?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:821
+        auto *Shrunk =
+            new VPWidenCastRecipe(Instruction::Trunc, R.getOperand(Idx),
+                                  IntegerType::get(Ctx, I->second));
----------------
R.getOperand(Idx) is aka Op.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:829
+      auto *Ext =
+          new VPWidenCastRecipe(Instruction::ZExt, R.getVPSingleValue(), ResTy);
+      R.getVPSingleValue()->replaceAllUsesWith(Ext);
----------------
nit: define `auto *RVPValue = R.getVPSingleValue()` above?

Would be good to have a common base class for all recipes having a single value, as this amounts to a cast.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:832
+      Ext->setOperand(0, R.getVPSingleValue());
+      Ext->insertAfter(&R);
+    }
----------------
Other insertions of shrunk operands and smaller extends are placed before R; this one is placed after - and calls for make_early_inc_range, right?

================
Comment at: llvm/test/Transforms/LoopVectorize/ARM/pointer_iv.ll:884
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[UGLYGEP:%.*]] = getelementptr i8, ptr [[Z:%.*]], i32 3000
-; CHECK-NEXT:    [[UGLYGEP1:%.*]] = getelementptr i8, ptr [[X:%.*]], i32 3000
-; CHECK-NEXT:    [[BOUND0:%.*]] = icmp ugt ptr [[UGLYGEP1]], [[Z]]
-; CHECK-NEXT:    [[BOUND1:%.*]] = icmp ugt ptr [[UGLYGEP]], [[X]]
+; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i8, ptr [[Z:%.*]], i32 3000
+; CHECK-NEXT:    [[SCEVGEP1:%.*]] = getelementptr i8, ptr [[X:%.*]], i32 3000
----------------
(These UGLY's don't belong to this patch, but probably worth cleaning up, independent of this patch).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149903/new/

https://reviews.llvm.org/D149903