[PATCH] D157525: [LV][AArch64] Enable scalable vectorization of loops that contain FREM instructions

Thu Aug 10 02:37:48 PDT 2023

huntergr added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7224-7238
+    // Certain instructions can be cheaper to vectorize if they have a constant
+    // second vector operand. One example of this are shifts on x86.
+    Value *Op2 = I->getOperand(1);
+    auto Op2Info = TTI.getOperandInfo(Op2);
+    if (Op2Info.Kind == TargetTransformInfo::OK_AnyValue &&
+        Legal->isInvariant(Op2))
+      Op2Info.Kind = TargetTransformInfo::OK_UniformValue;
----------------
I don't think we need to copy this part from the above -- it's very unlikely that we'll have a cheaper version of frem with an invariant second operand.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7247-7252
+        if (Ty->isFloatTy())
+          ScalarFnName = TLI->getName(LibFunc_fmodf);
+        else if (Ty->isDoubleTy())
+          ScalarFnName = TLI->getName(LibFunc_fmod);
+        else
+          return InstructionCost::getInvalid();
----------------
I think we want to check TargetTransformInfo for a cost instead of just returning invalid if there's no vector function available. While AArch64 doesn't have a hardware frem implementation, there may be another architecture which does and we don't want to prevent that from being used.

Gathering costs for a vector/scalarized frem instruction vs. a library call and returning the cheapest cost would be a good plan.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157525/new/

https://reviews.llvm.org/D157525