[llvm] Fix scalar overload name constructed by ReplaceWithVeclib.cpp (PR #111095)

Mon Oct 7 20:38:32 PDT 2024

================
@@ -100,27 +100,38 @@ static void replaceWithTLIFunction(IntrinsicInst *II, VFInfo &Info,
 static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI,
                                     IntrinsicInst *II) {
   assert(II != nullptr && "Intrinsic cannot be null");
+  Intrinsic::ID IID = II->getIntrinsicID();
+  if (IID == Intrinsic::not_intrinsic)
+    return false;
   // At the moment VFABI assumes the return type is always widened unless it is
   // a void type.
   auto *VTy = dyn_cast<VectorType>(II->getType());
   ElementCount EC(VTy ? VTy->getElementCount() : ElementCount::getFixed(0));
+  Type *ScalarRetTy = II->getType()->getScalarType();
   // Compute the argument types of the corresponding scalar call and check that
   // all vector operands match the previously found EC.
   SmallVector<Type *, 8> ScalarArgTypes;
-  Intrinsic::ID IID = II->getIntrinsicID();
+
+  // OloadTys collects types used in scalar intrinsic overload name.
+  SmallVector<Type *, 3> OloadTys;
+  if (!ScalarRetTy->isVoidTy() &&
+      isVectorIntrinsicWithOverloadTypeAtArg(IID, -1))
+    OloadTys.push_back(ScalarRetTy);
+
   for (auto Arg : enumerate(II->args())) {
     auto *ArgTy = Arg.value()->getType();
-    if (isVectorIntrinsicWithScalarOpAtArg(IID, Arg.index())) {
-      ScalarArgTypes.push_back(ArgTy);
-    } else if (auto *VectorArgTy = dyn_cast<VectorType>(ArgTy)) {
-      ScalarArgTypes.push_back(VectorArgTy->getElementType());
+    auto *ScalarArgTy = ArgTy->getScalarType();
+    ScalarArgTypes.push_back(ScalarArgTy);
+    if (isVectorIntrinsicWithOverloadTypeAtArg(IID, Arg.index()))
+        OloadTys.push_back(ScalarArgTy);
+    if (auto *VectorArgTy = dyn_cast<VectorType>(ArgTy)) {
----------------
tex3d wrote:

> before it was isVectorIntrinsicWithScalarOpAtArg(IID, Arg.index()) is true and VectorArgTy is not NULL

That's not quite right.  It was if `isVectorIntrinsicWithScalarOpAtArg` is false and VectorArgTy is not NULL.

The difference is that before, if `isVectorIntrinsicWithScalarOpAtArg` returned true and you had a vector arg type, we would have added the vector arg type instead of a scalar type to the overload and the function signature.  This would be a vector arg present in the scalar form of the intrinsic that doesn't scale with vectorization of the intrinsic.

I don't think the trivially vectorizable intrinsic path is meant to handle that case in the first place, so I didn't think this code is meant to do so either.  So the assumption was made that we only need to add the scalar types to these sets, which simplified the code a bit.

If you think we do want to handle that case, I can change the code back to adding the (potentially) vector argument type when `isVectorIntrinsicWithScalarOpAtArg` is true.

I can think of a sort of intrinsic that could do that: a resource load or store intrinsic with a vector for coordinates that doesn't match the vector dimension for the width of the load or store.  Would we want these to be considered trivially vectorizable intrinsics?

https://github.com/llvm/llvm-project/pull/111095