[llvm] [Scalarizer][DirectX] support structs return types (PR #111569)

Thu Oct 17 12:17:50 PDT 2024

================
@@ -667,14 +686,26 @@ bool ScalarizerVisitor::splitBinary(Instruction &I, const Splitter &Split) {
 bool ScalarizerVisitor::isTriviallyScalarizable(Intrinsic::ID ID) {
   if (isTriviallyVectorizable(ID))
     return true;
+  // TODO: Move frexp to isTriviallyVectorizable.
+  // https://github.com/llvm/llvm-project/issues/112408
+  switch (ID) {
+  case Intrinsic::frexp:
+    return true;
+  }
   return Intrinsic::isTargetIntrinsic(ID) &&
          TTI->isTargetIntrinsicTriviallyScalarizable(ID);
 }
 
 /// If a call to a vector typed intrinsic function, split into a scalar call per
 /// element if possible for the intrinsic.
 bool ScalarizerVisitor::splitCall(CallInst &CI) {
-  std::optional<VectorSplit> VS = getVectorSplit(CI.getType());
+  Type *CallType = CI.getType();
+  bool AreAllMatchingVectors = isStructOfMatchingFixedVectors(CallType);
+  std::optional<VectorSplit> VS;
+  if (AreAllMatchingVectors)
+    VS = getVectorSplit(CallType->getContainedType(0));
----------------
tex3d wrote:

There seems to be a subtle potential issue.  `VS` is presumed to be the return vector type for the function.  Later, we use `if (isVectorIntrinsicWithOverloadTypeAtArg(ID, -1))` to determine whether the overload type is based on the return type, and will use this contained type.  This cannot differentiate between the elements of the struct.

Perhaps that's the intended design for this, but if so it should be made explicit, like:
> The overload type may only be dependent on the first member of the returned structure.  In other words, the other members of that returned structure can not contribute to the overload type of the intrinsic.

I think there may be another subtle problem with the component type of the other vector(s) in the returned structure.  What types must they use?  Should they match the overload type, or can they be different?  Perhaps that's a problem for earlier builtin type checking in clang, and not for the code that transforms the intrinsics, but it does seem like an issue that these other outputs are basically not addressable with the current trivially vectorizable scheme.

https://github.com/llvm/llvm-project/pull/111569