[llvm] [AArch64][SVE] Use SVE for scalar FP converts in streaming[-compatible] functions (PR #112564)

Thu Oct 24 06:23:21 PDT 2024

================
@@ -28295,7 +28352,21 @@ SDValue AArch64TargetLowering::LowerToPredicatedOp(SDValue Op,
                                                    unsigned NewOp) const {
   EVT VT = Op.getValueType();
   SDLoc DL(Op);
-  auto Pg = getPredicateForVector(DAG, DL, VT);
+  SDValue Pg;
+
+  // FCVTZS_ZPmZ_DtoS and FCVTZU_ZPmZ_DtoS are special cases. These operations
+  // return nxv4i32 rather than the correct nxv2i32, as nxv2i32 is an illegal
+  // unpacked type. So, in this case, we take the predicate size from the
+  // operand.
+  SDValue LastOp{};
+  if ((NewOp == AArch64ISD::FCVTZU_MERGE_PASSTHRU ||
+       NewOp == AArch64ISD::FCVTZS_MERGE_PASSTHRU) &&
----------------
MacDue wrote:

I think there is an issue with the definitions of `FCVTZS_ZPmZ_DtoS`, `FCVTZU_ZPmZ_DtoS`, `SCVTF_ZPmZ_StoD`, and `UCVTF_ZPmZ_StoD`. They are inconsistent with all the other conversion instructions: `nxv2i32` is an illegal type (unlike `nxv2f32`), so they're specified to use `nxv4i32`, which means the result and operand element counts do not match (or vice versa).
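
To make the mismatch concrete, here is a rough standalone sketch (not LLVM code). `ScalableVT`, `minBits`, and `sameEltCount` are hypothetical names invented for illustration; the struct only records what a type name like `nxv2f64` encodes, and the operand type `nxv2f64` for the `DtoS` patterns is assumed from the naming.

```cpp
#include <cassert>

// Hypothetical stand-in for a scalable vector type such as nxv2f64.
// This is NOT LLVM's EVT/MVT; it only records what the type name encodes.
struct ScalableVT {
  unsigned MinElts; // the N in nxvN<elt>
  unsigned EltBits; // element width in bits
};

constexpr ScalableVT nxv2f64{2, 64}; // assumed operand type of the DtoS patterns
constexpr ScalableVT nxv2i32{2, 32}; // unpacked integer type, illegal
constexpr ScalableVT nxv4i32{4, 32}; // packed type the patterns use instead

// Minimum size in bits (scaled by vscale at run time).
constexpr unsigned minBits(ScalableVT VT) { return VT.MinElts * VT.EltBits; }

// True when result and operand have the same minimum lane count.
constexpr bool sameEltCount(ScalableVT Result, ScalableVT Operand) {
  return Result.MinElts == Operand.MinElts;
}

// A lane-for-lane f64 -> i32 convert would keep the element count ...
static_assert(sameEltCount(nxv2i32, nxv2f64), "lane counts match");
// ... but with nxv4i32 as the result type the counts differ.
static_assert(!sameEltCount(nxv4i32, nxv2f64), "lane counts differ");
// nxv4i32 fills a 128-bit granule; nxv2i32 would fill only half of one.
static_assert(minBits(nxv4i32) == 128 && minBits(nxv2i32) == 64, "sizes");
```

Under these assumptions, a predicate derived from the result type would cover four lanes per granule while the operand only has two, which is presumably why the hunk above takes the predicate size from the operand.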

https://github.com/llvm/llvm-project/pull/112564