[llvm] [AArch64][SVE] Use SVE for scalar FP converts in streaming[-compatible] functions (PR #112564)
Paul Walker via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 24 06:44:43 PDT 2024
================
@@ -28295,7 +28352,21 @@ SDValue AArch64TargetLowering::LowerToPredicatedOp(SDValue Op,
unsigned NewOp) const {
EVT VT = Op.getValueType();
SDLoc DL(Op);
- auto Pg = getPredicateForVector(DAG, DL, VT);
+ SDValue Pg;
+
+ // FCVTZS_ZPmZ_DtoS and FCVTZU_ZPmZ_DtoS are special cases. These operations
+ // return nxv4i32 rather than the correct nxv2i32, as nxv2i32 is an illegal
+ // unpacked type. So, in this case, we take the predicate size from the
+ // operand.
+ SDValue LastOp{};
+ if ((NewOp == AArch64ISD::FCVTZU_MERGE_PASSTHRU ||
+ NewOp == AArch64ISD::FCVTZS_MERGE_PASSTHRU) &&
----------------
paulwalker-arm wrote:
This is why their definitions take `null_frag` for the `ir_op`: they are not amenable to the stock ISD nodes (which include the target-specific ones that simply add predication).
You need to take a step back and look at the original selection failure when the hacks are removed. I see:
```
LLVM ERROR: Cannot select: t15: nxv4i32 = AArch64ISD::FCVTZS_MERGE_PASSTHRU t13, t9, undef:nxv4i32
t13: nxv4i1 = AArch64ISD::PTRUE TargetConstant:i32<31>
t12: i32 = TargetConstant<31>
t9: nxv2f64 = insert_vector_elt undef:nxv2f64, t2, Constant:i64<0>
t8: nxv2f64 = undef
t2: f64,ch = CopyFromReg t0, Register:f64 %0
t1: f64 = Register %0
t7: i64 = Constant<0>
t14: nxv4i32 = undef
```
This shows `t15` and `t9` having different element counts, which means the DAG is malformed for the reason I highlight in `replaceScalarFPConversionWithSVE()`. First get to the point where the DAG is correct and then let's see what, if any, selection failures occur.
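To make the mismatch concrete, here is a minimal standalone sketch of the element-count check. It is a toy model only: `ScalableVT` and `sameElementCount` are hypothetical names invented for illustration, not LLVM's `EVT`/`MVT` API, and the only facts taken from the dump are the types themselves.

```cpp
// Toy model of a scalable vector type such as nxv4i32 or nxv2f64.
// Hypothetical illustration; this does not use LLVM's EVT/MVT classes.
struct ScalableVT {
  unsigned MinElts; // the N in "nxvN..." (minimum number of lanes)
  unsigned EltBits; // width of each element in bits
};

// A lane-wise convert produces one result lane per source lane, so the
// result and operand types are expected to agree on lane count.
constexpr bool sameElementCount(ScalableVT A, ScalableVT B) {
  return A.MinElts == B.MinElts;
}

// Types that appear in the selection failure.
constexpr ScalableVT NXV4I32{4, 32}; // result type of t15
constexpr ScalableVT NXV4I1{4, 1};   // predicate type of t13
constexpr ScalableVT NXV2F64{2, 64}; // operand type of t9

// The two-lane integer type named in the patch comment, for comparison.
constexpr ScalableVT NXV2I32{2, 32};
```

In this model `sameElementCount(NXV4I32, NXV2F64)` is false, which is the inconsistency visible between `t15` and `t9`. The predicate `t13` agrees with the result type but not with the operand, and a two-lane result type would agree with `t9`.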
https://github.com/llvm/llvm-project/pull/112564