[llvm] [AArch64] Combine signext_inreg of setcc(... != splat(0)) (PR #157665)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 9 08:06:14 PDT 2025
================
@@ -26097,6 +26097,17 @@ static SDValue performSetCCPunpkCombine(SDNode *N, SelectionDAG &DAG) {
return SDValue();
}
+static bool isSignExtInReg(const SDValue &V) {
+ if (V.getOpcode() != AArch64ISD::VASHR ||
----------------
david-arm wrote:
This feels quite late in the pipeline if we're relying upon AArch64 ISD nodes.
When lowering ctz_v16i1 I see this in the debug output:
```
Type-legalized selection DAG: %bb.0 'ctz_v16i1:'
SelectionDAG has 19 nodes:
t0: ch,glue = EntryToken
t12: nxv16i1 = AArch64ISD::PTRUE TargetConstant:i32<9>
t2: v16i8,ch = CopyFromReg t0, Register:v16i8 %0
t23: v16i8 = sign_extend_inreg t2, ValueType:ch:v16i1
t15: nxv16i8 = insert_subvector undef:nxv16i8, t23, Constant:i64<0>
t17: nxv16i8 = splat_vector Constant:i32<0>
t19: nxv16i1 = AArch64ISD::SETCC_MERGE_ZERO t12, t15, t17, setne:ch
t20: i64 = AArch64ISD::CTTZ_ELTS t19
t21: i32 = truncate t20
t8: ch,glue = CopyToReg t0, Register:i32 $w0, t21
t9: ch = AArch64ISD::RET_GLUE t8, Register:i32 $w0, t8:1
```
DAGCombiner runs again immediately after this type-legalised DAG, which suggests that you can do this optimisation earlier and match the SIGN_EXTEND_INREG node instead of AArch64ISD::VASHR. In theory you should be able to make the codegen even better, essentially by doing:
```
// setcc_merge_zero(
//   pred, insert_subvector(undef, signext_inreg(vNi1 x), 0), != splat(0))
// => setcc_merge_zero(
//   pred, insert_subvector(undef, x, 0), != splat(0))
```
That way I think you can also get rid of the remaining shl instruction, which is unnecessary too.
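To make the intent of the fold above concrete, here is a toy scalar model of it. This is not LLVM code: `Lanes`, `Mask`, `signExtInRegI1`, `compareNeZero`, `foldPreservesResult` and `demo` are hypothetical names invented for illustration, and the lanes are plain bytes rather than SelectionDAG values. The sketch assumes that sign_extend_inreg from i1 replicates bit 0 across the lane. Under that assumption, dropping it before the `!= splat(0)` compare is harmless whenever the bits above bit 0 are either zero or copies of bit 0. Whether the promoted vNi1 input guarantees that in the real DAG is not established here and would need checking.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a v16i8 value and a 16-lane predicate.
using Lanes = std::array<std::uint8_t, 16>;
using Mask = std::array<bool, 16>;

// Models sign_extend_inreg from i1 on an i8 lane:
// bit 0 is replicated into every bit of the lane.
inline std::uint8_t signExtInRegI1(std::uint8_t Lane) {
  return (Lane & 1u) ? 0xFF : 0x00;
}

// Models the "!= splat(0)" compare, optionally applying the
// sign_extend_inreg to each lane first.
inline Mask compareNeZero(const Lanes &X, bool WithSignExtInReg) {
  Mask M{};
  for (std::size_t I = 0; I < X.size(); ++I) {
    std::uint8_t V = WithSignExtInReg ? signExtInRegI1(X[I]) : X[I];
    M[I] = (V != 0);
  }
  return M;
}

// True when removing the sign_extend_inreg leaves the predicate unchanged.
inline bool foldPreservesResult(const Lanes &X) {
  return compareNeZero(X, true) == compareNeZero(X, false);
}

inline void demo() {
  // Boolean-like lanes (0x00, 0x01 or 0xFF): the fold is safe.
  Lanes BoolLike{};
  BoolLike[3] = 0xFF;
  BoolLike[7] = 0x01;
  assert(foldPreservesResult(BoolLike));

  // A lane with bit 0 clear but a higher bit set: the results differ,
  // so the fold would be wrong without a guarantee on the upper bits.
  Lanes Garbage{};
  Garbage[0] = 0x02;
  assert(!foldPreservesResult(Garbage));
}
```

In this model the second case in `demo` is the one to watch: the fold is only as good as whatever is known about the bits above bit 0 of `x`.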
https://github.com/llvm/llvm-project/pull/157665