[llvm] [AArch64] Fix selection of extend of v1f16 SETCC (PR #140274)

via llvm-commits llvm-commits at lists.llvm.org
Fri May 16 09:14:57 PDT 2025


llvmbot wrote:



@llvm/pr-subscribers-llvm-selectiondag

Author: Benjamin Maxwell (MacDue)

<details>
<summary>Changes</summary>

There is a DAG combine that folds:

```
t1: v1i1 = setcc x:v1f16, y:v1f16, setogt:ch
t2: v1i64 = zero_extend t1
```

->

```
t1: v1i16 = setcc x:v1f16, y:v1f16, setogt:ch
t2: v1i64 = any_extend t1
```

This creates an issue on AArch64 when attempting to widen the result to `v4i16`. The operand types (`v1f16`) are set to be scalarized, so the operands are widened "by hand" with `DAG.WidenVector`. However, `DAG.WidenVector` only widens to the next power of two, so it returns `v2f16`, which does not match the result VF. The fix is to manually construct the widened inputs using `INSERT_SUBVECTOR`.
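
The element-count mismatch can be illustrated with a small standalone model. This is a sketch only and not LLVM code: `ToyVector`, `nextPowerOf2`, `widenToNextPow2`, and `insertIntoUndef` are hypothetical names invented for the illustration, and `std::nullopt` stands in for an undef lane. It assumes, as described above, that the old path widens a 1-element vector to 2 elements while the result needs 4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model (hypothetical, not LLVM types): one lane is either a 16-bit
// payload or "undef", represented here by std::nullopt.
using Lane = std::optional<std::uint16_t>;
using ToyVector = std::vector<Lane>;

// Smallest power of two strictly greater than N, so 1 -> 2. This is assumed
// to model the "next power-of-2" behaviour the description attributes to
// DAG.WidenVector.
unsigned nextPowerOf2(unsigned N) {
  unsigned P = 1;
  while (P <= N)
    P <<= 1;
  return P;
}

// Old approach in the model: widen based on the input size alone, without
// looking at the element count the result was widened to.
ToyVector widenToNextPow2(const ToyVector &In) {
  ToyVector Out(nextPowerOf2(static_cast<unsigned>(In.size())));
  for (std::size_t I = 0; I < In.size(); ++I)
    Out[I] = In[I];
  return Out;
}

// New approach in the model: start from an all-undef vector with exactly the
// required element count and place the input at index 0, which is what an
// INSERT_SUBVECTOR into UNDEF at index 0 expresses.
ToyVector insertIntoUndef(const ToyVector &In, unsigned WideNumElts) {
  assert(In.size() <= WideNumElts && "input must fit in the wide vector");
  ToyVector Out(WideNumElts); // every lane starts as undef (nullopt)
  for (std::size_t I = 0; I < In.size(); ++I)
    Out[I] = In[I];
  return Out;
}
```

In this model a single-lane input becomes a 2-lane vector under `widenToNextPow2`, but a 4-lane vector under `insertIntoUndef(In, 4)`, with lanes 1 to 3 left undef. The second shape is the one that agrees with a 4-element result.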

Fixes #136540

---
Full diff: https://github.com/llvm/llvm-project/pull/140274.diff


2 Files Affected:

- (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (+6-2) 
- (modified) llvm/test/CodeGen/AArch64/arm64-neon-v1i1-setcc.ll (+16) 


``````````diff
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index 0c0e700f6abca..ac7e4f5ab9e20 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -6623,8 +6623,12 @@ SDValue DAGTypeLegalizer::WidenVecRes_SETCC(SDNode *N) {
     InOp1 = GetWidenedVector(InOp1);
     InOp2 = GetWidenedVector(InOp2);
   } else {
-    InOp1 = DAG.WidenVector(InOp1, SDLoc(N));
-    InOp2 = DAG.WidenVector(InOp2, SDLoc(N));
+    SDValue Undef = DAG.getUNDEF(WidenInVT);
+    SDValue ZeroIdx = DAG.getVectorIdxConstant(0, SDLoc(N));
+    InOp1 = DAG.getNode(ISD::INSERT_SUBVECTOR, SDLoc(N), WidenInVT, Undef,
+                        InOp1, ZeroIdx);
+    InOp2 = DAG.getNode(ISD::INSERT_SUBVECTOR, SDLoc(N), WidenInVT, Undef,
+                        InOp2, ZeroIdx);
   }
 
   // Assume that the input and output will be widen appropriately.  If not,
diff --git a/llvm/test/CodeGen/AArch64/arm64-neon-v1i1-setcc.ll b/llvm/test/CodeGen/AArch64/arm64-neon-v1i1-setcc.ll
index 6c70d19a977a5..5ed362604dc5f 100644
--- a/llvm/test/CodeGen/AArch64/arm64-neon-v1i1-setcc.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-neon-v1i1-setcc.ll
@@ -249,3 +249,19 @@ if.then:
 if.end:
   ret i32 1;
 }
+
+define <1 x i64> @test_zext_half(<1 x half> %v1) {
+; CHECK-LABEL: test_zext_half:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    // kill: def $h0 killed $h0 def $d0
+; CHECK-NEXT:    mov w8, #1 // =0x1
+; CHECK-NEXT:    fcvtl v0.4s, v0.4h
+; CHECK-NEXT:    fmov d1, x8
+; CHECK-NEXT:    fcmgt v0.4s, v0.4s, #0.0
+; CHECK-NEXT:    xtn v0.4h, v0.4s
+; CHECK-NEXT:    and v0.8b, v0.8b, v1.8b
+; CHECK-NEXT:    ret
+  %1 = fcmp ogt <1 x half> %v1, zeroinitializer
+  %2 = zext <1 x i1> %1 to <1 x i64>
+  ret <1 x i64> %2
+}

``````````

</details>


https://github.com/llvm/llvm-project/pull/140274

