[llvm] [DAG] Add TRUNCATE_SSAT_S/U and TRUNCATE_USAT_U to canCreateUndefOrPoison (#152143) (PR #168809)

Jerry Dang via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 20 07:29:54 PST 2025


================
@@ -0,0 +1,64 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx2 | FileCheck %s
+
+; Test that freeze is eliminated for saturation truncate patterns.
+; The freeze elimination happens at the IR level due to the IntrNoCreateUndefOrPoison
+; attribute on the llvm.smax/smin/umin intrinsics. At the SelectionDAG level,
+; TRUNCATE_SSAT_S/U and TRUNCATE_USAT_U operations are also marked in
+; canCreateUndefOrPoison() to ensure consistency and enable potential future
+; optimizations. This test validates the end-to-end behavior that no freeze
+; instruction appears in the output.
----------------
kuroyukiasuna wrote:

@RKSimon Something I tried:
```
define <4 x i32> @sqxtun_freeze_zext_and(<4 x i32> %a) {
; CHECK-LABEL: sqxtun_freeze_zext_and:
; CHECK:       // %bb.0:
; CHECK-NEXT:    sqxtun v0.4h, v0.4s
; CHECK-NEXT:    ushll v0.4s, v0.4h, #0
; CHECK-NEXT:    ret
  %trunc = tail call <4 x i16> @llvm.aarch64.neon.sqxtun.v4i16(<4 x i32> %a)
  %freeze = freeze <4 x i16> %trunc
  %zext = zext <4 x i16> %freeze to <4 x i32>
  %and = and <4 x i32> %zext, <i32 65535, i32 65535, i32 65535, i32 65535>
  ret <4 x i32> %and
}

declare <4 x i16> @llvm.aarch64.neon.sqxtun.v4i16(<4 x i32>)
```

I verified that the freeze elimination is working:

Legalized selection DAG (9 nodes):
```
t13: v4i16 = truncate_ssat_u t2
t5: v4i16 = freeze t13
t6: v4i32 = zero_extend t5
```
Optimized legalized selection DAG (8 nodes):
```
t13: v4i16 = truncate_ssat_u t2
t6: v4i32 = zero_extend t13
```

However, the final assembly is the same with or without my change, so the test would always pass. It seems the freeze is eliminated during instruction selection anyway? I need a suggestion for a test pattern where this freeze elimination produces visibly different assembly. Or is that even the correct path to pursue? Thanks!
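
For reference, here is a rough scalar model of one lane of the pattern above, assuming `truncate_ssat_u` treats its input as signed and clamps it to the unsigned range of the result type (which is how I read `sqxtun`). The function names are hypothetical and only illustrate the semantics; this is not LLVM code. Under that assumption every defined input maps to a defined output, which is why the node should not need to create undef or poison, and the trailing `and 65535` is a no-op on the zero-extended value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one lane of truncate_ssat_u (i32 -> i16):
// the input is interpreted as signed and saturated to [0, 65535].
constexpr uint16_t truncate_ssat_u_lane(int32_t x) {
  return x < 0 ? uint16_t{0}
               : (x > 65535 ? uint16_t{65535} : static_cast<uint16_t>(x));
}

// Models `zext <4 x i16> ... to <4 x i32>` followed by `and ..., 65535`
// for a single lane.
constexpr uint32_t zext_and_lane(uint16_t v) {
  return static_cast<uint32_t>(v) & 0xFFFFu;
}

// Compile-time spot checks of the saturation boundaries.
static_assert(truncate_ssat_u_lane(-1) == 0, "negative inputs clamp to 0");
static_assert(truncate_ssat_u_lane(65536) == 65535, "large inputs clamp to 65535");

// Run-time spot check: the mask does not change the zero-extended value.
inline void check_mask_is_noop() {
  assert(zext_and_lane(truncate_ssat_u_lane(70000)) == 65535u);
  assert(zext_and_lane(truncate_ssat_u_lane(1234)) == 1234u);
}
```

This only models the arithmetic; it says nothing about where in the pipeline the freeze actually gets dropped.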

https://github.com/llvm/llvm-project/pull/168809

