[llvm] [X86] Avoid zero extend i16 when inserting fp16 (PR #126194)
Phoebe Wang via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 7 00:39:06 PST 2025
================
@@ -43,15 +43,15 @@ define void @v_test_canonicalize__half(half addrspace(1)* %out) nounwind {
;
; AVX512-LABEL: v_test_canonicalize__half:
; AVX512: # %bb.0: # %entry
-; AVX512-NEXT: movzwl (%rdi), %eax
-; AVX512-NEXT: movzwl {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %ecx
-; AVX512-NEXT: vmovd %ecx, %xmm0
-; AVX512-NEXT: vcvtph2ps %xmm0, %xmm0
-; AVX512-NEXT: vmovd %eax, %xmm1
+; AVX512-NEXT: vpinsrw $0, (%rdi), %xmm0, %xmm0
+; AVX512-NEXT: vpinsrw $0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm1
+; AVX512-NEXT: vpxor %xmm2, %xmm2, %xmm2
+; AVX512-NEXT: vpblendw {{.*#+}} xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
----------------
phoebewang wrote:
Ok, limit it to non-strict FP only.
https://github.com/llvm/llvm-project/pull/126194