[PATCH] D141668: [X86] Do not lower INSERT_VECTOR_ELT to vselect for vXf16 without BWI
Phoebe Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 13 01:30:24 PST 2023
pengfei created this revision.
pengfei added reviewers: RKSimon, skan, craig.topper.
Herald added a subscriber: hiraditya.
Herald added a project: All.
pengfei requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
We cannot handle vselect on i8/i16/f16 vector elements without BWI, so the SSE4.1 floating-point case is restricted to f32/f64.
Fixes #59980
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D141668
Files:
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/pr59980.ll
Index: llvm/test/CodeGen/X86/pr59980.ll
===================================================================
--- /dev/null
+++ llvm/test/CodeGen/X86/pr59980.ll
@@ -0,0 +1,39 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-apple-macosx10.15 | FileCheck %s
+
+%Ts7Float16V = type <{ half }>
+%Ts7Float16V13SIMD16StorageV = type <{ <16 x half> }>
+
+define swiftcc void @"$ss7Float16V13SIMD16StorageVyABSicipADTk"(%Ts7Float16V* noalias nocapture dereferenceable(2) %0, %Ts7Float16V13SIMD16StorageV* nocapture dereferenceable(32) %1, i8* %2) #0 {
+; CHECK-LABEL: $ss7Float16V13SIMD16StorageVyABSicipADTk:
+; CHECK: ## %bb.0: ## %entry
+; CHECK-NEXT: pushq %rbp
+; CHECK-NEXT: movq %rsp, %rbp
+; CHECK-NEXT: andq $-32, %rsp
+; CHECK-NEXT: subq $64, %rsp
+; CHECK-NEXT: movl (%rdx), %eax
+; CHECK-NEXT: andl $15, %eax
+; CHECK-NEXT: vpinsrw $0, (%rdi), %xmm0, %xmm0
+; CHECK-NEXT: vmovups (%rsi), %ymm1
+; CHECK-NEXT: vmovaps %ymm1, (%rsp)
+; CHECK-NEXT: vpextrw $0, %xmm0, (%rsp,%rax,2)
+; CHECK-NEXT: vmovaps (%rsp), %ymm0
+; CHECK-NEXT: vmovups %ymm0, (%rsi)
+; CHECK-NEXT: movq %rbp, %rsp
+; CHECK-NEXT: popq %rbp
+; CHECK-NEXT: vzeroupper
+; CHECK-NEXT: retq
+entry:
+ %._value = bitcast i8* %2 to i64*
+ %3 = load i64, i64* %._value, align 8
+ %._value1 = getelementptr inbounds %Ts7Float16V, %Ts7Float16V* %0, i64 0, i32 0
+ %4 = load half, half* %._value1, align 2
+ %._value2 = getelementptr inbounds %Ts7Float16V13SIMD16StorageV, %Ts7Float16V13SIMD16StorageV* %1, i64 0, i32 0
+ %5 = load <16 x half>, <16 x half>* %._value2, align 16
+ %6 = trunc i64 %3 to i32
+ %7 = insertelement <16 x half> %5, half %4, i32 %6
+ store <16 x half> %7, <16 x half>* %._value2, align 16
+ ret void
+}
+
+attributes #0 = { nounwind "target-features"="+f16c" }
Index: llvm/lib/Target/X86/X86ISelLowering.cpp
===================================================================
--- llvm/lib/Target/X86/X86ISelLowering.cpp
+++ llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -20222,7 +20222,7 @@
// possible vector indices, and FP insertion has less gpr->simd traffic.
if (!(Subtarget.hasBWI() ||
(Subtarget.hasAVX512() && EltSizeInBits >= 32) ||
- (Subtarget.hasSSE41() && VT.isFloatingPoint())))
+ (Subtarget.hasSSE41() && (EltVT == MVT::f32 || EltVT == MVT::f64))))
return SDValue();
MVT IdxSVT = MVT::getIntegerVT(EltSizeInBits);
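
To make the effect of the changed condition easier to see, here is a minimal standalone sketch of the predicate before and after the patch. It is not LLVM code: `EltKind`, `Features`, and the two `usesSelectLowering*` functions are hypothetical stand-ins for `MVT`, `X86Subtarget`, and the guard in the hunk above. The sketch also assumes that the test's `+f16c` configuration has SSE4.1 but neither AVX512 nor BWI.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
enum class EltKind { I8, I16, I32, I64, F16, F32, F64 };

struct Features {
  bool BWI = false;    // models Subtarget.hasBWI()
  bool AVX512 = false; // models Subtarget.hasAVX512()
  bool SSE41 = false;  // models Subtarget.hasSSE41()
};

inline unsigned eltSizeInBits(EltKind K) {
  switch (K) {
  case EltKind::I8:
    return 8;
  case EltKind::I16:
  case EltKind::F16:
    return 16;
  case EltKind::I32:
  case EltKind::F32:
    return 32;
  case EltKind::I64:
  case EltKind::F64:
    return 64;
  }
  return 0;
}

inline bool isFloatingPoint(EltKind K) {
  return K == EltKind::F16 || K == EltKind::F32 || K == EltKind::F64;
}

// Guard before the patch: any floating-point element type passes with SSE4.1,
// including f16.
inline bool usesSelectLoweringOld(const Features &S, EltKind K) {
  return S.BWI || (S.AVX512 && eltSizeInBits(K) >= 32) ||
         (S.SSE41 && isFloatingPoint(K));
}

// Guard after the patch: only f32/f64 pass on the SSE4.1 path.
inline bool usesSelectLoweringNew(const Features &S, EltKind K) {
  return S.BWI || (S.AVX512 && eltSizeInBits(K) >= 32) ||
         (S.SSE41 && (K == EltKind::F32 || K == EltKind::F64));
}

// Checks the case from the test under the assumptions stated above.
inline void selfCheck() {
  Features F16COnly;
  F16COnly.SSE41 = true;
  assert(usesSelectLoweringOld(F16COnly, EltKind::F16));
  assert(!usesSelectLoweringNew(F16COnly, EltKind::F16));
}
```

Under these assumptions, the old guard lets a `<16 x half>` insert with a variable index take the vselect path, while the new guard rejects it, so the lowering returns `SDValue()` and a different lowering is used. That is consistent with the stack store and reload in the CHECK lines of pr59980.ll.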