[llvm-bugs] [Bug 39974] New: [X86] Vectorize scalar conversions to avoid fpu-gpr-fpu transfers
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Dec 12 05:02:45 PST 2018
https://bugs.llvm.org/show_bug.cgi?id=39974
Bug ID: 39974
Summary: [X86] Vectorize scalar conversions to avoid
fpu-gpr-fpu transfers
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: llvm-dev at redking.me.uk
CC: a.bataev at hotmail.com, andrea.dibiagio at gmail.com,
craig.topper at gmail.com, lebedev.ri at gmail.com,
llvm-bugs at lists.llvm.org, llvm-dev at redking.me.uk,
spatel+llvm at rotateright.com
As mentioned on https://reviews.llvm.org/D55558
; Scalar form: extract element 1 as a scalar i32, then convert it.
define float @cvt(<4 x i32> %a0) nounwind {
  %1 = extractelement <4 x i32> %a0, i32 1
  %2 = sitofp i32 %1 to float
  ret float %2
}
; Vector form: splat element 1, convert the whole vector, then extract lane 0.
define float @cvt_alt(<4 x i32> %a0) nounwind {
  %1 = shufflevector <4 x i32> %a0, <4 x i32> undef, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
  %2 = sitofp <4 x i32> %1 to <4 x float>
  %3 = extractelement <4 x float> %2, i32 0
  ret float %3
}
If a scalar conversion can be performed purely on the vector unit, it's
typically faster and avoids fpu-gpr-fpu register transfer bottlenecks.
https://godbolt.org/z/KCm1Pk
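For readers more used to intrinsics than IR, the two forms can be sketched in C with SSE2 intrinsics. This is only an illustration under stated assumptions: the function names `cvt_scalar` and `cvt_vector` are hypothetical, an x86 target with SSE2 is assumed, and the compiler remains free to lower either function to either instruction sequence, so the source does not guarantee the codegen.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in _MM_SHUFFLE) */

/* Hypothetical name. Scalar form: move lane 1 into a general-purpose
 * register as an int, then convert that int to float. */
static float cvt_scalar(__m128i a) {
    __m128i splat = _mm_shuffle_epi32(a, _MM_SHUFFLE(1, 1, 1, 1));
    int lane1 = _mm_cvtsi128_si32(splat);  /* vector register -> GPR */
    return (float)lane1;                   /* GPR -> float register */
}

/* Hypothetical name. Vector form: splat lane 1, convert all four lanes
 * on the vector unit, then read lane 0 of the float result. */
static float cvt_vector(__m128i a) {
    __m128i splat = _mm_shuffle_epi32(a, _MM_SHUFFLE(1, 1, 1, 1));
    __m128 f = _mm_cvtepi32_ps(splat);     /* stays in vector registers */
    return _mm_cvtss_f32(f);
}
```

Both functions should return the same value for any input; the only intended difference is whether the integer has to pass through a general-purpose register on the way.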
I'm not sure whether this is best performed in the backend or whether the SLP
vectorizer should be considered. IIRC we've had similar discussions in the past
about scalar i64 math being done on i686 SSE2 targets.