[llvm-bugs] [Bug 41512] New: Conversion from int to XMM is handled inefficiently on SSE4
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Apr 16 04:22:21 PDT 2019
https://bugs.llvm.org/show_bug.cgi?id=41512
Bug ID: 41512
Summary: Conversion from int to XMM is handled inefficiently on SSE4
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: spreis at yandex-team.ru
CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
llvm-dev at redking.me.uk, spatel+llvm at rotateright.com
Created attachment 21786
--> https://bugs.llvm.org/attachment.cgi?id=21786&action=edit
Proposed fix
In an attempt to switch all our builds from SSSE3 to SSE4, we found that code
as simple as
const __m128i lo = _mm_cvtsi32_si128(d0[value]);
const __m128i hi = _mm_cvtsi32_si128(d0[value+1024]);
val = _mm_add_epi64(val, _mm_unpacklo_epi64(lo, hi));
or
const __m128i all = _mm_set_epi32(0, d0[value], 0, d0[value+1024]);
val = _mm_add_epi64(val, all);
performs worse, when inlined into a loop, if compiled with -sse4.1 than with
just -ssse3.
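For reference, here is a minimal self-contained sketch of the two variants placed inside a loop. It is not the original benchmark: the function names (sum_unpack, sum_set, lanes), the idx array, the HALF constant, and the uint32_t element type of d0 are all assumptions made for illustration. Note that the two snippets, as written in the report, place the two table values in opposite 64-bit lanes.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics used by both variants */
#include <stddef.h>
#include <stdint.h>

#define HALF 1024 /* offset between the two table halves, as in the report */

/* Variant 1: two scalar-to-vector conversions combined with unpacklo. */
static __m128i sum_unpack(const uint32_t *d0, const uint16_t *idx, size_t n) {
    __m128i val = _mm_setzero_si128();
    for (size_t i = 0; i < n; ++i) {
        const size_t value = idx[i];
        assert(value < HALF); /* hypothetical bound, keeps value + HALF in range */
        const __m128i lo = _mm_cvtsi32_si128((int)d0[value]);
        const __m128i hi = _mm_cvtsi32_si128((int)d0[value + HALF]);
        val = _mm_add_epi64(val, _mm_unpacklo_epi64(lo, hi));
    }
    return val;
}

/* Variant 2: a single _mm_set_epi32 with zeros in the odd 32-bit lanes. */
static __m128i sum_set(const uint32_t *d0, const uint16_t *idx, size_t n) {
    __m128i val = _mm_setzero_si128();
    for (size_t i = 0; i < n; ++i) {
        const size_t value = idx[i];
        assert(value < HALF);
        const __m128i all =
            _mm_set_epi32(0, (int)d0[value], 0, (int)d0[value + HALF]);
        val = _mm_add_epi64(val, all);
    }
    return val;
}

/* Copy the two 64-bit lanes of v into out[0] (low) and out[1] (high). */
static void lanes(__m128i v, uint64_t out[2]) {
    _mm_storeu_si128((__m128i *)out, v);
}
```

Building such a file once with SSSE3 and once with SSE4.1 enabled and comparing the loop bodies should be enough to see whether movd or xor+pinsrd is selected; the exact output will depend on the compiler version.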
The problem is that _mm_cvtsi32_si128() and _mm_set_epi32() are both modeled
via INSERT_VECTOR_ELT, and
%13 = insertelement <4 x i32> <i32 undef, i32 0, i32 undef, i32 0>, i32 %12, i32 0, !dbg !287
is lowered to a single movd instruction prior to SSE4, but to xor+pinsrd on
SSE4.
https://gcc.godbolt.org/z/qY8nkO
* Notice that in the kernel function in the 2nd case there are a couple of
movd's, but when used in a loop it results in a pair of pinsrd from memory into
the same register.
This seems to me like poor instruction selection from both the performance and
code size standpoints.
I suggest steering instruction selection for this idiomatic case of
INSERT_VECTOR_ELT to SCALAR_TO_VECTOR. This will lead directly to movd
emission.
Proposed change to lib/Target/X86/X86ISelLowering.cpp is attached.
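The reasoning behind the suggestion can be checked at the source level: inserting a scalar into lane 0 of an otherwise zero vector yields the same bits as the movd-style conversion, so selecting movd for that pattern preserves semantics. The sketch below only illustrates this equivalence with SSE2 intrinsics; it does not reproduce the attached patch, and the helper names (from_scalar, insert_into_zero, same_bits, check_equivalence) are hypothetical.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */

/* movd-style conversion: x in lane 0, upper 96 bits zeroed. */
static __m128i from_scalar(int x) {
    return _mm_cvtsi32_si128(x);
}

/* The same value written as "insert x into lane 0 of a zero vector". */
static __m128i insert_into_zero(int x) {
    return _mm_set_epi32(0, 0, 0, x);
}

/* Returns 1 when all 16 bytes of a and b are equal. */
static int same_bits(__m128i a, __m128i b) {
    return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
}

/* Asserts that both forms agree for a given scalar. */
static void check_equivalence(int x) {
    assert(same_bits(from_scalar(x), insert_into_zero(x)));
}
```

Calling check_equivalence() on a few values, including negative ones, confirms that the two forms agree bit for bit; which instruction sequence each form compiles to remains up to the backend.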