[llvm-bugs] [Bug 38788] sse4.1 all/any reductions on <4 x i32> inconsistent and possibly suboptimal
via llvm-bugs
llvm-bugs at lists.llvm.org
Fri Aug 31 12:34:03 PDT 2018
https://bugs.llvm.org/show_bug.cgi?id=38788
Gonzalo BG <gonzalobg88 at gmail.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |---
             Status|RESOLVED                    |REOPENED
--- Comment #3 from Gonzalo BG <gonzalobg88 at gmail.com> ---
@Craig: The following example also fails (https://gcc.godbolt.org/z/tSZ6n6):
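declare i32 @llvm.x86.sse41.ptestc(<2 x i64>, <2 x i64>)

; Returns true iff every bit of the input vector is set.
define i1 @all_sse41(<4 x i32>) {
  %2 = bitcast <4 x i32> %0 to <2 x i64>
  ; ptestc returns CF, which is 1 iff (~%2 & all-ones) == 0.
  %3 = tail call i32 @llvm.x86.sse41.ptestc(<2 x i64> %2, <2 x i64> <i64 -1, i64 -1>)
  %4 = icmp eq i32 %3, 1
  ret i1 %4
}

; Compares lane-wise, sign-extends the <4 x i1> mask, then reduces it.
define i1 @cmp(<4 x i32>, <4 x i32>) {
  %3 = icmp eq <4 x i32> %0, %1
  %4 = sext <4 x i1> %3 to <4 x i32>
  %5 = tail call i1 @all_sse41(<4 x i32> %4)
  ret i1 %5
}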
After opt -O3 -mattr=+sse4.2, this produces:
cmp:                                    # @cmp
        pcmpeqd %xmm1, %xmm0
        pcmpeqd %xmm1, %xmm1
        ptest   %xmm1, %xmm0
        setb    %al
        retq
Note that here all_sse41 receives an input for which its result is
equivalent to a movemask-based reduction: the vector has been sign-extended
from <4 x i1>, so every lane is either all ones or all zeros, and it doesn't
matter whether one looks at all bits or just the sign bit.
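
To make the equivalence claim concrete, here is a minimal scalar sketch in C. It does not use the actual intrinsics; the function names (all_ptestc_model, all_movemask_model, agree_on_sext_masks, check_claim) are hypothetical and only model the two reductions: ptestc against an all-ones vector tests every bit, while a movmskps-style reduction tests only the sign bit of each 32-bit lane.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of ptestc(v, all-ones): the carry flag is set
   when (~v & ones) == 0, i.e. when every bit of v is 1. */
int all_ptestc_model(const uint32_t v[4]) {
    uint32_t missing = 0;
    for (int i = 0; i < 4; ++i)
        missing |= ~v[i];              /* collect any bit that is not set */
    return missing == 0;
}

/* Hypothetical scalar model of a movemask-based "all" reduction
   (movmskps result compared with 0xF): only sign bits are inspected. */
int all_movemask_model(const uint32_t v[4]) {
    unsigned mask = 0;
    for (int i = 0; i < 4; ++i)
        mask |= (unsigned)(v[i] >> 31) << i;   /* sign bit of lane i */
    return mask == 0xFu;
}

/* Returns 1 if both models agree on all 16 vectors whose lanes are
   either 0 or -1, which is what sext <4 x i1> can produce. */
int agree_on_sext_masks(void) {
    for (unsigned m = 0; m < 16; ++m) {
        uint32_t v[4];
        for (int i = 0; i < 4; ++i)
            v[i] = ((m >> i) & 1u) ? 0xFFFFFFFFu : 0u;
        if (all_ptestc_model(v) != all_movemask_model(v))
            return 0;
    }
    return 1;
}

/* Asserts the claim from the note above for sign-extended masks. */
void check_claim(void) {
    assert(agree_on_sext_masks());
}
```

For arbitrary inputs the two models differ: a vector whose lanes are all 0x80000000 satisfies the sign-bit test but not the all-bits test. The equivalence therefore relies on the input being a sign-extended comparison mask, as it is in @cmp.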