[llvm-bugs] [Bug 38788] sse4.1 all/any reductions on <4 x i32> inconsistent and possibly suboptimal

via llvm-bugs llvm-bugs at lists.llvm.org
Fri Aug 31 12:34:03 PDT 2018


https://bugs.llvm.org/show_bug.cgi?id=38788

Gonzalo BG <gonzalobg88 at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |---
             Status|RESOLVED                    |REOPENED

--- Comment #3 from Gonzalo BG <gonzalobg88 at gmail.com> ---
@Craig The following example also fails (https://gcc.godbolt.org/z/tSZ6n6):

define i1 @all_sse41(<4 x i32>) {
  %2 = bitcast <4 x i32> %0 to <2 x i64>
  %3 = tail call i32 @llvm.x86.sse41.ptestc(<2 x i64> %2, <2 x i64> <i64 -1, i64 -1>) #2
  %4 = icmp eq i32 %3, 1
  ret i1 %4
}

define i1 @cmp(<4 x i32>, <4 x i32>) {
  %3 = icmp eq <4 x i32> %0, %1
  %4 = sext <4 x i1> %3 to <4 x i32>
  %5 = tail call i1 @all_sse41(<4 x i32> %4)
  ret i1 %5
}

After opt -O3 -mattr=+sse4.2, this produces:

cmp: # @cmp
  pcmpeqd %xmm1, %xmm0
  pcmpeqd %xmm1, %xmm1
  ptest %xmm1, %xmm0
  setb %al
  retq

Note that all_sse41 here receives a sign-extended comparison result, so every
lane is either all ones or all zeros. Its result is therefore equivalent to
checking movemask: it doesn't matter whether one looks at all bits or just the
sign bit of each lane.
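
To make the equivalence concrete, here is a rough scalar model in C. It is only
a sketch: the v4i32 struct and the helper names all_ptest, all_movemask and
cmpeq_sext are hypothetical and are not taken from LLVM or from the example
above. It assumes that ptestc against an all-ones mask sets CF only when every
bit of the input is set, and that movemask collects only the sign bit of each
32-bit lane.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a <4 x i32> vector. */
typedef struct { uint32_t lane[4]; } v4i32;

/* Models ptestc(v, all-ones) == 1: CF is set only if (~v & mask) is zero,
   i.e. every bit of every lane is set. */
bool all_ptest(v4i32 v) {
    for (int i = 0; i < 4; ++i) {
        if ((~v.lane[i] & 0xFFFFFFFFu) != 0) {
            return false;
        }
    }
    return true;
}

/* Models movmskps(v) == 0xF: only the sign bit of each lane is inspected. */
bool all_movemask(v4i32 v) {
    unsigned mask = 0;
    for (int i = 0; i < 4; ++i) {
        mask |= (unsigned)(v.lane[i] >> 31) << i;
    }
    return mask == 0xFu;
}

/* Models sext(icmp eq a, b): each lane becomes all ones or all zeros. */
v4i32 cmpeq_sext(v4i32 a, v4i32 b) {
    v4i32 r;
    for (int i = 0; i < 4; ++i) {
        r.lane[i] = (a.lane[i] == b.lane[i]) ? 0xFFFFFFFFu : 0u;
    }
    return r;
}
```

For any output of cmpeq_sext the two predicates agree, which is the case in
@cmp. They only differ on inputs that are not sign-extended masks, for example
a vector whose lanes are all 0x80000000.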
