[llvm-bugs] [Bug 32176] New: _mm_undefined_si128 compiles to incorrect SSE code with -O1 or higher
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Mar 7 16:36:21 PST 2017
https://bugs.llvm.org/show_bug.cgi?id=32176
Bug ID: 32176
Summary: _mm_undefined_si128 compiles to incorrect SSE code
with -O1 or higher
Product: clang
Version: 4.0
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: LLVM Codegen
Assignee: unassignedclangbugs at nondot.org
Reporter: myriachan at gmail.com
CC: llvm-bugs at lists.llvm.org
_mm_undefined_si128 (internally, __builtin_ia32_undef128) is designed to let
x86 SSE code use whatever value an SSE register already holds, for cases where
the register's contents do not matter. An example is the following code to
generate an SSE register with all "1" bits:
__m128i ReturnOneBits()
{
    __m128i dummy = _mm_undefined_si128();
    return _mm_cmpeq_epi32(dummy, dummy);
}
It should compile to something like this:
pcmpeqd %xmm0, %xmm0
retq
But instead, with -O1, -O2 or -O3, it compiles to this:
xorps %xmm0, %xmm0
retq
In other words, it returns all "0" bits instead of all "1" bits. (With
optimizations disabled, the generated code reads uninitialized memory and then
does pcmpeqd on the two values.)
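A run-time check can confirm the result without reading the disassembly. The
sketch below is only an illustration: is_all_ones and self_check are
hypothetical helper names, not part of any header, and the self-check
deliberately uses constants with defined values rather than
_mm_undefined_si128, whose result is the thing in question.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: returns 1 if every bit of v is set, 0 otherwise. */
static int is_all_ones(__m128i v)
{
    /* Compare each of the 16 bytes with 0xFF; a matching byte becomes 0xFF. */
    __m128i eq = _mm_cmpeq_epi8(v, _mm_set1_epi8(-1));
    /* movemask gathers the top bit of each byte into a 16-bit mask. */
    return _mm_movemask_epi8(eq) == 0xFFFF;
}

/* Hypothetical self-check using inputs whose values are fully defined. */
static void self_check(void)
{
    assert(is_all_ones(_mm_set1_epi32(-1)));
    assert(!is_all_ones(_mm_setzero_si128()));
}
```

Passing the value returned by ReturnOneBits() to is_all_ones would then show
whether a given compiler and optimization level produced the expected bits.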
The following function *does* compile correctly, and clang in fact sees that
zeroing a register beforehand is unnecessary:
__m128i ReturnOneBits()
{
    __m128i dummy = _mm_setzero_si128();
    return _mm_cmpeq_epi32(dummy, dummy);
    // -or-
    return _mm_set_epi32(-1, -1, -1, -1);
}
These compile to:
pcmpeqd %xmm0, %xmm0
retq
Because clang's optimizer realizes that it doesn't care about the previous
value of xmm0, it actually would be an acceptable solution if
__builtin_ia32_undef128 were removed from the compiler and _mm_undefined_si128
simply called _mm_setzero_si128. (This is what Microsoft Visual C++ does, in
fact.)
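As a user-side stopgap along the same lines, code can avoid the builtin
entirely. This is a minimal sketch under that assumption, not a proposed
compiler patch; DefinedDummy, ReturnOneBitsWorkaround and CheckWorkaround are
hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical stand-in for _mm_undefined_si128(): a zeroed register has a
   defined value, so comparing it with itself is well-defined. */
static inline __m128i DefinedDummy(void)
{
    return _mm_setzero_si128();
}

/* Same shape as ReturnOneBits(), but built on the defined dummy value. */
static inline __m128i ReturnOneBitsWorkaround(void)
{
    __m128i dummy = DefinedDummy();
    return _mm_cmpeq_epi32(dummy, dummy);
}

/* Hypothetical check: store the vector and inspect each 32-bit lane. */
static void CheckWorkaround(void)
{
    uint32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, ReturnOneBitsWorkaround());
    for (int i = 0; i < 4; ++i)
        assert(lanes[i] == 0xFFFFFFFFu);
}
```

Since the optimizer already drops the unnecessary zeroing in the
_mm_setzero_si128 version above, this should cost nothing in the generated
code, though that is worth confirming in the disassembly for the compiler in
use.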
I have not tried the other _mm*_undefined* functions.