[llvm-bugs] [Bug 44781] New: Unnecessary use of SIMD
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Feb 4 12:51:11 PST 2020
https://bugs.llvm.org/show_bug.cgi?id=44781
Bug ID: 44781
Summary: Unnecessary use of SIMD
Product: clang
Version: unspecified
Hardware: PC
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: -New Bugs
Assignee: unassignedclangbugs at nondot.org
Reporter: kobalicek.petr at gmail.com
CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org,
neeilans at live.com, richard-llvm at metafoo.co.uk
Clang sometimes tries so hard to use SIMD that it generates worse code than
it would without it. A very simple example:
struct Box {
  int x0, y0, x1, y1;
};

bool check(const Box& box) noexcept {
  return ((box.x0 | box.y0 | box.x1 | box.y1) & 0xFFu) == 0;
}
Clang 8+ compiles this code to:
check(Box const&):
  vmovdqu xmm0, xmmword ptr [rdi]
  vpshufd xmm1, xmm0, 78          # xmm1 = xmm0[2,3,0,1]
  vpor    xmm0, xmm0, xmm1
  vpshufd xmm1, xmm0, 229         # xmm1 = xmm0[1,1,2,3]
  vpor    xmm0, xmm0, xmm1
  vpextrb eax, xmm0, 0
  test    al, al
  sete    al
  ret
GCC (and Clang 7 and earlier), by contrast, is content with plain scalar code:
check(Box const&):
  mov  eax, DWORD PTR [rdi]
  or   eax, DWORD PTR [rdi+4]
  or   eax, DWORD PTR [rdi+8]
  or   eax, DWORD PTR [rdi+12]
  test al, al
  sete al
  ret
The SIMD code Clang produces is worse (and bigger) than the scalar version,
and I think it would also be slower, since it contains two shuffles and an
extract.
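
To make the comparison concrete, here is a rough scalar model of what the
vector sequence computes, next to the original function. It is only a sketch:
Vec4, shuffle, or_lanes and check_vector_model are hypothetical names
introduced for illustration, and the model mirrors the instruction semantics
(vpshufd lane selection, vpor, vpextrb of byte 0), not Clang's actual
implementation. It shows that the two shuffles and ORs do nothing more than
fold all four lanes into lane 0, which the scalar code gets with three ORs
from memory.

```cpp
#include <array>
#include <cstdint>

struct Box {
  int x0, y0, x1, y1;
};

// The scalar formulation from the report.
bool check(const Box& box) noexcept {
  return ((box.x0 | box.y0 | box.x1 | box.y1) & 0xFFu) == 0;
}

// Hypothetical stand-in for a 128-bit register holding four 32-bit lanes.
using Vec4 = std::array<std::uint32_t, 4>;

// Models vpshufd: each 2-bit field of imm picks the source lane
// for one destination lane (bits 1:0 -> lane 0, bits 3:2 -> lane 1, ...).
Vec4 shuffle(const Vec4& v, unsigned imm) noexcept {
  Vec4 r{};
  for (unsigned lane = 0; lane < 4; ++lane)
    r[lane] = v[(imm >> (2 * lane)) & 3u];
  return r;
}

// Models vpor: lane-wise bitwise OR.
Vec4 or_lanes(const Vec4& a, const Vec4& b) noexcept {
  Vec4 r{};
  for (unsigned lane = 0; lane < 4; ++lane)
    r[lane] = a[lane] | b[lane];
  return r;
}

// Follows the vector sequence step by step.
bool check_vector_model(const Box& box) noexcept {
  Vec4 v{static_cast<std::uint32_t>(box.x0), static_cast<std::uint32_t>(box.y0),
         static_cast<std::uint32_t>(box.x1), static_cast<std::uint32_t>(box.y1)};
  v = or_lanes(v, shuffle(v, 78));   // 78  selects lanes [2,3,0,1]
  v = or_lanes(v, shuffle(v, 229));  // 229 selects lanes [1,1,2,3]
  // vpextrb eax, xmm0, 0 followed by test al, al: only the low byte of lane 0 matters.
  return static_cast<std::uint8_t>(v[0]) == 0;
}
```

Both functions should agree for every input; for example, a Box whose fields
all have a zero low byte passes either check, and setting the low bit of any
single field makes both fail.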