[llvm-bugs] [Bug 37796] New: Optimize bit-scatter operation
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Jun 13 15:48:28 PDT 2018
https://bugs.llvm.org/show_bug.cgi?id=37796
Bug ID: 37796
Summary: Optimize bit-scatter operation
Product: clang
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: C++
Assignee: unassignedclangbugs at nondot.org
Reporter: ruiu at google.com
CC: dgregor at apple.com, llvm-bugs at lists.llvm.org
I found that clang can't optimize the following code:
#include <cstdint>

// This function scatters Val's bits as instructed by Mask.
// Here is an example:
//
//   Val:    abcd efgh ijkl mnop
//   Mask:   1110 0001 1111 0001
//   Result: hij0 000k lmno 000p
//
// Some CPUs support this operation as a single instruction.
// For example, the Intel BMI2 extension provides it as PDEP.
static inline uint32_t scatter(uint32_t Val, uint32_t Mask) {
  uint32_t Res = 0;
  uint32_t Off = 0;
  for (uint32_t I = 0; I < 32; ++I)
    if (Mask & (1 << I))
      Res |= !!(Val & (1 << Off++)) << I;
  return Res;
}

uint32_t foo(uint32_t x) {
  return scatter(x, 1);
}
foo can be compiled to just `andl $1, %edi` on x86-64, but clang currently
compiles it to a loop that iterates 32 times (https://godbolt.org/g/jX5sNW).
If I add "#pragma unroll", clang can optimize the code
(https://godbolt.org/g/Apx7Nj).
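For reference, a sketch of where the pragma would go. The report does not show the exact placement, so putting it directly before the `for` loop is an assumption; `scatter_unrolled` and `foo_unrolled` are hypothetical names, and the shifts use unsigned operands. Compilers that do not recognize the pragma may warn about it and ignore it.

```cpp
#include <cstdint>

static inline uint32_t scatter_unrolled(uint32_t Val, uint32_t Mask) {
  uint32_t Res = 0;
  uint32_t Off = 0;
  // Assumed placement: the pragma applies to the loop that follows it.
#pragma unroll
  for (uint32_t I = 0; I < 32; ++I)
    if (Mask & (1u << I))
      Res |= ((Val >> Off++) & 1u) << I;
  return Res;
}

// With a constant mask of 1, the result should equal x & 1.
uint32_t foo_unrolled(uint32_t x) {
  return scatter_unrolled(x, 1);
}
```

The pragma changes only how the loop is compiled, not what it computes, so `foo_unrolled(x)` returns the same value as `x & 1` for every input.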