[llvm-bugs] [Bug 51288] New: Convert mov and shr to shrx in loops constrained by retirement rate
via llvm-bugs
llvm-bugs at lists.llvm.org
Fri Jul 30 21:37:34 PDT 2021
https://bugs.llvm.org/show_bug.cgi?id=51288
Bug ID: 51288
Summary: Convert mov and shr to shrx in loops constrained by
retirement rate
Product: new-bugs
Version: 12.0
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: todd at lipcon.org
CC: htmldeveloper at gmail.com, llvm-bugs at lists.llvm.org
This input file:
#include <stdint.h>
#include <utility>

struct Foo {
  uint64_t v;
  std::pair<uint32_t, uint32_t> Get() { return {v & 0xffffffff, v >> 32}; }
};

void Process(Foo* f, uint32_t* dst, int n) {
#pragma unroll
  for (int i = 0; i < n; i++) {
    auto [mask, idx] = f[i].Get();
    dst[idx] |= mask;
  }
}
generates assembly in which the core of the loop contains the following sequence:
movq 24(%rdi,%rax,8), %r9
movq %r9, %rcx
shrq $32, %rcx
orl %r9d, (%rsi,%rcx,4)
When compiling with BMI2 support, it would instead be slightly faster to keep
the constant 32 in a register and use shrx, which takes separate source and
destination operands and so folds the copy of %r9 into %rcx and the shift into
a single instruction.
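As a rough illustration of why the two forms are interchangeable, the sketch
below models both sequences in C++. The function names are hypothetical and
not part of the report; the second function only mirrors the semantics of a
three-operand shift with the count held in a register (shrx masks the count to
6 bits for 64-bit operands). Whether a compiler actually emits shrx for it
depends on the target flags and the compiler version.

```cpp
#include <cstdint>

// Hypothetical model of the current sequence: copy the loaded value, then
// shift the copy in place (movq %r9, %rcx ; shrq $32, %rcx).
inline uint64_t IndexViaCopyAndShift(uint64_t loaded) {
  uint64_t copy = loaded;  // movq %r9, %rcx
  copy >>= 32;             // shrq $32, %rcx
  return copy;
}

// Hypothetical model of the proposed sequence: the shift count lives in a
// register and one non-destructive shift writes a separate destination.
inline uint64_t IndexViaThreeOperandShift(uint64_t loaded, uint64_t count_reg) {
  // Mask the count to 6 bits, as a 64-bit shrx does.
  return loaded >> (count_reg & 63);
}
```

With count_reg set to 32 once outside the loop, both functions return the same
index for every input, so the change is purely a matter of instruction count.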
Generated version:
https://bit.ly/2WzH8Pj
Preferred version (saves roughly half a cycle per iteration of the unrolled-by-4 loop):
https://bit.ly/3jaXBBh
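For readers who want to see what the loop computes, here is a minimal sketch of
the data layout that Foo::Get() unpacks: the index sits in the high 32 bits and
the mask in the low 32 bits. Pack and ApplyPacked are hypothetical helpers
introduced only for this illustration; they are not part of the reproducer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: packs an index into the high 32 bits and a mask into
// the low 32 bits, matching the layout that Foo::Get() splits apart.
inline uint64_t Pack(uint32_t idx, uint32_t mask) {
  return (static_cast<uint64_t>(idx) << 32) | mask;
}

// Hypothetical helper: performs the same per-element work as the body of
// Process, on plain integers. The caller must ensure every index is in range.
inline void ApplyPacked(const std::vector<uint64_t>& packed,
                        std::vector<uint32_t>& dst) {
  for (uint64_t v : packed) {
    uint32_t mask = static_cast<uint32_t>(v & 0xffffffffu);  // low half
    uint32_t idx = static_cast<uint32_t>(v >> 32);           // high half
    dst[idx] |= mask;
  }
}
```

The shift that produces idx is the operation the report proposes lowering to
shrx.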