[llvm-bugs] [Bug 43003] New: Use dereferenceable info to replace branch with select
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Aug 15 03:41:07 PDT 2019
https://bugs.llvm.org/show_bug.cgi?id=43003
Bug ID: 43003
Summary: Use dereferenceable info to replace branch with select
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Loop Optimizer
Assignee: unassignedbugs at nondot.org
Reporter: david.bolvansky at gmail.com
CC: llvm-bugs at lists.llvm.org
void foo(int arr[static 1024]) {
    for (int i = 0; i < 1024; ++i) {
        if (arr[i] < 0)
            arr[i] = 0;
    }
}

void foo2(int arr[static 1024]) {
    for (int i = 0; i < 1024; ++i) {
        arr[i] = arr[i] < 0 ? 0 : arr[i];
    }
}

void foo3(int arr[1024]) {
    for (int i = 0; i < 1024; ++i) {
        arr[i] = arr[i] < 0 ? 0 : arr[i];
    }
}

void foo4(int arr[1024]) {
    for (int i = 0; i < 1024; ++i) {
        if (arr[i] < 0)
            arr[i] = 0;
    }
}
Replace:
    if (arr[i] < 0)
        arr[i] = 0;
with:
    arr[i] = arr[i] < 0 ? 0 : arr[i];
if we know that 'i' is always within the dereferenceable range of 'arr'.
'dereferenceable' says that we can write to the memory (it is not read-only
memory).
In 'foo', we know i32* nocapture dereferenceable(4096) %arr, and we know that i
is in the range [0, 1024), so we should be able to replace the branch with a
select.
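As a quick sanity check of the proposed rewrite, here is a minimal sketch that runs the branchy form and the select form over identical writable buffers and compares the results. The names clamp_branch_loop, clamp_select_loop and check_forms_agree are hypothetical and not part of the report; the sketch only shows that the two forms agree on the final contents when the memory is writable, not how LLVM would perform the transformation.

```c
#include <assert.h>
#include <string.h>

#define N 1024

/* Branchy form (as in foo): stores only when the element is negative. */
static void clamp_branch_loop(int arr[static N]) {
    for (int i = 0; i < N; ++i) {
        if (arr[i] < 0)
            arr[i] = 0;
    }
}

/* Select form (as in foo2): stores on every iteration. */
static void clamp_select_loop(int arr[static N]) {
    for (int i = 0; i < N; ++i)
        arr[i] = arr[i] < 0 ? 0 : arr[i];
}

/* Hypothetical helper: fills two writable buffers with the same mix of
   negative and non-negative values and checks that both forms agree. */
static void check_forms_agree(void) {
    static int a[N], b[N];
    for (int i = 0; i < N; ++i)
        a[i] = b[i] = (i % 3 == 0) ? -i - 1 : i;
    clamp_branch_loop(a);
    clamp_select_loop(b);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Both loops leave the same values behind; the difference is only in which iterations perform a store.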
I am not sure about the 'foo4' case; I think we can do nothing there, since
'arr' could be read-only.
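The read-only concern can be made concrete by counting stores rather than performing them. In this sketch (the helper names and the table contents are hypothetical), the branchy loop would perform no store at all on data without negative elements, so calling it on such read-only data is fine, while the select form would write every element.

```c
#include <assert.h>
#include <stddef.h>

/* Stores the branchy loop (foo, foo4) would perform:
   one per negative element. */
static size_t stores_branch(const int *arr, size_t n) {
    size_t stores = 0;
    for (size_t i = 0; i < n; ++i)
        if (arr[i] < 0)
            ++stores;
    return stores;
}

/* Stores the select loop (foo2, foo3) would perform:
   one per iteration, whatever the element values are. */
static size_t stores_select(const int *arr, size_t n) {
    (void)arr;
    return n;
}

/* Read-only data without negative elements (hypothetical input). */
static const int table[8] = {0, 1, 2, 3, 4, 5, 6, 7};

static void check_store_counts(void) {
    assert(stores_branch(table, 8) == 0);  /* never writes */
    assert(stores_select(table, 8) == 8);  /* would write every element */
}
```

This is why the rewrite needs some guarantee that the memory is writable, not merely readable.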
Motivation? With Clang -O3 -mavx2, this would replace:
foo:                                    # @foo
        xor     eax, eax
        vpxor   xmm0, xmm0, xmm0
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        vmovdqu ymm1, ymmword ptr [rdi + rax]
        vmovdqu ymm2, ymmword ptr [rdi + rax + 32]
        vmovdqu ymm3, ymmword ptr [rdi + rax + 64]
        vmovdqu ymm4, ymmword ptr [rdi + rax + 96]
        vpmaskmovd      ymmword ptr [rdi + rax], ymm1, ymm0
        vpmaskmovd      ymmword ptr [rdi + rax + 32], ymm2, ymm0
        vpmaskmovd      ymmword ptr [rdi + rax + 64], ymm3, ymm0
        vpmaskmovd      ymmword ptr [rdi + rax + 96], ymm4, ymm0
        vmovdqu ymm1, ymmword ptr [rdi + rax + 128]
        vmovdqu ymm2, ymmword ptr [rdi + rax + 160]
        vmovdqu ymm3, ymmword ptr [rdi + rax + 192]
        vmovdqu ymm4, ymmword ptr [rdi + rax + 224]
        vpmaskmovd      ymmword ptr [rdi + rax + 128], ymm1, ymm0
        vpmaskmovd      ymmword ptr [rdi + rax + 160], ymm2, ymm0
        vpmaskmovd      ymmword ptr [rdi + rax + 192], ymm3, ymm0
        vpmaskmovd      ymmword ptr [rdi + rax + 224], ymm4, ymm0
        add     rax, 256
        cmp     rax, 4096
        jne     .LBB0_1
        vzeroupper
        ret
with really nice code:
foo2:                                   # @foo2
        xor     eax, eax
        vpxor   xmm0, xmm0, xmm0
.LBB1_1:                                # =>This Inner Loop Header: Depth=1
        vpmaxsd ymm1, ymm0, ymmword ptr [rdi + 4*rax]
        vpmaxsd ymm2, ymm0, ymmword ptr [rdi + 4*rax + 32]
        vpmaxsd ymm3, ymm0, ymmword ptr [rdi + 4*rax + 64]
        vpmaxsd ymm4, ymm0, ymmword ptr [rdi + 4*rax + 96]
        vmovdqu ymmword ptr [rdi + 4*rax], ymm1
        vmovdqu ymmword ptr [rdi + 4*rax + 32], ymm2
        vmovdqu ymmword ptr [rdi + 4*rax + 64], ymm3
        vmovdqu ymmword ptr [rdi + 4*rax + 96], ymm4
        add     rax, 32
        cmp     rax, 1024
        jne     .LBB1_1
        vzeroupper
        ret
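The vpmaxsd instructions in the foo2 output compute a packed signed 32-bit maximum against the zeroed register ymm0. That fits the select form because arr[i] < 0 ? 0 : arr[i] is the same value as max(arr[i], 0). A minimal scalar sketch of that identity follows; the helper names are hypothetical and the samples are only spot checks, not a proof.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Scalar version of the select written in foo2. */
static int clamp_select(int x) { return x < 0 ? 0 : x; }

/* Scalar version of what one vpmaxsd lane computes against zero. */
static int max_with_zero(int x) { return x > 0 ? x : 0; }

/* Spot-checks the identity on boundary and ordinary values. */
static void check_select_is_max(void) {
    const int samples[] = {INT_MIN, -7, -1, 0, 1, 7, INT_MAX};
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(clamp_select(samples[i]) == max_with_zero(samples[i]));
}
```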
I believe a simple case like:
void bar(int arr[static 1024]) {
    if (arr[80] < 0)
        arr[80] = 0;
}
could be handled by SimplifyCFG, which would turn the branch into a select.
Surprisingly, for:
void bar(int *arr) {
    if (arr[80] < 7)
        arr[80] = 7;
}
ICC generates cmovge. Does ICC ignore the fact that 'arr' could be read-only?
For the motivating loop case, I don't know which pass should do this; maybe
IndVars?