[llvm-bugs] [Bug 28430] New: [x86-64] Inefficient code generated for "if (Operator *O = dyn_cast<Operator>(V)) { foo(O); }"
via llvm-bugs
llvm-bugs at lists.llvm.org
Tue Jul 5 17:13:38 PDT 2016
https://llvm.org/bugs/show_bug.cgi?id=28430
Bug ID: 28430
Summary: [x86-64] Inefficient code generated for "if (Operator
*O = dyn_cast<Operator>(V)) { foo(O); }"
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: justin.lebar at gmail.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
I would expect the following two functions to compile to (roughly) the same
x86-64 machine code.
void foo(Operator *O);
void test1(Value *V) {
if (Operator *O = dyn_cast<Operator>(V))
foo(O);
}
void test2(Value *V) {
if (isa<Operator>(V))
foo(cast<Operator>(V));
}
But they don't:
_Z5test1PN4llvm5ValueE:
mov cl, byte ptr [rdi + 24]
cmp cl, 23
seta al
cmp cl, 10
setne cl
test rdi, rdi
je .LBB0_2
xor al, cl
jne .LBB0_2
jmp _Z3fooPN4llvm8OperatorE at PLT # TAILCALL
.LBB0_2:
ret
_Z5test2PN4llvm5ValueE:
mov al, byte ptr [rdi + 24]
cmp al, 23
ja .LBB0_3
cmp al, 10
je .LBB0_3
ret
.LBB0_3:
jmp _Z3fooPN4llvm8OperatorE at PLT # TAILCALL
It seems to me that there are two problems with the code generated for test1:
First, the null check on rdi shouldn't be necessary. The isa call inside
dyn_cast dereferences the pointer, so (I think?) we should be able to assume
that it's not null. (Adding an explicit __builtin_assume to the implementatin
of dyn_cast gets rid of this check.)
Second, although I'm not enough of an x86 expert to say whether the two-jump
version (as in test2) or one-jump version (as in test1) is preferable, since
they do the same thing, presumably we should prefer one over the other, if only
due to code size differences. (I also wouldn't be surprised if there were
performance differences.)
gcc 4.8.4 emits the same code for both functions, and it's basically identical
to the code clang generates for test2:
movzx eax, BYTE PTR 24[rdi]
cmp al, 23
ja .L9
cmp al, 10
je .L9
rep ret
.L9:
jmp _Z3fooPN4llvm8OperatorE at PLT
Here are the -print-after-all -debug logs for the two functions:
test1: https://gist.github.com/fbbabc5299aca5a275b2b38a6d9d217f
test2: https://gist.github.com/24dee5133a003c12b0c127e5e5f2a3c9
Since we use this idiom everywhere in clang and llvm, it seems like a good bet
that we'd get a performance improvement from improving the codegen here.
--
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20160706/75a009f5/attachment.html>
More information about the llvm-bugs
mailing list