[PATCH] D75945: Use 15 byte long nops on modern Intel processors
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 12 17:43:06 PDT 2020
reames added a comment.
I did some quick perf testing on an Ivy Bridge machine (my old laptop). As expected, there's no obvious difference between the 10-byte and 15-byte variants, which is consistent with no decoder stall being encountered. The only variant that is notably slower is the trivial 1-byte nop repeated 15 times.
current_nop:
    nopw %cs:0L(%rax,%rax,1)
    nopl 0(%rax,%rax,1)
    ret

proposed_nop:
    .rept 5
    cs
    .endr
    nopw %cs:0L(%rax,%rax,1)
    ret

prefix_nop:
    .rept 14
    cs
    .endr
    nop
    ret

repeat_nop:
    .rept 15
    nop
    .endr
    ret
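
For reference, all four variants pad the same 15 bytes and differ only in how many instructions the decoder sees. The sketch below tallies that. It is not part of the patch or of the measurements above. The byte encodings are an assumption: they are the forms used in LLVM's nop table for these mnemonics (10-byte nopw, 5-byte nopl, 0x2e for the cs prefix, 0x90 for nop), and an assembler fed the source above may choose a shorter displacement form. The names VARIANTS and summarize are made up for illustration.

```python
# Assumed encodings (from LLVM's nop table); an assembler may encode the
# zero displacement differently, so treat these as illustrative.
CS_PREFIX = bytes([0x2E])                        # cs segment-override prefix
NOP1 = bytes([0x90])                             # nop
NOPL5 = bytes.fromhex("0f1f440000")              # nopl 0(%rax,%rax,1)
NOPW10 = bytes.fromhex("662e0f1f840000000000")   # nopw %cs:0L(%rax,%rax,1)

# Each variant is a list of instructions; the trailing ret is left out.
VARIANTS = {
    "current_nop": [NOPW10, NOPL5],              # 10-byte nop + 5-byte nop
    "proposed_nop": [CS_PREFIX * 5 + NOPW10],    # one 15-byte nop
    "prefix_nop": [CS_PREFIX * 14 + NOP1],       # 14 prefixes on a 1-byte nop
    "repeat_nop": [NOP1] * 15,                   # fifteen 1-byte nops
}


def summarize(variants):
    """Map each variant name to (instruction count, total bytes)."""
    return {
        name: (len(insns), sum(len(insn) for insn in insns))
        for name, insns in variants.items()
    }


for name, (count, size) in summarize(VARIANTS).items():
    print(f"{name:13s} {count:2d} instruction(s), {size} bytes")
```

Under these assumptions every variant is 15 bytes; current_nop is two instructions, proposed_nop and prefix_nop are one each, and repeat_nop is fifteen, which matches it being the only clearly slower case.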
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D75945/new/
https://reviews.llvm.org/D75945