[PATCH] D75945: Use 15 byte long nops on modern Intel processors

Philip Reames via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 12 17:43:06 PDT 2020


reames added a comment.

I did some quick perf testing on an Ivy Bridge machine (my old laptop).  As expected, there's no obvious difference between the 10-byte (current_nop) and 15-byte (proposed_nop, prefix_nop) variants.  This is what we'd expect if no decoder stall is encountered.  The only variant that is notably slower is the trivial 1-byte nop repeated 15 times (repeat_nop).

current_nop:

  nopw %cs:0L(%rax,%rax,1)
  nopl 0(%rax,%rax,1)
  ret

proposed_nop:

  .rept 5
  cs
  .endr
  nopw %cs:0L(%rax,%rax,1)
  ret


prefix_nop:

  .rept 14
  cs
  .endr
  nop
  ret

repeat_nop:

  .rept 15
  nop
  .endr
  ret
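
To make the comparison concrete: all four variants pad the same 15 bytes before the ret and differ only in how many instructions the decoder sees. The Python sketch below tallies that, assuming the standard x86-64 encodings (0x2e for the cs prefix, 0x90 for nop, 0f 1f 44 00 00 for the 5-byte nopl, 66 2e 0f 1f 84 00 00 00 00 00 for the 10-byte nopw). The names VARIANTS and summarize are illustrative only; they are not part of the patch or of the benchmark itself.

```python
# Standard x86-64 encodings for the building blocks used in the listings.
CS_PREFIX = bytes([0x2E])                      # cs segment-override prefix
NOP1 = bytes([0x90])                           # nop
NOPL5 = bytes([0x0F, 0x1F, 0x44, 0x00, 0x00])  # nopl 0(%rax,%rax,1)
NOPW10 = bytes([0x66, 0x2E, 0x0F, 0x1F, 0x84,
                0x00, 0x00, 0x00, 0x00, 0x00])  # nopw %cs:0L(%rax,%rax,1)

# Each variant is a list of instructions (trailing ret excluded);
# each instruction is represented by its encoded bytes.
VARIANTS = {
    "current_nop": [NOPW10, NOPL5],
    "proposed_nop": [CS_PREFIX * 5 + NOPW10],
    "prefix_nop": [CS_PREFIX * 14 + NOP1],
    "repeat_nop": [NOP1] * 15,
}


def summarize(variants):
    """Return {name: (instruction_count, total_bytes)} for each variant."""
    return {
        name: (len(insns), sum(len(insn) for insn in insns))
        for name, insns in variants.items()
    }


if __name__ == "__main__":
    for name, (count, size) in summarize(VARIANTS).items():
        print(f"{name:13s} {count:2d} instruction(s), {size} bytes")
```

Under these assumptions every variant fills 15 bytes, which is also the architectural maximum length of a single x86 instruction; current_nop does so in 2 instructions, proposed_nop and prefix_nop in 1, and repeat_nop in 15.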


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75945/new/

https://reviews.llvm.org/D75945




