[PATCH] D48028: [X86] Fix NOOP sched overrides on BDW/HSW/SKL.

Clement Courbet via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 12 00:16:13 PDT 2018


courbet added a comment.

In https://reviews.llvm.org/D48028#1128342, @craig.topper wrote:

> The Intel optimization manual talks about nops executing and says that the multi byte nop has an execution dependency on whatever register is encoded in the modrm byte. So that sorta sounds like it uses resources.


I meant issue ports (llvm `ProcResource`s). I went and looked at the optimization manual, which states:

  The one byte NOP:[XCHG EAX,EAX] has special hardware support. Although it still consumes a µop and
  its accompanying resources, the dependence upon the old value of EAX is removed. This µop can be
  executed at the earliest possible opportunity, reducing the number of outstanding instructions and is the
  lowest cost NOP.
  The other NOPs have no special hardware support. Their input and output registers are interpreted by the
  hardware. Therefore, a code generator should arrange to use the register containing the oldest value as
  input, so that the NOP will dispatch and release RS resources at the earliest possible opportunity. 

On the other hand, elsewhere, it says:

  Some micro-ops can execute to completion during rename and are removed from the pipeline at that
  point, effectively costing no execution bandwidth. These include:
  • Zero idioms (dependency breaking idioms).
  • NOP.
  • VZEROUPPER.
  • FXCHG

I guess what it means is that multi-byte NOPs still consume a ROB entry and wait for their data dependencies. Either way, when we measure multi-byte NOPs we see no issue port usage:

F6346196: Screenshot from 2018-06-12 09-02-09.png <https://reviews.llvm.org/F6346196>
F6346200: Screenshot from 2018-06-12 09-02-22.png <https://reviews.llvm.org/F6346200>
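
For illustration, here is a minimal TableGen sketch of how "consumes a µop but no issue port" could be expressed in a scheduling model. It is only a sketch under stated assumptions, not the contents of this patch: `WriteNopNoPorts` is a made-up name, the instruction list is just an example, and the fragment assumes it sits inside an existing `let SchedModel = ... in { }` block of one of the X86 models.

```tablegen
// Hypothetical fragment; assumed to live inside an existing
// `let SchedModel = <some X86 model> in { ... }` block.

// A write that still counts as one micro-op (so it occupies a ROB slot)
// but lists no ProcResources, i.e. it uses no issue port.
def WriteNopNoPorts : SchedWriteRes<[]> {
  let NumMicroOps = 1;
  let Latency = 1;
}

// Example override applying that write to the NOP instructions.
def : InstRW<[WriteNopNoPorts], (instrs NOOP, NOOPW, NOOPL)>;
```

The empty resource list is what matches the measurements above; the dependency on the register encoded in the modrm byte is not modeled by this fragment.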


Repository:
  rL LLVM

https://reviews.llvm.org/D48028




